## How predictive can April pitching stats be?

Last week, I crunched some numbers looking at the predictive ability of April hitting stats. By comparing April performances, PECOTA projections, and rest of season performances, I found that -- in general -- a hitter could be expected to retain 24% of the deviation from his projected K%, 17% for his BB%, 11% for BABIP, and 7% for ISO. I've repeated this exercise for pitchers, looking at K%, BB%, HR%, ERA, and FIP. Included in my sample is every pitcher who threw at least 20 innings in the month of April from 2011-2013. I did not set any innings thresholds for the rest of the season. This results in a few weird outliers, but ensures there isn't any selection bias caused by poor April performers dropping out of the sample.

Without further ado, let's start with BB%. A simple regression (with the intercept set to 0) yields the equation:

(RoS BB% - PECOTA BB%) = .352 x (April BB% - PECOTA BB%)

So on average, a pitcher can be expected to retain about 35% of the deviation from his projected walk rate. For example, if a pitcher is projected to walk 10% of opposing batters but walks 20% in April, he can be expected to walk around 13.5% going forward (10% plus .352 x 10%). Here's what the data looks like with a smoothed loess trendline.
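To make the arithmetic concrete, here is a minimal Python sketch of the two steps described above: fitting a slope with the intercept fixed at 0, and using that slope to pull an April rate back toward the projection. The function names and the toy deviation values are assumptions made up for illustration, not the code or data behind this article. Only the .352 coefficient and the 10%/20% example come from the text.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (intercept fixed at 0)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("all x values are zero; slope is undefined")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def expected_ros(projection, april, retention):
    """Projection plus the retained share of the April deviation."""
    return projection + retention * (april - projection)


# Toy deviations (April - PECOTA, RoS - PECOTA), invented for illustration only.
april_dev = [0.04, -0.02, 0.01, -0.03]
ros_dev = [0.02, -0.01, 0.00, -0.01]
toy_slope = slope_through_origin(april_dev, ros_dev)

# The BB% example from the text: projected 10%, walked 20% in April, slope .352.
walk_rate = expected_ros(0.10, 0.20, 0.352)
print(f"toy slope: {toy_slope:.3f}")
print(f"expected RoS BB%: {walk_rate:.1%}")
```

The same `expected_ros` helper works for any of the stats in this piece. Only the retention value changes.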

For strikeout rates, about 24% of the deviation is retained.

The number of home runs a pitcher allows in April tells us next to nothing about his future performance. Only around 6% of a pitcher's deviation from his projection carries forward through the remainder of the year.

For ERA, around 14% of a pitcher's deviation carries over to the rest of the season. But swapping out April ERA for April FIP causes this to jump to 24%.

### What else is predictive?

To see if some unpredicted performances might be a little more believable than others, I ran some multiple linear regressions to see what characteristics make a player more (or less) likely to carry his performance through the rest of the year. Here are the results along with some notable takeaways:

ROS BB% - PECOTA BB%:

A pitcher's April walk rate is a pretty decent predictor of his rest-of-season walk rate, but none of the other April stats give us any insight into how many batters he'll walk going forward.

ROS K% - PECOTA K%:

A pitcher's April strikeout rate is fairly telling, but nothing else has any predictive value here.

ROS HR% - PECOTA HR%:

There's really no predicting a pitcher's long-ball tendencies. A pitcher's April HR% relative to his projection actually gets bumped from the model in favor of his FIP performance, but even that tells us next to nothing. Regardless of what happens in April, a pitcher's pre-season projection is easily the best predictor of his future ability to keep the ball in the park.

ROS ERA - PECOTA ERA:

A pitcher's April FIP can be predictive of his future ERA, but none of the other April stats were significant enough to add anything to the ROS ERA prediction.
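For readers curious what a multiple regression of this kind looks like mechanically, here is a rough Python sketch that fits two April deviations against a rest-of-season deviation at once. It assumes a no-intercept model with exactly two predictors and solves the normal equations directly. The article does not spell out the actual model specification, so the function name, the choice of predictors, and the toy numbers are all hypothetical.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares (b1, b2) for y = b1 * x1 + b2 * x2, with no intercept."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    # Cramer's rule on the 2x2 normal equations.
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


# Toy deviations, invented for illustration (NOT the article's data).
# y is built so that only the first predictor matters: y = 0.3 * x1.
april_fip_dev = [1.0, -0.5, 0.25, -1.0]
april_other_dev = [0.5, 0.5, -1.0, 0.0]
ros_era_dev = [0.3, -0.15, 0.075, -0.3]

b_fip, b_other = fit_two_predictors(april_fip_dev, april_other_dev, ros_era_dev)
print(f"FIP coefficient: {b_fip:.3f}, other coefficient: {b_other:.3f}")
```

A coefficient near zero on the second predictor is the toy version of a stat that "doesn't add anything." Deciding whether a real coefficient is distinguishable from zero also requires standard errors, which this sketch does not compute.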

### Does Age Matter?

As with hitting stats, there's little evidence that older or younger players are any more likely to sustain their April performances. Looking at the relationship between April performance and rest-of-season performance broken up by age, the slopes of the trend lines for ERA, BB%, and K% don't appear to be significantly different.
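One simple way to run this kind of age check is to fit the same no-intercept slope separately within each age group and compare the results. The sketch below does that in Python. The two-group split, the cutoff of 30, the function names, and the toy rows are all assumptions for illustration; the article does not say how the age groups were defined.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (intercept fixed at 0)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("all x values are zero; slope is undefined")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def slopes_by_age_group(rows, cutoff):
    """rows: (age, april_dev, ros_dev) tuples. Fits one slope per age group."""
    groups = {"younger": ([], []), "older": ([], [])}
    for age, april_dev, ros_dev in rows:
        key = "younger" if age < cutoff else "older"
        groups[key][0].append(april_dev)
        groups[key][1].append(ros_dev)
    return {key: slope_through_origin(x, y) for key, (x, y) in groups.items()}


# Toy rows (age, April deviation, RoS deviation), invented for illustration.
toy_rows = [
    (24, 1.0, 0.2),
    (26, -1.0, -0.3),
    (33, 2.0, 0.5),
    (35, -1.0, -0.25),
]
slopes = slopes_by_age_group(toy_rows, cutoff=30)
print(slopes)
```

Similar slopes across groups would match the finding here. A formal comparison would also need a standard error for each slope, which is left out of this sketch.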

Just like with April hitting stats, April pitching performances can tell us something, but different stats have varying degrees of predictive value. A pitcher's April home run rate tells us next to nothing, but his walk rate is actually pretty predictive. So if a pitcher is walking significantly more or fewer batters than expected, it's fairly likely that he's set a new normal -- and that a good chunk of his April performance (about a third of the deviation, on average) will carry over through the rest of the year.