Regress, Regress I Say! (Complete with In-Season Batter Regression Tool)



(Thanks to J-Doug and commenter hairball for helping me find an error in the calculator. Data and post have been updated accordingly)

So much is driven early on by small sample sizes that it can be easy to get overly excited about a batter's hot or cold start. After only about 15-20 games, all sorts of weird stat lines can pop up. Maybe a player's numbers reflect a real change in approach (e.g. Jose Bautista). Or maybe they just reflect random chance.

To gain some perspective on the early returns this season, I decided to regress all batters with >=40 plate appearances to their 3-year average batting average on balls in play (BABIP) and their 3-year average HR/FB rate. The chart above presents the top-15 and bottom-15 batters in terms of the difference between their actual wOBA this year and the wOBA we would expect given their 3-year averages in BABIP and HR/FB.
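As a rough sketch of how the top-15 and bottom-15 lists can be built once each batter has an actual and a regressed wOBA: filter on plate appearances, take the difference, and sort. The `BatterRow` type, the `rank_by_difference` function, and the sample rows are hypothetical names and made-up numbers for illustration, not the data behind the chart.

```python
from typing import NamedTuple


class BatterRow(NamedTuple):
    # Hypothetical container for one batter's season-to-date summary.
    name: str
    pa: int
    actual_woba: float
    regressed_woba: float


def rank_by_difference(rows, min_pa=40, n=15):
    """Return (top, bottom) lists of batters by actual minus regressed wOBA.

    top    -> largest positive differences first
    bottom -> largest negative differences first
    """
    eligible = [r for r in rows if r.pa >= min_pa]

    def diff(r):
        return r.actual_woba - r.regressed_woba

    top = sorted(eligible, key=diff, reverse=True)[:n]
    bottom = sorted(eligible, key=diff)[:n]
    return top, bottom


# Made-up rows, not real players.
sample_rows = [
    BatterRow("Player A", 50, 0.400, 0.300),
    BatterRow("Player B", 45, 0.200, 0.330),
    BatterRow("Player C", 30, 0.500, 0.300),  # below the 40 PA cutoff
    BatterRow("Player D", 60, 0.330, 0.320),
]

top, bottom = rank_by_difference(sample_rows, n=1)
print(top[0].name, bottom[0].name)
```

With fewer than 2n eligible batters the two lists can overlap, which is harmless for a sketch but worth guarding against in a real chart.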

Not surprisingly, Russell Martin sits high on the list of beneficial differences, with a wOBA .148 higher than his regressed wOBA. Pat Burrell is off to a .391 wOBA start, but if we regress his BABIP and HR/FB rates he may be more of a .290 wOBA hitter.

(More analysis and calculator below)

Brad Hawpe, who is off to a miserable start, comes out with a respectable .321 regressed wOBA--a difference of -.169 (the second-largest negative difference of all players). Hanley Ramirez has the fourth-largest negative difference, with an expected .399 wOBA (-.142 difference).

Now, just because regressing a batter to their 3-year averages shows a wOBA much higher or lower than what they've produced so far, we can't just assume the gap is luck. For example, Pablo Sandoval should technically be in the .319 wOBA range, but we all know that Pablo this year is drastically different from a year ago. Will he end the year with a .400 wOBA? That I can't say, but we have to interpret the data in context, and it's pretty likely he'll be a better than league-average hitter by the end (say, .340-.360).

I put together an In-Season Batter Regression Calculator for all to use throughout the season (you can find it here). It will likely be most useful early on, before player performance stabilizes, but it will work at any point in the season.



Simply type in the first and last name of the player (all batters with >=40 plate appearances for whom I have 3-year average data are included) as well as a few bits of data (basically the FanGraphs standard dashboard plus FB%). The calculator will compute their adjusted actual wOBA (calculated with my own coefficients so that it is consistent with the regressed figure) as well as the wOBA we would expect based on BABIP and HR/FB over the past three seasons.

The tool simply takes what a hitter has done to this point and recalculates their performance based on how many non-HR hits they should have (given their 3YR average BABIP) and how many home runs they should have (given their 3YR average HR/FB). It doesn't necessarily predict where a player will finish the year, but it does give some sense of how much a player's performance is out of whack relative to those averages.
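For readers who want to see the arithmetic, here is a minimal sketch of that kind of recalculation. It is not the calculator's actual formula. The linear weights are illustrative placeholders rather than my coefficients, the denominator is simply PA, fly balls are estimated as FB% times batted balls (AB - SO + SF), and expected non-HR hits are split into singles, doubles and triples in the same proportions the hitter has shown so far. Every name in the code (`BatterLine`, `regressed_woba`, and so on) is hypothetical.

```python
from dataclasses import dataclass

# Illustrative linear weights (an assumption, not the post's coefficients).
WEIGHTS = {"bb": 0.72, "hbp": 0.75, "1b": 0.90, "2b": 1.24, "3b": 1.56, "hr": 1.95}


@dataclass
class BatterLine:
    # Season-to-date counting stats plus fly-ball rate (hypothetical layout).
    pa: int
    ab: int
    singles: int
    doubles: int
    triples: int
    hr: int
    bb: int
    hbp: int
    so: int
    sf: int
    fb_pct: float  # fly balls / batted balls, as a fraction (0.40 = 40%)


def woba(singles, doubles, triples, hr, bb, hbp, pa):
    """Weighted on-base average with PA as a simplified denominator."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    numerator = (
        WEIGHTS["bb"] * bb + WEIGHTS["hbp"] * hbp
        + WEIGHTS["1b"] * singles + WEIGHTS["2b"] * doubles
        + WEIGHTS["3b"] * triples + WEIGHTS["hr"] * hr
    )
    return numerator / pa


def actual_woba(line):
    return woba(line.singles, line.doubles, line.triples,
                line.hr, line.bb, line.hbp, line.pa)


def regressed_woba(line, babip_3yr, hr_fb_3yr):
    """Recompute wOBA with HR and non-HR hits set by 3-year rates."""
    batted_balls = line.ab - line.so + line.sf   # includes home runs
    fly_balls = line.fb_pct * batted_balls
    x_hr = fly_balls * hr_fb_3yr                 # expected home runs
    x_non_hr = babip_3yr * (batted_balls - x_hr)  # expected non-HR hits

    # Keep the hitter's observed 1B/2B/3B mix; all singles if no hits yet.
    observed = line.singles + line.doubles + line.triples
    if observed > 0:
        shares = (line.singles / observed, line.doubles / observed,
                  line.triples / observed)
    else:
        shares = (1.0, 0.0, 0.0)

    x_1b, x_2b, x_3b = (share * x_non_hr for share in shares)
    return woba(x_1b, x_2b, x_3b, x_hr, line.bb, line.hbp, line.pa)


# Made-up stat line, not a real player.
example = BatterLine(pa=68, ab=60, singles=9, doubles=3, triples=0, hr=2,
                     bb=8, hbp=0, so=10, sf=0, fb_pct=0.40)
gap = actual_woba(example) - regressed_woba(example, babip_3yr=0.300, hr_fb_3yr=0.12)
print(f"actual minus regressed wOBA: {gap:+.3f}")
```

Walks, hit-by-pitches and strikeouts are left untouched here, so the only things that move are the hits on balls in play and the home runs, which is the spirit of the tool.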

I'll be doing more with this tool in later posts. For now, let me know what you think and any suggested tweaks.