Is BABIP simply sheer luck?

Batting average on balls in play (BABIP) is one of baseball's greatest mysteries. It has been shown that pitching skill and defense have some effect on BABIP, but their effect pales in comparison to that of sheer luck.

Thomas B. Shea

In my opinion, the hardest thing in baseball to explain is one year of a pitcher's batting average against on balls in play. Yes, explaining what happened in Game One of the 2012 World Series is less difficult.

According to the FanGraphs sabermetric library, the factors that affect BABIP are:

  1. Defense
  2. Luck
  3. Talent Level

Steve Slowinski, the keeper of the aforementioned library, goes on to make this point:

If a pitcher has a very high or very low BABIP, it means that whatever the reason for the spike (whether it’s defense, luck, or slight skill changes), that player will regress back toward their career BABIP rate. BABIP rates are flaky and prone to vary wildly from year to year, so we should always take any extreme BABIP rates with a grain of salt.

For most, the question of how much a pitcher's skill affects BABIP, or how much the defense behind him does, is still up for debate.

Matt Swartz showed, a few years ago, that for starting pitchers who throw a lot of innings, the breakdown of effect from these three factors is as follows:

  1. 12 percent is pitching skill or talent level
  2. 13 percent is team defense
  3. The other 75 percent is pure luck

These results were surprising to me, because while I've always erred on the side of pitching skill having some effect on BABIP, I always figured that team defense had a much greater effect than skill.

My only theory to explain that would be that the more innings a pitcher racks up, the better pitching skill gets at explaining BABIP.

BABIP has always been one of my favorite topics of research, because it is nearly impossible to predict. It takes eight seasons (eight seasons!) for the correlation between past and future BABIP to reach .5, the mark often treated as "true talent level"; that's incredible.

So, I wanted to test these results for myself; however, I decided to expand Swartz's sample and make the test slightly simpler.

One of the major findings of BABIP studies has been the negative relationship between strikeout percentage and BABIP; this dates all the way back to Voros McCracken's DIPS 2.0. Essentially, pitchers who strike out more hitters tend to yield fewer hits on balls in play (higher K%, lower BABIP). There have been other studies that deal with groundball vs. flyball rates, walk percentages, home runs and more, but for this piece I'm just going to focus on strikeouts.
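For readers who want to see the two statistics side by side, here is a minimal sketch of how each is computed from a pitcher's counting stats. It uses the standard public definitions (BABIP as hits minus home runs over at-bats minus strikeouts minus home runs plus sacrifice flies, and K% as strikeouts over batters faced). The function names and the season line are made up for illustration and are not taken from the study.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """BABIP allowed = (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def strikeout_pct(strikeouts, batters_faced):
    """K% = strikeouts / total batters faced."""
    if batters_faced <= 0:
        raise ValueError("no batters faced")
    return strikeouts / batters_faced


# Hypothetical season line, for illustration only (not a real pitcher).
line = {"H": 180, "HR": 20, "AB": 760, "K": 200, "SF": 5, "TBF": 820}

print("BABIP:", round(babip(line["H"], line["HR"], line["AB"], line["K"], line["SF"]), 3))
print("K%:   ", round(strikeout_pct(line["K"], line["TBF"]), 3))
```

Note that strikeouts are removed from the BABIP denominator, so a strikeout never counts as a ball in play; any link between K% and BABIP has to come from what happens on the remaining batted balls.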

I also wanted to test how team defense affects BABIP. I stayed away from defensive efficiency, because that statistic is approximately (1-BABIP), so I'd expect the correlation to be very high. I instead used Ultimate Zone Rating (UZR) scaled to 150 games for each team.

The sample Swartz used for his original study was for starters who threw at least 150 IP in a single season from 2003-2009.

My sample included any pitcher who threw at least 20 IP in a season from 2003-2012 (n=4214).

Swartz's sample will obviously have a better fit, as BABIP over a minimum of 150 IP is far less noisy than BABIP over a minimum of 20 IP, but I wanted to include everyone just to see what the results would be.

The Results:

K% vs. BABIP:


UZR/150 vs. BABIP:


For both linear regressions, we expected to see a negative relationship: higher strikeout percentages were supposed to lead to lower BABIPs, and higher team UZR/150s were supposed to lead to lower BABIPs as well.

Both regressions did result in a statistically significant negative relationship.
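As a rough illustration of what a one-variable regression like these produces, the sketch below fits an ordinary least squares line and reports the slope and r-squared. The data are synthetic, generated with a weak negative relationship built in, and the helper name `simple_ols` is made up for this example; it is not the code or the data behind the charts above.

```python
import random


def simple_ols(x, y):
    """Return (slope, intercept, r_squared) for the line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic pitcher seasons (assumption: invented numbers, not FanGraphs data).
rng = random.Random(0)
k_pct = [rng.uniform(0.10, 0.30) for _ in range(2000)]
babip_vals = [0.310 - 0.10 * k + rng.gauss(0, 0.03) for k in k_pct]

slope, intercept, r2 = simple_ols(k_pct, babip_vals)
print(f"slope={slope:.3f}  intercept={intercept:.3f}  r^2={r2:.4f}")
```

A negative slope with a small r-squared is the pattern to look for: the direction of the relationship is clear, but most of the spread in BABIP is left unexplained.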

I also tested to see how well random numbers would correlate to BABIP, just to make sure we would not randomly find a negative correlation. This test resulted in no relationship.
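A control like this can be sketched in a few lines: pair each BABIP with a random number that has nothing to do with it and check that r-squared comes out near zero. The BABIP values here are simulated (mean and spread are assumptions chosen only to look plausible), and `r_squared` is a helper written for this example.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


rng = random.Random(42)
n = 4214  # same size as the sample described above

# Simulated BABIPs (assumption: normal with mean .300 and sd .040).
babip_vals = [rng.gauss(0.300, 0.040) for _ in range(n)]
# A predictor that is pure noise by construction.
noise = [rng.random() for _ in range(n)]

print(f"r^2 of random predictor vs. BABIP: {r_squared(noise, babip_vals):.5f}")
```

With a few thousand observations, a pure-noise predictor should land at an r-squared very close to zero, which gives a baseline for judging the small but real values found for K% and UZR/150.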

Interestingly, as indicated by the r-squareds, strikeout rate explained more of the variation in BABIP than team defense did for this sample. Strikeout percentage explained 3.89 percent of the variance in BABIP, while team UZR/150 explained only 1.80 percent.
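It may seem odd that relationships this weak can still be statistically significant. A quick back-of-the-envelope check shows why: in a one-predictor regression, the t statistic for the slope can be recovered from r-squared and the sample size. The sketch below assumes all 4,214 pitcher seasons were used in each regression, and it reports only the magnitude of t (the sign follows the negative slope).

```python
import math


def t_stat_from_r_squared(r_squared, n):
    """Magnitude of the slope's t statistic in a one-predictor regression:
    |t| = sqrt(r^2 * (n - 2) / (1 - r^2))."""
    return math.sqrt(r_squared * (n - 2) / (1.0 - r_squared))


n = 4214  # sample size reported above (assumed to apply to both regressions)
for label, r2 in [("K%", 0.0389), ("UZR/150", 0.0180)]:
    print(f"{label}: |t| is roughly {t_stat_from_r_squared(r2, n):.1f}")
```

Both values sit far above the usual cutoff of about 2, so a large sample is enough to make even a small r-squared clearly distinguishable from zero.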

This result was really surprising for me personally, because as I stated earlier, I figured that defense had a much larger effect on BABIP than actual pitcher skill, in this case the ability to strike batters out.

There is one major issue with this finding, though.

The assumption for this test is that a defense plays exactly the same night in and night out behind each individual pitcher; that obviously is not true.

There was a thread over at the Book Blog a few months ago in which Tom Tango and MGL discussed the different UZRs that the Tigers defense posted behind individual starters (Justin Verlander, Max Scherzer, Rick Porcello, etc.).

Tango sums up this interesting paradox of sorts much better than I could with this quote:

So, if you are trying to use a team’s overall fielding as an INDICATOR as to how much fielding support an individual pitcher received: don’t! It’s very well possible that if you see a pitcher with a low BABIP on a bad-fielding team that he could have still gotten GOOD fielding behind him. Just as you wouldn’t presume a good hitting team provides good run support to all its pitchers, or a good bullpen helps out all of its pitchers, then neither should you presume that having a good set of fielders behind you means that ended up receiving good fielding support.

This statement could be a good indication as to why we are seeing such a weak relationship between overall team defense and individual pitcher BABIPs.

The problem is: how much can we trust the team defensive numbers put up behind individual pitchers?

Does a team actually play better defense behind their ace? Or does their ace induce balls in play that are easier to field?

I honestly don't think we have the answer to that question.

Finally, I ran a multiple regression to see how these two statistics fared at explaining BABIP together.

Together they explained 5.53 percent of the variance in BABIP, which leaves a lot to be desired (almost 95 percent of the variance is left unexplained, as opposed to the 75 percent Swartz found).
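For a regression with exactly two predictors, the combined R-squared can be written in terms of three pairwise correlations, which also shows why the combined figure does not have to equal the simple sum of the two individual r-squareds: any correlation between the predictors themselves shifts the total. The sketch below uses that textbook identity on synthetic data; the function names and all numbers are invented for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_predictor_r_squared(y, x1, x2):
    """R^2 of y regressed on x1 and x2 (with intercept), from pairwise correlations."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Synthetic pitcher seasons (assumption: invented, not the study's data).
rng = random.Random(1)
k_pct = [rng.uniform(0.10, 0.30) for _ in range(3000)]
uzr150 = [rng.gauss(0.0, 5.0) for _ in range(3000)]
babip_vals = [0.310 - 0.10 * k - 0.0008 * u + rng.gauss(0, 0.03)
              for k, u in zip(k_pct, uzr150)]

print(f"K% alone:      {pearson_r(babip_vals, k_pct) ** 2:.4f}")
print(f"UZR/150 alone: {pearson_r(babip_vals, uzr150) ** 2:.4f}")
print(f"Both together: {two_predictor_r_squared(babip_vals, k_pct, uzr150):.4f}")
```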

While these two statistics don't reach Swartz's 25 percent number, I still think the results are interesting. I did not test everything that could have an effect on BABIP outside of luck, yet simply using strikeouts and team UZR for this study seems to provide evidence that not all of BABIP is sheer luck.

All statistics come courtesy of FanGraphs.

You can follow Glenn on Twitter @Glenn_DuPaul