One of the most basic axioms of sabermetrics is that batting average isn’t the definitive offensive statistic, and that it was misused as such for decades. It is highly unsuitable for that sort of catch-all role, as it does not account for walks or extra-base hits, and it uses at-bats as its denominator instead of plate appearances, just to name a few of the most glaring flaws.
More and more people are recognizing that it’s possible to be an effective offensive player while hitting .240, or to be an ineffective one while hitting .295. However, batting average is still commonly taken as a proper measure of contact ability, or the "hit tool," and I would dispute its suitability for that role as well. To give an extreme example, Jai Miller struck out 37.7% of the time in Triple-A last year, yet he hit .263. The strikeout rate indicates his contact skills are almost nonexistent, whereas the batting average paints him as a middle-of-the-road contact guy. Mike Napoli hit .320 last year; his teammate Ian Kinsler hit .255. Does anyone think Napoli is the superior contact hitter?
None of this should be news to you—after all, you’re reading Beyond the Boxscore! But with the mandatory average-bashing out of the way, I want to take a look at what batting average can tell us.
First off, it should be stated that batting average, like all the other stats sabermetricians love to hate (W-L, RBI, SV, even ERA), does not mean nothing. If you hit for a high average, chances are you’re a good hitter, and if you hit for a low average, chances are you’re a poor hitter. Likewise, there is no doubt a positive correlation between wins for starting pitchers and their underlying performance (FIP, SIERA, or whatever else you care to use).
The rub, of course, comes from the fact that we simply have better numbers to look at. On-base percentage is better than batting average at measuring the frequency of successful plate appearances, strikeout rate is better at measuring the ability to make contact, and stats like wOBA and wRC+ are much better at gauging a hitter’s overall effectiveness. With those stats readily available, there’s rarely much of a need to cite batting average.
The other problem, though, is that most people conceptualize batting average in this way:
AVG = H/AB
Both the numerator and the denominator here are flawed. A hit can be anything—a slow roller out in front of the plate, a blooper, a screaming liner, or a towering homer. At-bats exclude walks, hit-by-pitches, sacrifice bunts, and sacrifice flies (Quick aside: Shouldn’t a fly ball that advances a runner from first to second or second to third also not be an at-bat, if one that scores a runner from third isn’t?). Since there’s no one skill that exemplifies the ability to get hits and there’s no accounting for these various other outcomes, it’s easy to write batting average off as having little sabermetric value.
You’re probably thinking "Wait, Nathaniel, what other way to look at batting average is there, besides hits divided by at-bats?" How about this…
AVG = (HR + BABIP * (AB - HR - K)) / AB
This still has the denominator issue, and there’s obviously nothing we can do about that, but the numerator here is kind of intriguing, no? We have home run ability, batting average on balls in play ability, and contact ability all rolled into one stat. BABIP ability, of course, is itself a complex mixture of speed, line-drive ability, and luck.
If you always think of batting average in this fashion, it becomes much more useful, as a measure of the various things a batter is able to do when he swings (with the exception of strikeouts looking, which still factor in, and sac flies, which don’t). He can hit it out of the park, put it in play, or strike out.
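As a minimal sketch of that identity, the snippet below rebuilds a batting average from its components and checks it against plain H/AB. The function names and the season line are hypothetical, invented purely for illustration. Note that the standard BABIP definition also adds sacrifice flies to its denominator; this sketch drops that term so it matches the simplified formula above.

```python
def component_avg(hr, babip, ab, k):
    """Batting average rebuilt from home runs, BABIP, and strikeouts.

    Assumes balls in play = AB - HR - K (sacrifice flies ignored).
    """
    balls_in_play = ab - hr - k
    return (hr + babip * balls_in_play) / ab


def simple_babip(hits, hr, ab, k):
    """BABIP without the sacrifice-fly term, matching the formula above."""
    return (hits - hr) / (ab - hr - k)


# Hypothetical season line, not a real player.
hits, hr, ab, k = 150, 20, 550, 110

babip = simple_babip(hits, hr, ab, k)
rebuilt = component_avg(hr, babip, ab, k)

print(f"H/AB    = {hits / ab:.3f}")
print(f"rebuilt = {rebuilt:.3f}")
```

Under that no-sac-fly assumption the two numbers agree exactly, which is the point: the component form is the same statistic, just written so that each skill is visible.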
It’s useful to keep this more complicated expression of batting average in mind when projecting players, in particular. Take, for example, Boston rookie third baseman Will Middlebrooks. In his minor league career, he struck out 26.3% of the time while walking just 7.5%, and the K/BB ratio tended to get worse as he advanced. Entering the 2012 season, I thought it was fair to look at Middlebrooks as a guy who would put up around a 5% walk rate in the majors to go with a 22.5% strikeout rate and 20 HR/600 PA. So, if he received 600 major league plate appearances, I would’ve done something like this.
AVG = (20 HR + .315 BABIP * (560 AB - 20 HR - 135 K)) / 560 AB
That’s a batting average of .264. Add in the 5% walk rate, and you get an OBP just under .300. It’s tough to be a star with those numbers without hitting absurd numbers of homers and playing absolutely exceptional defense at third base.
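For anyone who wants to check the arithmetic, here is a rough sketch of that projection. The helper name `component_avg` is made up for this example, and the OBP line rests on a simplifying assumption that is not in the original: the 10 plate appearances that are neither at-bats nor walks (sacrifices, hit-by-pitches, and so on) are treated as never reaching base.

```python
def component_avg(hr, babip, ab, k):
    """AVG = (HR + BABIP * (AB - HR - K)) / AB."""
    return (hr + babip * (ab - hr - k)) / ab


pa, ab = 600, 560
hr, k, babip = 20, 135, 0.315   # 135 K is 22.5% of 600 PA
walks = 30                      # 5% of 600 PA

avg = component_avg(hr, babip, ab, k)
hits = avg * ab

# Simplified OBP: only hits and walks count as times on base (assumption).
obp = (hits + walks) / pa

print(f"AVG {avg:.3f}")
print(f"OBP {obp:.3f}")
```

Crediting a few of those leftover plate appearances as hit-by-pitches would nudge the OBP up slightly, but not enough to change the conclusion.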
Middlebrooks went on an early-season tear in Triple-A, though, ripping nine homers and striking out just 18% of the time while walking 7%. Perhaps we should increase his "ceiling" in the majors to 30 homers and a 20% K rate, which would give him this (plug in whatever BABIP you want, of course; I chose .315 as a solidly optimistic number):
AVG = (30 HR + .315 BABIP * (560 AB - 30 HR - 120 K)) / 560 AB
That’s a .284 average, which would make a low walk rate far more palatable. And, of course, an increase of 10 HR/600 PA provides a pretty dramatic slugging boost as well. That’s a hidden bit of value home runs have—they boost power production, but they also are a means to get hits/get on base without having to deal with the vagaries of balls in play.
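Since the BABIP input is a judgment call, it can help to see both Middlebrooks scenarios across a range of values. The sketch below does that; the scenario labels and the BABIP values other than .315 are my own illustrative choices, not projections from anywhere in particular.

```python
def component_avg(hr, babip, ab, k):
    """AVG = (HR + BABIP * (AB - HR - K)) / AB."""
    return (hr + babip * (ab - hr - k)) / ab


AB = 560
scenarios = {
    "baseline": {"hr": 20, "k": 135},
    "ceiling": {"hr": 30, "k": 120},
}
babips = [0.290, 0.300, 0.315, 0.330]  # illustrative range; plug in your own

table = {
    name: [component_avg(s["hr"], b, AB, s["k"]) for b in babips]
    for name, s in scenarios.items()
}

print("   BABIP: " + "  ".join(f"{b:.3f}" for b in babips))
for name, avgs in table.items():
    print(f"{name:>8}: " + "  ".join(f"{a:.3f}" for a in avgs))
```

The gap between the two rows stays roughly constant across the BABIP range, because turning a ball in play into a home run adds (1 - BABIP) / AB to the average no matter where BABIP lands.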
For another case study of sorts, let’s look at Billy Hamilton, the fastest man in baseball. Hamilton is currently hitting .382 in High-A. But Hamilton has just one home run in 118 PA, and his strikeout rate of 15.3% is only slightly better than average. How is he hitting for such a stratospheric average? Blazing speed, a friendly environment, and some luck: in other words, BABIP. Hamilton’s BABIP right now is .452.
As I’ve broken down in previous posts, it’s far, far easier for hitters to have extremely high BABIPs in the minors than in the majors, for a variety of reasons. For prospects, then, having a BABIP-dependent average is a bit of a red flag, because you know that BABIP will come down in the big leagues. Hamilton might be a bit of an exception due to his generational speed, but note that even Ichiro, a speed demon and master of BABIP himself, had a .351 career mark entering the season.
This component-based look at batting average, when taken in conjunction with the above thoughts on minors-to-majors BABIP translations, also raises questions about someone like Nick Castellanos. In his 169-game career, the Tigers’ consensus top position player prospect has hit for a robust .330 average. However, he has just nine career home runs, 155 strikeouts, and 58 walks. The average, therefore, comes mostly on the back of a .434 career BABIP. Unlike Hamilton, Castellanos lacks blazing speed or the sort of great contact ability (see: the K rate) that would portend obvious BABIP success, though his high doubles totals do point to some line drive prowess. Let’s be kind and project him for a .320 BABIP in the majors.
AVG = (9 HR + .320 BABIP * (640 AB - 9 HR - 155 K)) / 640 AB
It comes out to .252. And that’s with the high BABIP projection and assuming that his strikeout rate doesn’t erode at all on his way up the ladder from A-ball.
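The same formula can be turned around to ask what BABIP a hitter would need to reach a given average with his current home run and strikeout totals. The sketch below does that for the Castellanos line above; the `babip_needed` helper and the .300 target are hypothetical choices of mine, used only to illustrate the inversion.

```python
def component_avg(hr, babip, ab, k):
    """AVG = (HR + BABIP * (AB - HR - K)) / AB."""
    return (hr + babip * (ab - hr - k)) / ab


def babip_needed(target_avg, hr, ab, k):
    """Solve the component formula for BABIP given a target average."""
    return (target_avg * ab - hr) / (ab - hr - k)


hr, ab, k = 9, 640, 155

projected = component_avg(hr, 0.320, ab, k)
needed = babip_needed(0.300, hr, ab, k)  # .300 is an arbitrary target

print(f"Projected AVG at a .320 BABIP: {projected:.3f}")
print(f"BABIP needed for a .300 AVG:   {needed:.3f}")
```

With so few home runs and that many strikeouts, nearly all of the average has to come from balls in play, so the required BABIP sits well above the .320 projection.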
That’s not to say Nick Castellanos is a bad prospect, of course. He’s 20 years old and has shown good skills for his age, and he projects to improve in several areas moving forward. But this more in-depth look at batting average reveals that it may be incorrect to jump to conclusions about his ability to hit for average in the future. He will need to develop more over-the-fence power or get a better handle on the strike zone if he is to become an impact player, because the skills he has relied on to succeed to this point will significantly erode against more advanced pitchers. One could voice similar concerns about BABIP-dependent prospects like Starling Marte of the Pirates and Brett Jackson of the Cubs, though at least both of them have succeeded in the upper minors and show off more present power and speed, not to mention their defensive value.
These are just a few of the examples where batting average becomes much more interesting when broken down into its components, rather than simply being treated as H/AB. Of course, merely examining the components themselves is almost as good, but batting average does do a tidy job of combining them into one piece. If you squint hard enough, and you understand the limitations, it can give some solid perspective.