I did a post about this last night, and having received a useful tip on how to better study this (thanks metric!), I'm coming back to it and actually presenting some results. The basic idea here is that I was talking with my dad about hitting .400, and I was working with the assumption that a player is more likely to do it if he can consistently outhit his BABIP, that is, post a batting average higher than his BABIP. That got me thinking about what sorts of players can outhit their BABIP.
Because strikeouts make BA lower than BABIP, and home runs make it higher, I figured I'd be looking for guys with lots of power who don't strike out much, which is a pretty rare combination. Sure enough, guys who fit this description like Pujols, Ted Williams, etc., show up. But, as you'll see if you check out the list I made, it also includes power guys who do strike out a lot, like Mark McGwire and Harmon Killebrew, among others. What's going on here?
Well, to try to figure this out, I did some back-of-the-envelope math. If you define BABIP as (H-HR)/(AB-K-HR), then a little algebraic manipulation will show you that, to have BA > BABIP, the following must be true:
(1/BA)-1 > K/HR
(Note that Fangraphs BABIP actually isn't defined that way; it includes sac flies, but those don't make a huge difference and I'll ignore them for now.)
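For anyone who wants the "little algebraic manipulation" spelled out, here is one way to get there. This is only a sketch under the simplified definition above (no sac flies), and it assumes H > 0, HR > 0, and AB - K - HR > 0 so that nothing flips sign when multiplying or dividing:

```latex
\begin{align*}
\frac{H}{AB} &> \frac{H - HR}{AB - K - HR} \\
H\,(AB - K - HR) &> AB\,(H - HR) \\
-H \cdot K - H \cdot HR &> -AB \cdot HR \\
HR\,(AB - H) &> H \cdot K \\
\frac{AB - H}{H} &> \frac{K}{HR} \\
\frac{1}{BA} - 1 &> \frac{K}{HR}
\end{align*}
```

The last step just uses AB/H = 1/BA.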
This makes sense; you want to minimize the term on the right, which is done by reducing strikeouts and increasing home runs. However, don't ignore the term on the left. Increasing that term will also help you, and as you can see, the lower your batting average is, the higher the term on the left is. Paradoxically, the lower your batting average is, the easier it is to have a BA > BABIP.
Let's see the effect in action. Consider two St. Louis Cardinals first basemen: Mark McGwire and Albert Pujols. Big Mac had a .263 career batting average. Using the inequality above, the left side becomes (1/.263)-1 = 2.80. So, Mac needed to strike out fewer than 2.80 times for every home run to outhit his BABIP. He struck out 1,596 times in his career, which is a copious amount, good for 29th all time. However, with 583 jacks, that works out to only 2.74 strikeouts per home run, which allowed him to outhit his BABIP. Pujols, on the other hand, has a .334 career average (as of this writing, according to Fangraphs), which means he has to strike out fewer than (1/.334)-1 = 1.99 times per home run to outhit his BABIP. Impressively, he's done this, but clearly he has to make better contact to outhit his BABIP than McGwire did.
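If you'd rather let a computer do the arithmetic, here is a minimal Python sketch of the same check. The function names are my own invention for illustration, and it uses the simplified BABIP definition from above (no sac flies), so treat it as a rough check rather than a match for Fangraphs' numbers:

```python
def max_k_per_hr(ba):
    """Left side of the inequality: the K/HR ratio a hitter must stay under."""
    return 1.0 / ba - 1.0


def outhits_babip(ba, strikeouts, home_runs):
    """True if BA > BABIP under the simplified (no sac fly) definition."""
    if home_runs == 0:
        # With no homers, BABIP = H / (AB - K), which can never be below BA.
        return False
    return max_k_per_hr(ba) > strikeouts / home_runs


# McGwire's career numbers as quoted above
print(round(max_k_per_hr(0.263), 2))    # 2.8
print(round(1596 / 583, 2))             # 2.74
print(outhits_babip(0.263, 1596, 583))  # True

# Pujols' threshold at a .334 average
print(round(max_k_per_hr(0.334), 2))    # 1.99
```

Plugging in lower and lower averages is a quick way to watch the threshold climb, which is the paradox described above.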
Because of this, the list seems to split into two main types of players: the Killebrew and McGwire types, who make the list through a combination of huge power and a ridiculously low batting average; and the Williams and Pujols types, who make the list with the rare combination of contact ability and power.
Here are the other interesting things I noticed about the list:
- I count 99 players with at least 1000 PAs in MLB history who have outhit their BABIP. Note, however, that Fangraphs doesn't seem to have BABIP data for several seasons of players from the late-19th/early 20th centuries, so I had to purge these players; this purge removed players like Honus Wagner, Nap Lajoie, and Joe Jackson, among others, so it's possible the 99 is an undercount.
- These players are almost all very good offensive players; their average wRC+ is 129.5. This is unsurprising, considering that having power and not striking out much (or striking out frequently but having tons of power) should tend to make one a good hitter. There are only 6 players on the list with a wRC+ below 100; these are mostly players with low averages, middling power and very few strikeouts, although there's also Don Mueller, with a decent average and virtually no strikeouts, and Darrin Fletcher, who seems to have had good power but just couldn't draw a walk.
- As noted above, players who strike out a lot typically make the list by having a low batting average. That said, to every rule there's an exception, and although Babe Ruth has the 5th-highest K% on the list, he hit .342 for his career, the 3rd-highest average on the list. He's the only one in the top 10 of K% with a .300 average for his career, although Barry Bonds and Albert Belle come close, with .298 and .295 averages, respectively. Babe Ruth was really good.
- I said earlier that it doesn't make a big difference that Fangraphs includes sac flies in its calculation of BABIP and I don't. One notable difference it does make here is that Carlos Lee makes the list, but if you calculate BABIP as I did, he actually shouldn't; his 55 career sac flies lower his Fangraphs BABIP to the level of his BA. So, another corollary to this is that a player who has an above-average number of sac flies for some reason (flyball hitter on a team with a high OBP?) will also be capable of showing up on this list, but I imagine the effect is mostly going to be dominated by home runs and strikeouts. EDIT: after I wrote this, I had a hunch and looked up Graig Nettles. Fangraphs doesn't have batted ball data for him, but between his home run totals and low BABIP, I wouldn't be surprised if he were a flyball hitter, and he played on late-70s/early-80s Yankees teams with plenty of baserunners. Sure enough, he's another one for whom the sac flies push his BABIP below his BA.
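To make the sac fly effect concrete, here is a small Python sketch comparing the two BABIP definitions. The with-sac-flies version adds SF to the denominator, which is how I understand Fangraphs computes it. The stat line is entirely made up for illustration (it is not Carlos Lee's or Nettles' actual line), and the function names are hypothetical:

```python
def babip_simple(h, hr, ab, k):
    """BABIP as defined in this post: sac flies ignored."""
    return (h - hr) / (ab - k - hr)


def babip_with_sf(h, hr, ab, k, sf):
    """BABIP with sac flies counted as balls in play (added to the denominator)."""
    return (h - hr) / (ab - k - hr + sf)


# Hypothetical season, numbers invented purely for illustration
h, hr, ab, k, sf = 150, 30, 500, 72, 5

ba = h / ab
print(round(ba, 3))                                # 0.3
print(round(babip_simple(h, hr, ab, k), 3))        # 0.302
print(round(babip_with_sf(h, hr, ab, k, sf), 3))   # 0.298
```

In this invented case the hitter falls short of his BABIP by the simple definition but outhits it once the five sac flies are counted, which is the same kind of flip described above.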