The Relative Win Value of Preventing Hits on Balls in Play: A Paper by Cyril Morong

Cyril Morong agreed to help out with some projects BtB has coming up in the future, as well as contribute to the site itself on occasion. His first piece will be posted today. Enjoy!

In 1999 Voros McCracken put forth the idea that pitchers don't differ very much in the batting average (AVG) they allow on balls in play (BIP) and that what really matters is their walks, strikeouts and home runs. But given that about 70% of plate appearances end with a ball in play, even a small difference between pitchers in the AVG allowed on BIP could mean many more hits allowed (or not allowed). Since hits have run values (which in turn have win values), the important question is how many extra wins (or losses) a pitcher gets due to his AVG on BIP. Also important is how this compares to the wins and losses he generates with his walks, strikeouts and home runs. I probably got the idea for this analysis from reading the work of Mike Emeigh and Tom Tippett. Any mistakes or flaws in the analysis are due to me, of course.

To determine this, I looked at the top 100 pitchers in batters faced (BFP) from 1994-2004 and obtained their relevant data from The Lee Sinins Sabermetric Encyclopedia. Then I found how much each pitcher differed from the "replacement level" pitcher in walks, strikeouts, home runs and hits allowed on BIP (see end note 1). These differences were converted to run values and then win values.

To come up with replacement-level values for walks, strikeouts, home runs and hits allowed on BIP, I found the 500 pitchers with the fewest BFP from 1994-2004. Since low quality pitchers don't get used much, this is a reasonable way to approximate the replacement level pitcher (see end note 2). Here are the rates (per BFP, or per BIP for BIP AVG) for all of these pitchers combined:

BIP AVG = .321
HR = .0342
SO = .141
BB = .138

For the top 100 pitchers:

BIP AVG = .293
HR = .0268
SO = .172
BB = .087

(I included HBP in BBs).

The next step was to determine how many more (or fewer) of each event each pitcher had than the replacement level, then convert that to a run value and then a win value.

The run values of these events were:

HRs = 1.4
BBs = .33
Ks = -.2
Hits on BIP = .55

The first two are the linear weights run values used by Pete Palmer, editor of the Baseball Encyclopedia. The -.2 for strikeouts comes from various estimates I have seen elsewhere on the internet (see end note 3). The .55 for hits on BIP is a weighted average of the following linear weights values:

1B = .47
2B = .78
3B = 1.09

They are weighted by their relative frequency in the 1994-2004 period.

Once I found how many more (or fewer) of each stat each pitcher had than the replacement level pitcher, I converted that into a run value and then a win value for a typical full season for a starting pitcher: 225 innings pitched (IP), or 25 complete games (CG).

Take Pedro Martinez, for example. He had 1,306 more SOs than the replacement level pitcher would have had with the same BFP. That has a run value of -261.2 (which is 1,306*-.2), meaning he saved that many runs by striking out more batters. I then divided that by the number of batters he faced (8,714). The result (about .030 runs saved per batter) was then multiplied by 40, the number of batters I assume a pitcher faces in a 9-inning game, to get how many runs he would save in a typical game through his high strikeout total. This was about 1.199. In a season of 25 CGs, it would be about 29.975 runs. Using an estimate that it takes about 10 runs over the course of a season to win one additional game for a team, I estimate that Martinez adds 3.00 wins a year from his strikeouts (since 29.975/10 = 2.9975).
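The arithmetic in the Martinez example can be written as a small function. This is only a sketch of the steps described above: the function name `wins_from_event_diff` and its parameter names are mine, not the author's, and the defaults (40 batters per game, 25 games, 10 runs per win) are the assumptions stated in the text.

```python
def wins_from_event_diff(extra_events, run_value, batters_faced,
                         batters_per_game=40, games_per_season=25,
                         runs_per_win=10):
    """Season wins added by having `extra_events` more than replacement level."""
    # Runs allowed relative to replacement (negative means runs saved).
    runs_allowed = extra_events * run_value
    runs_saved_per_batter = -runs_allowed / batters_faced
    runs_saved_per_game = runs_saved_per_batter * batters_per_game
    runs_saved_per_season = runs_saved_per_game * games_per_season
    return runs_saved_per_season / runs_per_win


# Pedro Martinez: 1,306 more strikeouts than replacement over 8,714 batters faced.
so_wins = wins_from_event_diff(1306, -0.2, 8714)
print(round(so_wins, 2))  # prints 3.0
```

Walks and home runs go through the same function with positive run values (.33 and 1.4), so allowing fewer of them than replacement (a negative `extra_events`) also comes out as positive wins.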

The same thing was done for BBs and HRs. For BIP, I assumed that the replacement AVG on BIP was .321. Then I calculated each pitcher's AVG on BIP, found the difference from .321, and multiplied that by the pitcher's BIP to get how many more (or fewer) hits he allowed on BIP. This was ultimately converted to a win value the same way the other stats were. For example, Martinez had 5,350 BIP and a BIP AVG of .279. His difference from the replacement level was .042. Multiplying that by 5,350 gives 224.7. So he gave up 224.7 fewer hits on BIP than the replacement pitcher would have. Multiplying that by a run value of .55 for hits on BIP gives a runs saved total of 122.84. That gets divided by Martinez's total BFP, then multiplied by 40 and so on, just like in the previous paragraph. That means about 1.41 wins a season.
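Here is a sketch of the BIP version of the calculation, using only the rounded figures printed above. The function and constant names are my own. Because the inputs are rounded, the result lands near 1.42 rather than exactly on the article's 1.41 (and the runs saved come to about 123.6 rather than 122.84), presumably because the article works from unrounded values.

```python
REPLACEMENT_BIP_AVG = 0.321  # replacement-level AVG on balls in play (from the text)
HIT_ON_BIP_RUN_VALUE = 0.55  # rounded run value of a hit on a ball in play


def bip_wins(bip, bip_avg, batters_faced,
             batters_per_game=40, games_per_season=25, runs_per_win=10):
    """Season wins from allowing fewer hits on BIP than a replacement pitcher."""
    hits_saved = (REPLACEMENT_BIP_AVG - bip_avg) * bip
    runs_saved = hits_saved * HIT_ON_BIP_RUN_VALUE
    runs_saved_per_game = runs_saved / batters_faced * batters_per_game
    return runs_saved_per_game * games_per_season / runs_per_win


# Pedro Martinez: 5,350 BIP, .279 BIP AVG, 8,714 batters faced.
martinez_hits_saved = (REPLACEMENT_BIP_AVG - 0.279) * 5350
martinez_bip_wins = bip_wins(5350, 0.279, 8714)
print(round(martinez_hits_saved, 1))  # prints 224.7
print(round(martinez_bip_wins, 2))
```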

Then, for each pitcher, I added up the wins from each of the four stats: walks, strikeouts, home runs and hits allowed on BIP. That gave a total number of wins above replacement level. The ranking is given below. Pedro Martinez was 8.48 wins above the replacement level (see end note 4). The BIP Share shows the percentage of a pitcher's value that comes from preventing hits on balls in play. Martinez had 1.41 BIP Wins, which was 16.6% of his total of 8.48. The last column shows each pitcher's BIP AVG, the average he allowed on balls in play.
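The BIP Share is just a ratio of two win totals. A minimal sketch follows; the function name `bip_share` is hypothetical, and the guard against a non-positive total is my own addition, reflecting the concern raised in end note 1.

```python
def bip_share(bip_wins, total_wins):
    """Fraction of total wins above replacement that comes from BIP wins."""
    if total_wins <= 0:
        # A share of a zero or negative total is not meaningful.
        raise ValueError("total_wins must be positive")
    return bip_wins / total_wins


# Pedro Martinez: 1.41 BIP Wins out of 8.48 total wins above replacement.
martinez_share = bip_share(1.41, 8.48)
print(f"{martinez_share:.1%}")  # prints 16.6%
```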

For these 100 pitchers, collectively, about 25.1% of their value comes from having a lower BIP AVG than the replacement level pitchers. Here is the share that comes from each of the four stats:

BB 37.6%
HR 23.6%
SO 13.7%
BIP 25.1%
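As a rough cross-check, these shares can be approximated from the group rates and run values listed earlier. This is my own back-of-the-envelope sketch, not the author's method (the article works from each pitcher's individual data), and it assumes that balls in play per batter faced equal 1 minus the HR, SO and BB rates. Under that assumption each share comes out within a percentage point of the figures above.

```python
# Rates from the text: replacement level vs. the top 100 pitchers.
replacement = {"bip_avg": 0.321, "hr": 0.0342, "so": 0.141, "bb": 0.138}
top100 = {"bip_avg": 0.293, "hr": 0.0268, "so": 0.172, "bb": 0.087}
run_value = {"hr": 1.4, "bb": 0.33, "so": -0.2, "bip": 0.55}

# ASSUMPTION (mine): balls in play per batter faced = 1 - HR - SO - BB rates.
bip_per_bfp = 1 - top100["hr"] - top100["so"] - top100["bb"]

# Runs saved per batter faced relative to replacement, by stat.
runs_saved = {
    stat: (replacement[stat] - top100[stat]) * run_value[stat]
    for stat in ("hr", "so", "bb")
}
runs_saved["bip"] = ((replacement["bip_avg"] - top100["bip_avg"])
                     * bip_per_bfp * run_value["bip"])

total = sum(runs_saved.values())
shares = {stat: 100 * runs / total for stat, runs in runs_saved.items()}
for stat, pct in shares.items():
    print(f"{stat.upper():>3} {pct:.1f}%")
```

The assumed ball-in-play rate works out to about 71% of batters faced, in line with the "about 70%" figure cited at the start of the article.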

So it appears that quite a high share of a pitcher's value comes from preventing hits on balls in play. The top 10 shares from preventing hits on BIP were:

Pitcher BIP Share
Rick Helling 0.562
Brian Anderson 0.555
Tim Belcher 0.498
Tim Wakefield 0.448
Eric Milton 0.430
Woody Williams 0.418
Jamie Moyer 0.411
Pat Hentgen 0.408
James Baldwin 0.406
Kirk Rueter 0.405

The 10 lowest BIP AVGs were:

Pitcher BIP AVG
Jamie Moyer 0.273
Woody Williams 0.274
Russ Ortiz 0.275
Tim Wakefield 0.276
Kerry Wood 0.277
Eric Milton 0.277
Chan Ho Park 0.277
Greg Maddux 0.279
Tim Hudson 0.279
Freddy Garcia 0.279

Now some pitchers may have played on teams with above-average fielders, and it may have been the fielders (not the pitchers' inherent ability) that prevented hits on balls in play. How much difference would something like this make? Let's take Jamie Moyer, who had the lowest BIP AVG at .273 and got 41.1% of his value from preventing hits on BIP. He played a significant part of this period on the Mariners. Although the Mariners have a reputation for being a great defensive team, their BIP AVG from 1997-2004 was just .006 below the major league average for the period (.289 vs. .295). If we raised Moyer's BIP AVG to .279 (an additional .006), he would have allowed 1,854 hits on BIP (since 6,646*.279 = 1,854). Calculating the number of wins above the replacement level this would have given him (using the method described above), he would have had 1.75 BIP wins.

This would still give him a high BIP Share. He initially had 4.87 total wins (see his ranking at #26 above). But if he only had 1.75 BIP wins instead of 2.00, his total would fall to 4.62. Even so, 1.75 BIP wins is still about 38% of his total value. And if I assumed he gave up hits on BIP at the major league rate of .295, he would have had 1.09 BIP wins. That would drop his total to 3.96 wins. The 1.09 BIP wins is still 27.5% of his overall value.
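The Moyer adjustment swaps one BIP win figure for another and recomputes the share. A small sketch, with a function name of my own choosing and the win figures taken from the two paragraphs above:

```python
def adjusted_total_and_share(total_wins, old_bip_wins, new_bip_wins):
    """Replace the BIP component of total wins and recompute the BIP share."""
    new_total = total_wins - old_bip_wins + new_bip_wins
    return new_total, new_bip_wins / new_total


# Jamie Moyer: 4.87 total wins, of which 2.00 are BIP wins.
scenarios = (("team-adjusted (.279)", 1.75), ("league average (.295)", 1.09))
for label, new_bip in scenarios:
    new_total, share = adjusted_total_and_share(4.87, 2.00, new_bip)
    print(f"{label}: total {new_total:.2f} wins, BIP share {share:.1%}")
```

With these rounded inputs the first share comes out just under 38% and the second at 27.5%.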

A similar story could be told about other pitchers on the list. Woody Williams pitched for 3 different teams over the period, so help from his fielders probably does not account for enough of his BIP Share to make it insignificant. Maddux and Glavine both pitched mainly for the Braves, who had a BIP AVG of .285 from 1994-2003. Adding 10 points to their individual BIP AVGs would certainly lower their BIP Shares, but, as with Moyer, they would still be significant. Chan Ho Park pitched mainly with the Dodgers, who had a good BIP AVG of .285. But that is not going to be low enough to make his BIP Share insignificant. For Kerry Wood, the Cubs had average fielding with a .296 BIP AVG. Tim Wakefield is up there, but it is generally acknowledged that knuckleball pitchers do well in BIP AVG.

Concerns and Conclusions

I did not take park effects into account. But doing so is probably not going to lower the BIP Shares of very many pitchers to an insignificant level. I checked Rick Helling: when I lowered his HRs allowed by 10%, he still had a BIP Share over 40%. And some pitchers would see their BIP Share rise if I took park effects into account. Remember that, collectively, 25% of value came from preventing hits on BIP. These pitchers probably pitched in all parks, so collectively they pitched in an average park anyway.

There may be some correlation between the stats. For example, if SO Wins and BIP Wins were highly correlated, then we might not be sure what the BIP Share really is. But the correlation between these two was only .068. And the correlations between BIP Share and the other two were both under .15. I also ran a regression in which BIP Wins was the dependent variable and the other 3 were the independent variables. The r-squared was about .03, meaning that very little of the variation in BIP Wins is explained by the other three acting together. I thought that maybe a high-quality pitcher would do well in the other three stats and that would naturally spill over into BIP Wins. Apparently not. Also, the highest coefficient value on any variable in that regression was .08 (on HRs). So if HR Wins went up by 1, BIP Wins went up by .08. Not much of an effect. In fact, the highest correlation was between HR Wins and SO Wins (.316). So probably one (or both) of those should be lower than what I have estimated. Doing that would end up raising the BIP Share for most pitchers.
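For readers who want to repeat the correlation check on their own data, here is a minimal Pearson correlation sketch in plain Python. The function name is mine, and the two short lists are placeholder numbers only; they are not taken from the article's data set, which is not reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder numbers only, NOT taken from the article's data.
placeholder_a = [1.0, 2.0, 3.0, 4.0]
placeholder_b = [2.0, 1.0, 4.0, 3.0]
r = pearson_r(placeholder_a, placeholder_b)
print(round(r, 2))  # prints 0.6
```

For a regression with a single predictor, the r-squared is this value squared. The three-predictor regression described above needs a least-squares fit, which this sketch does not cover.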

I assumed a 40 plate appearance game for each pitcher. That is a little too high for the 100 here, who averaged 38.5 per 9 IP. The replacement level pitchers would certainly be above that, since they would allow more hitters to reach base. But if I took this into account, it would just raise the win value of all 4 stats and probably would not change the BIP Shares very much.

Based on what I understand about McCracken's work, I was surprised to find that preventing hits on BIP was a significant part of most pitchers' value.

Now the range of wins is lowest for BIP. The highest was 2.00 and the lowest was -.08, for a range of 2.08. The range for the other 3 was

SO Wins 4.50
HR Wins 3.39
BB Wins 2.97

This seems to be consistent with McCracken's idea that pitchers don't differ as much in preventing hits on BIP as they do in other things. But a range of 2.08 may still be significant. The other stats also vary more. Here are the standard deviations in Wins for each stat:

BIP Wins does not vary as much, but it varies nonetheless and is not really that far behind the other stats.

It is possible that being able to prevent hits on BIP (independent of other abilities) may be significant for many pitchers.

End notes

1. I compared to the replacement level pitcher instead of the average pitcher because what I am after is what percent of a pitcher's value comes from preventing hits on balls in play. If I compared to average, I might get more than 100% of a pitcher's value coming from preventing hits on BIP. A pitcher might have a total value of 0.5 wins (after adding up his value from all four stats, some of which might be negative) but be 1 win better than average in preventing hits on BIP. That would mean 200% of his value would come from this source. Since replacement level pitchers are generally well below average, it is very unlikely that more than 100% of a pitcher's value will come from preventing hits on BIP.

2. Actually, not all of the bottom 500 were used. There were many position players in this group who pitched occasionally in blowout games (even Wade Boggs was one). So I eliminated any pitcher who had fewer than 20 BFP. This way only legitimate pitchers were included.

3. Actually, using a weighted average of the values of all non-strikeout events from the 1999 Big Bad Baseball Annual, a strikeout saves .17 runs. That is, if a pitcher does not strike out a batter, the average value of the event that actually does occur would be .17.

4. If we assume the replacement level pitcher would have a winning percentage of .333, that would be 8.33 wins (out of 25 decisions). If Pedro Martinez had 8.48 more than that, he would have 16.81 wins for a pct of .672. That is a reasonable estimate. It is a little below his actual pct of .705. If we wanted him to get that estimate, it would mean a slightly higher pct for the replacement level.
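The sanity check in end note 4 can be sketched as follows. The function names are hypothetical; the 25 decisions and the .333 replacement winning percentage are the figures assumed in the note. The second function inverts the relationship to show what replacement percentage would reproduce a given actual percentage, which for Martinez's .705 is roughly .366 under these figures.

```python
DECISIONS = 25           # decisions in a full season, as assumed in the note
REPLACEMENT_PCT = 0.333  # assumed replacement-level winning percentage


def implied_win_pct(wins_above_replacement,
                    replacement_pct=REPLACEMENT_PCT, decisions=DECISIONS):
    """Winning percentage implied by wins above a replacement-level record."""
    replacement_wins = replacement_pct * decisions
    return (replacement_wins + wins_above_replacement) / decisions


def replacement_pct_needed(actual_pct, wins_above_replacement,
                           decisions=DECISIONS):
    """Replacement percentage at which the implied pct equals the actual pct."""
    return actual_pct - wins_above_replacement / decisions


print(round(implied_win_pct(8.48), 3))                # prints 0.672
print(round(replacement_pct_needed(0.705, 8.48), 3))  # prints 0.366
```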

If you would like to see more of Cyril's work, visit the link on his name at the top of the article, or check out his DH stint at The Baseball Analysts. I'm excited to have Cyril on as part of the BtB crew, so make sure you check out his stuff.

I'll resume regular posting tomorrow, but I only had time to either write my own article or edit/post this one before work, so today is just this.