Replacement Players, Expectations, and Variance

Every year a couple of players break out from obscurity to the top of the leaderboards, but how often should we expect this to happen from simple random variation?

This Friday, when the Orioles take on the Royals in the ALCS, one of the critical players of the series will be Steve Pearce. Prior to this year, Pearce was a career replacement player. What can we say about mid-career renaissances like Pearce's? Motivated by JD Martinez's breakout, I wrote a piece earlier this summer at Batting Leadoff about how these presumed replacement players actually performed. I wanted to see how often that type of season comes out of replacement players just by random variation. I found that while replacement players averaged a WAR around 0, they had a standard deviation of 0.7. Since replacement player WAR is approximately normally distributed, unlike WAR among the entire MLB population, about 2.5% of replacement players would be worth at least 1.4 wins above replacement (two standard deviations above the mean) just from randomness.
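As a quick check on that tail figure, here is a minimal sketch. It assumes replacement-player WAR is exactly normal with mean 0 and standard deviation 0.7; the function name `normal_tail` is mine and is not part of the original study.

```python
import math

def normal_tail(threshold, mean, sd):
    """Return P(X >= threshold) for X ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    # erfc gives the two-sided tail of the error function; halve it for one side.
    return 0.5 * math.erfc(z / math.sqrt(2))

# Share of replacement players expected to reach 1.4 WAR by chance alone,
# under the assumed Normal(0, 0.7) distribution.
share = normal_tail(1.4, mean=0.0, sd=0.7)
print(f"{share:.1%}")
```

The exact normal tail beyond two standard deviations is roughly 2.3%, slightly under the 2.5% that the usual rule of thumb gives.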

While this was a neat finding, the study was flawed for a few reasons. The first is that I weighted by player and not by plate appearance. Phil Birnbaum, Tom Tango, and Mitchel Lichtman went into the complications this causes here, but it has to do with the fact that players who play well get more playing time and players who don’t get demoted or benched. These players with unequal playing time should not be counted equally.

The second complication is that because WAR is a counting stat, lots of players will cluster around 0 for the simple reason that they did not have enough at bats to accumulate significant WAR in either direction. A third complication is that WAR is not an offense-only stat. Since I am really only concerned with offensive output, using wOBA makes a lot more sense.

Finally, because the spread of WAR among all players is not normal, I never compared the variance among regular players to the variance among replacement players. Because 2014 stats are now set in stone, and for the reasons above, I feel the question is worth revisiting.

To build my data set I pulled the 2014 ZiPS projections from The Baseball Projection Project along with the 2014 actual data from FanGraphs. This time around my statistic of interest was wOBA, not WAR, which meant I first had to condition the data. To start, I had to define what replacement level was. I used the FanGraphs Glossary rule of thumb to say that a .300 wOBA, defined as "poor", would be suitable for replacement level.

Next I had to run some crude positional adjustments to avoid biases. Because of the way I assigned positions to players, DHs are thrown into whatever position they played the most. If a player did not appear in the field at all, he was excluded from my sample. Using the mid-season positional wOBA averages, I came up with the following replacement level wOBA for each position.



















From here I divided a player's wOBA by the replacement level for his position and multiplied by .300 to get a position-adjusted wOBA. Just as I couldn’t include only players projected for exactly 0 WAR, I cannot include only players projected for exactly replacement-level wOBA. I defined a replacement player as one projected to have a position-adjusted wOBA within 5% of the position-adjusted replacement level. This gave me a nice sample of 225 players, about half of the players in the full data set.
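The adjustment and the 5% window can be sketched as follows. The positional replacement levels in `REPLACEMENT_WOBA` are hypothetical placeholders for illustration, not the values from my table, and I am assuming "within 5%" means an adjusted wOBA between .285 and .315.

```python
BASELINE = 0.300  # the FanGraphs "poor" wOBA used as replacement level

# Hypothetical positional replacement levels, for illustration only.
REPLACEMENT_WOBA = {"C": 0.290, "SS": 0.285, "1B": 0.320}

def adjust_woba(woba, position):
    """Scale wOBA so every position's replacement level maps to .300."""
    return woba / REPLACEMENT_WOBA[position] * BASELINE

def is_replacement(projected_woba, position, tolerance=0.05):
    """True if the adjusted projection is within `tolerance` of .300."""
    adjusted = adjust_woba(projected_woba, position)
    return abs(adjusted - BASELINE) <= tolerance * BASELINE

# Hypothetical (name, position, projected wOBA) rows.
players = [("A", "C", 0.295), ("B", "1B", 0.350), ("C", "SS", 0.260)]
kept = [name for name, pos, woba in players if is_replacement(woba, pos)]
print(kept)
```

Dividing by the positional level before rescaling means a catcher and a first baseman with the same raw wOBA end up with different adjusted figures, which is the point of the adjustment.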

Now that we have a defined sample we can start looking at the data. Below are the summary statistics for plate appearances and position-adjusted wOBA (weighted by plate appearances) as well as a scatter plot of position-adjusted wOBA versus plate appearances.

[Table: summary statistics among replacement players for plate appearances and wOBA (adjusted), including Std. Dev.]

A scatter plot is always a good place to start when looking at data. There is an obvious relationship between variance in wOBA and plate appearances. The wOBA of players with few plate appearances is subject to drastic swings. Additionally, players with below replacement level wOBA see cuts in playing time after playing poorly for whatever their manager deems long enough. After about 200 plate appearances wOBA becomes somewhat more predictable. The center is around .300, but there is still wide variation between .200 and .400, indicating that player performance is still pretty random. We can gain more valuable insight by also examining the distribution of replacement players’ wOBA.
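To see why the spread narrows with playing time, here is a rough sketch of pure sampling noise. The outcome probabilities and linear weights in `OUTCOMES` are hypothetical, chosen only so the expected wOBA lands near .300, and every plate appearance is assumed to be an independent draw from the same mix.

```python
import math

# Hypothetical per-plate-appearance outcomes: (probability, wOBA weight).
OUTCOMES = {
    "out": (0.700, 0.00),
    "walk": (0.075, 0.69),
    "single": (0.150, 0.89),
    "double": (0.045, 1.27),
    "triple": (0.005, 1.62),
    "home_run": (0.025, 2.10),
}

def per_pa_mean_sd(outcomes):
    """Mean and standard deviation of the wOBA value of a single PA."""
    mean = sum(p * w for p, w in outcomes.values())
    var = sum(p * (w - mean) ** 2 for p, w in outcomes.values())
    return mean, math.sqrt(var)

def sampling_sd(per_pa_sd, plate_appearances):
    """SD of a player's observed wOBA over n independent PAs."""
    return per_pa_sd / math.sqrt(plate_appearances)

mean, sd = per_pa_mean_sd(OUTCOMES)
for pa in (50, 200, 500):
    print(pa, round(sampling_sd(sd, pa), 3))
```

Under these assumptions the noise shrinks with the square root of plate appearances, which matches the funnel shape in the scatter plot. On its own it cannot say how much of the observed spread comes from noise and how much from real differences between players.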

[Figure: histogram of replacement players' position-adjusted wOBA, weighted by plate appearances]

This histogram is weighted by plate appearance just like our summary statistics above. The distribution appears to be approximately normal, which gives our standard deviation of .036 a stronger interpretation. Because we have a normal distribution we can use the 68-95-99.7 rule, which tells us that about 68% of our data will fall within one standard deviation of the mean.

In this case that means 68% of replacement players will have a wOBA approximately between .264 and .336; quite a large separation. Further, about 5% of replacement players will fall more than two standard deviations from the mean. That means 5% of our sample, about 10 players, will have a wOBA less than .228 or more than .372. The five or so below .228 are players we often never hear about, who don’t make it through the season. The five or so above .372, well, those are the guys that we like to build narratives about.
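The bands and the expected head count can be reproduced with a few lines. This is a sketch that assumes an exactly normal distribution with mean .300 and standard deviation .036, and that treats the plate-appearance-weighted distribution as if it applied to each of the 225 players equally, which is a simplification.

```python
import math

MEAN, SD, N_PLAYERS = 0.300, 0.036, 225

def band(k):
    """Interval covering k standard deviations either side of the mean."""
    return MEAN - k * SD, MEAN + k * SD

def share_outside(k):
    """Two-sided share of a normal distribution beyond k standard deviations."""
    return math.erfc(k / math.sqrt(2))

low1, high1 = band(1)
low2, high2 = band(2)
expected_outside = N_PLAYERS * share_outside(2)

print(f"1 SD band: {low1:.3f} to {high1:.3f}")
print(f"2 SD band: {low2:.3f} to {high2:.3f}")
print(f"players expected outside 2 SD: {expected_outside:.1f}")
```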

The problem with using a rate stat is that there will always be a lot of variability, simply because wOBA takes more than the 257 plate appearances our replacement players averaged to stabilize. This leads us to the question: "Is the standard deviation of .036 due to the fact that replacement players are by nature more random entities, or is it just because they have fewer plate appearances?" To answer this question I compared the standard deviation of position-adjusted wOBA among replacement players to that of another group of players. We cannot simply compare it to the standard deviation among all non-replacement players, because we expect that group to vary in talent. Instead we should pick another arbitrary group that is projected to hit around a fixed wOBA, just like replacement players. I chose the group of players projected within 5% of the calculated replacement level plus 40 wOBA points.

The group I have chosen had a weighted standard deviation of .027, a solid 25% lower than the .036 among replacement players. There still could be issues associated with the fact that this other group of players averaged significantly more plate appearances than our replacement players (about 510 instead of 257). As another robustness check I calculated the standard deviations again while restricting the sample to players with between 300 and 500 at bats. This way both groups had roughly the same average number of plate appearances. Again the result holds, with replacement players having a weighted standard deviation of .033 while the other group had a weighted standard deviation of .026.
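For readers who want to try the same comparison, here is a minimal sketch of a plate-appearance-weighted standard deviation with a playing-time restriction. It uses the plain weighted population formula with no small-sample correction, which may differ from what a stats package computes, it filters on plate appearances, and the rows in `sample` are hypothetical rather than the actual 2014 data.

```python
import math

def weighted_sd(values, weights):
    """Weighted standard deviation (population form, no bias correction)."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(var)

def restrict_by_pa(players, low=300, high=500):
    """Keep (wOBA, PA) rows whose plate appearances fall in [low, high]."""
    return [(woba, pa) for woba, pa in players if low <= pa <= high]

# Hypothetical (adjusted wOBA, plate appearances) rows.
sample = [(0.250, 120), (0.310, 350), (0.290, 420), (0.340, 480), (0.305, 610)]

kept = restrict_by_pa(sample)
wobas, pas = zip(*kept)
print(round(weighted_sd(wobas, pas), 3))
```

Running the same two functions on each group, before and after the restriction, gives the four standard deviations compared above.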

To summarize: yes, a lot of the variance in performance among replacement players is due to the fact that they, as a group, have fewer plate appearances than above-replacement players. However, after weighting by PA and restricting both groups to similar playing time, replacement players still show a wider spread in performance than non-replacement players.

There are two sources of variation: the variation due to small samples and the variation due to the players themselves. It is hard to separate the two and I am not claiming that I have done so. What I have done is show that there is evidence that replacement player performance has a higher degree of randomness due to the players themselves. A great extension, perhaps for another time or another analyst, would be to look at whether these breakout replacement players tend to repeat their performance or fall back to the pack. The idea of replacement players is very useful in the abstract, but it is also useful to think about them in reality. Specifically, it is useful to think of them not just as point estimates, but as point estimates with larger than usual variation.

In my first stab at this topic I concluded that whether or not a replacement player would provide significant value was as good as a roll of the die. This more in-depth look shows us that in just one season all players may be a roll of the die, but replacement players might be a roll of two.

. . .

All statistics courtesy of FanGraphs

Daniel Meyer is a junior at Colby College and a contributor to Beyond the Box Score and Batting Leadoff. You can follow Daniel on Twitter @dtrain_meyer.