Predicting wOBA

Offensive statistics are not nearly as volatile as the metrics we have for pitchers. Does that mean we should ignore the idea of projecting offensive numbers completely? I don't think so.

Richard Mackson-US PRESSWIRE

Much of the baseball research that I've done over the past year or so has dealt with pitching metrics, particularly ERA estimators.

But lately, I've begun to wonder why I've spent so much of my time on pitching metrics, rather than offensive numbers.

The first and most obvious thought that came to mind was that we know a great deal about offense.

We're fairly certain at this point that metrics like wOBA, TAv and wRC+ do a fantastic job of explaining how runs are produced and give credit where it is due.

On the other hand, exactly who deserves credit for run prevention in the pitching/defense relationship is far from fully understood, which leads to more work on pitching metrics and estimators.

However, the best explanation I've heard for why pitching estimation becomes a major focus came from a discussion I had with Beyond the Box Score's Retrosheet guru, James Gentile. In that discussion, James noted some work that Bill Petti had done on year-to-year correlations for this site.

In his piece on hitting metrics, Petti found the year-to-year correlation for an individual's wOBA to be r = .61, which is pretty high. The year-to-year correlation that he found for ERA was much lower (r = .38).

Petti went on to explain what these differences mean in greater detail:

If a statistic in year one does not correlate all that well to itself in year two it generally is more descriptive than predictive.

This is the underlying logic behind pitching metrics like DIPS and FIP. Since ERA has a generally low year-to-year correlation (.38), it was a poor predictor of future performance and true talent.

These ideas helped back James' inclination as to why many (including myself) typically focus on predictive models for pitching statistics rather than offensive numbers: one of the most commonly used descriptive measures of offense, wOBA, already works fairly well as a predictor of future performance (true talent level).

The wOBA correlation that Petti found was pretty strong, but I had a few ideas of my own for modifying the equation to be more predictive.

For those who don't know the wOBA equation, this is the 2012 version of the statistic shown at FanGraphs:

wOBA = (.691*NIBB + .722*HBP + .884*1B + 1.257*2B + 1.593*3B + 2.058*HR)/(PA-IBB)
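As a rough illustration of how the formula above is applied, here is a minimal Python sketch. The weights are the 2012 values quoted above; the function name `woba` and the stat line are made up for the example and do not belong to any real player.

```python
# 2012 weights as quoted in the formula above.
WOBA_WEIGHTS_2012 = {
    "NIBB": 0.691, "HBP": 0.722, "1B": 0.884,
    "2B": 1.257, "3B": 1.593, "HR": 2.058,
}

def woba(counts, pa, ibb, weights=WOBA_WEIGHTS_2012):
    """Weighted sum of offensive events divided by (PA - IBB)."""
    numerator = sum(w * counts.get(event, 0) for event, w in weights.items())
    return numerator / (pa - ibb)

# Hypothetical stat line, invented purely for illustration.
example = {"NIBB": 50, "HBP": 5, "1B": 100, "2B": 30, "3B": 3, "HR": 20}
example_woba = woba(example, pa=600, ibb=5)
print(round(example_woba, 3))
```

Because every weight multiplies a raw count, a single extra triple moves this number more than two extra walks would, which is the property the rest of this piece questions for prediction.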

The individual weight for each component varies from year to year, as the correlation with runs scored is slightly different, although the weights don't vary by a large margin.

My main issue with using this formula to predict is the large weights given to doubles and triples. Obviously, when attempting to describe offensive production (run creation), extra-base hits like doubles and triples should receive a great deal more weight than walks or singles. At the same time, that doesn't mean extra-base hits should receive as much weight in a predictive model.

According to the great work done by Russell Carleton over at Baseball Prospectus, a batter's walk, hit-by-pitch, single and home run rates reflect true talent level within one season (~300 PA), while it takes three to four seasons (1610 PA) for doubles and triples to reflect true talent.

Walk and home run rates are the fastest of wOBA's components to stabilize, so I would expect them to receive more weight within a predictive version of wOBA, while I'd expect doubles and triples to receive significantly less weight.

Doubles and triples take longer to reflect true talent level because of variation on balls in play. The question of true talent on balls that hitters put in play led FanGraphs' Bradley Woodrum to derive a fielding independent version of wOBA that uses xBABIP to attempt to stabilize BABIP luck for hitters in a wOBA equation.


The hypothesis behind this piece is that if the wOBA components are regressed against future wOBA, and thus weighted to predict rather than describe, we'll see doubles and triples with less weight (compared to the original formula) and home runs and walks with more weight.

I've worked with and discussed this concept in great detail while developing a predictive version of the FIP equation. I think there's a chance that the same concepts could be applied to wOBA and result in greater predictive strength.

The Study

In Petti's original study, he used the seasons 2001-2008 and batters who had at least 300 plate appearances in consecutive seasons. The sample for this piece was slightly different, as I used the years 2004-2012 and upped the minimum number of PAs to 400 (n = 1238).

Interestingly, I found a lower correlation (r = .54) for year-to-year wOBA than Petti found.
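For readers who want to see what a year-to-year correlation looks like in practice, here is a minimal sketch using NumPy. The data is simulated, not the FanGraphs sample: it assumes each hitter's observed wOBA is a stable talent level plus independent seasonal noise, and the spread values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1238  # matches the study's sample size; the values themselves are simulated

# Assumed model: observed wOBA = stable talent + independent seasonal noise.
talent = rng.normal(0.330, 0.030, n)
year1 = talent + rng.normal(0.0, 0.028, n)
year2 = talent + rng.normal(0.0, 0.028, n)

# Pearson correlation between each hitter's paired seasons.
r = np.corrcoef(year1, year2)[0, 1]
print(f"year-to-year r = {r:.2f}")
```

Under this toy model, the more seasonal noise there is relative to the spread in talent, the lower the correlation comes out.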

A correlation of .54 is still fairly high for a descriptive metric, but there's still obviously room for improvement.

The first idea I had was to use the r found in the study to regress each individual's wOBA back to the mean (I've discussed this strategy in greater detail here). This strategy takes each individual's wOBA and multiplies that number by .54 (as that is the assumed percentage of true talent) and then adds that number to league average wOBA multiplied by .46.

This regression to the mean did not result in a stronger prediction, as the overall correlation (r=.53) went down from simply using the individual's number.
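The regression-to-the-mean step described above can be sketched in a few lines of Python. The function name and the example inputs (a .380 hitter in a league averaging .320) are hypothetical; only the .54/.46 split comes from the study.

```python
def regress_to_mean(player_woba, league_woba, r=0.54):
    """Shrink an observed wOBA toward the league average, weighting it by r."""
    return r * player_woba + (1.0 - r) * league_woba

# Hypothetical inputs, chosen only to show the arithmetic.
regressed = regress_to_mean(0.380, 0.320)
print(round(regressed, 3))
```

One property worth keeping in mind: with a single fixed league average, this is a straight-line rescaling of wOBA, and a Pearson correlation is unchanged by that kind of rescaling. A small shift like the one reported would presumably come from the league average differing by season, which is an assumption about the setup rather than something stated above.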

The next idea was the main goal behind this piece: weighting each of wOBA's components to be predictive rather than descriptive.

For this step, I ran a multiple regression with the components (NIBB/(PA-IBB), HBP/(PA-IBB), 1B/(PA-IBB), etc.) as the predictors and wOBA in the subsequent season as the dependent variable.
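A multiple regression of this shape can be sketched with NumPy's least-squares solver. This is not the code or data behind the study: the rates and next-season wOBA values below are synthetic, generated from assumed ranges and an assumed noise level, and serve only to show how the design matrix is laid out (one intercept column plus one column per component rate).

```python
import numpy as np

rng = np.random.default_rng(0)
n_players = 500

# Synthetic per-(PA - IBB) rates for NIBB, HBP, 1B, 2B, 3B, HR. NOT real data.
low = [0.04, 0.00, 0.12, 0.03, 0.000, 0.01]
high = [0.14, 0.02, 0.20, 0.07, 0.015, 0.07]
rates = rng.uniform(low, high, size=(n_players, 6))

# Synthetic next-season wOBA: assumed intercept and weights plus random noise.
assumed_weights = np.array([0.43, 0.38, 0.30, 0.51, 0.62, 1.07])
next_woba = 0.19 + rates @ assumed_weights + rng.normal(0.0, 0.02, n_players)

# Design matrix: a column of ones for the constant, then the six rates.
X = np.column_stack([np.ones(n_players), rates])
coef, *_ = np.linalg.lstsq(X, next_woba, rcond=None)
intercept, weights = coef[0], coef[1:]

print("constant:", round(float(intercept), 3))
print("weights:", np.round(weights, 2))
```

The fitted constant plays the same role as the constant in a predictive wOBA equation, and the six fitted weights correspond to the component coefficients.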

The result was this equation:

Future wOBA = .190 + (.43*NIBB + .38*HBP + .30*1B + .51*2B + .62*3B + 1.07*HR)/(PA - IBB)
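To make the predictive equation concrete, here is a small Python sketch that applies it to a stat line. The coefficients are the ones in the equation above; the function name `future_woba` and the stat line are hypothetical.

```python
# Coefficients from the Future wOBA equation above.
FUTURE_WEIGHTS = {
    "NIBB": 0.43, "HBP": 0.38, "1B": 0.30,
    "2B": 0.51, "3B": 0.62, "HR": 1.07,
}
FUTURE_CONSTANT = 0.190

def future_woba(counts, pa, ibb):
    """Constant plus the weighted sum of events per (PA - IBB)."""
    weighted = sum(w * counts.get(event, 0) for event, w in FUTURE_WEIGHTS.items())
    return FUTURE_CONSTANT + weighted / (pa - ibb)

# Hypothetical stat line, invented purely for illustration.
example = {"NIBB": 50, "HBP": 5, "1B": 100, "2B": 30, "3B": 3, "HR": 20}
projected = future_woba(example, pa=600, ibb=5)
print(round(projected, 3))
```

Note that a hitter with no positive events at all would still project to the constant, which is how the equation pulls every projection toward the middle.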

This equation only had a slightly stronger correlation (r=.57) than simply using an individual's non-regressed wOBA.

So while it did improve the prediction, the overall improvement wasn't too great. At the same time, the equation itself is fairly interesting.

The constant is a good representation of regression towards the mean. The number (.190) is equivalent to 58 percent of the overall mean wOBA for the years in this sample, which seems rather high. However, since each hitter had to have at least 800 PAs across two seasons in order to qualify, it's reasonable to assume that this sample had a higher overall mean.

Each component's weight is regressed a good deal. As expected, the relative weights of each component line up well with the findings in Carleton's study on the speed at which metrics stabilize.
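The arithmetic behind that 58 percent figure can be run in reverse to see what sample mean it implies. This is only a back-of-the-envelope check using the two rounded numbers given above, so the result is approximate.

```python
constant = 0.190       # constant from the Future wOBA equation
share_of_mean = 0.58   # stated share of the sample's mean wOBA

# If constant = share * mean, then mean = constant / share.
implied_mean_woba = constant / share_of_mean
print(round(implied_mean_woba, 3))
```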

I attempted to better illustrate this point with a table of the weights for descriptive and predictive wOBA relative to each other:


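| Component | Reduction from Des. wOBA Re. Wt. to Pre. wOBA Re. Wt. |
|-----------|--------------------------------------------------------|
| HBP       | 0.16 (15%)                                             |
| 1B        | 0.56 (44%)                                             |
| 2B        | 0.61 (34%)                                             |
| 3B        | 0.80 (36%)                                             |
| HR        | 0.37 (13%)                                             |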

**Note**: the weight for walks was used as the reference against which each of the other components' weights is compared.

As expected, the statistics that stabilize the most quickly (HRs and HBPs) received the smallest reduction in relative weight, while the more volatile components (balls put in play) received much larger reductions.
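The relative weights can be approximated directly from the two published equations by dividing each coefficient by the walk coefficient. The sketch below does that in Python; since it starts from the rounded coefficients shown in this piece, its output may differ from the table by a few hundredths.

```python
descriptive = {"NIBB": 0.691, "HBP": 0.722, "1B": 0.884,
               "2B": 1.257, "3B": 1.593, "HR": 2.058}
predictive = {"NIBB": 0.43, "HBP": 0.38, "1B": 0.30,
              "2B": 0.51, "3B": 0.62, "HR": 1.07}

def relative_to_walks(weights):
    """Express every weight as a multiple of the walk (NIBB) weight."""
    return {event: w / weights["NIBB"] for event, w in weights.items()}

des_rel = relative_to_walks(descriptive)
pre_rel = relative_to_walks(predictive)

for event in descriptive:
    if event == "NIBB":
        continue  # walks are the reference, so their relative weight is 1
    drop = des_rel[event] - pre_rel[event]
    print(f"{event:>3}  {des_rel[event]:.2f}  {pre_rel[event]:.2f}  "
          f"{drop:.2f} ({drop / des_rel[event]:.0%})")
```

Each printed row shows the descriptive relative weight, the predictive relative weight, and the drop between them in absolute and percentage terms.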


This predictive version of wOBA really did not add much predictive value over simply using the regular descriptive version of the metric. That is surprising to me, because I didn't test this regressed equation out of sample.

What I mean by that is, the equation I found isn't actually predicting future wOBA, but instead just explaining future wOBA for the players in this sample. So, the only way of knowing if this future wOBA equation actually adds predictive value would be to test it on a different sample.
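An out-of-sample check of the kind described above could look roughly like the following sketch: fit the weights on one group of hitters, then measure the correlation on a group the fit never saw. Everything here is synthetic (the rates, the noise level and the 70/30 random split are all assumptions); with real data, a natural alternative would be to hold out entire seasons instead.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Synthetic component rates and next-season wOBA. NOT real data.
low = [0.04, 0.00, 0.12, 0.03, 0.000, 0.01]
high = [0.14, 0.02, 0.20, 0.07, 0.015, 0.07]
rates = rng.uniform(low, high, size=(n, 6))
assumed_weights = np.array([0.43, 0.38, 0.30, 0.51, 0.62, 1.07])
next_woba = 0.19 + rates @ assumed_weights + rng.normal(0.0, 0.02, n)

# Hold out a random 30% of hitters that the regression never sees.
order = rng.permutation(n)
test_idx, train_idx = order[:300], order[300:]

def design(x):
    """Add an intercept column to a matrix of component rates."""
    return np.column_stack([np.ones(len(x)), x])

# Fit on the training hitters only.
coef, *_ = np.linalg.lstsq(design(rates[train_idx]), next_woba[train_idx], rcond=None)

# Compare the in-sample fit with the fit on held-out hitters.
r_in = np.corrcoef(design(rates[train_idx]) @ coef, next_woba[train_idx])[0, 1]
r_out = np.corrcoef(design(rates[test_idx]) @ coef, next_woba[test_idx])[0, 1]
print(f"in-sample r = {r_in:.2f}, out-of-sample r = {r_out:.2f}")
```

The held-out correlation is the one that speaks to predictive value, since the in-sample number is computed on the same hitters the weights were tuned to.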

Honestly, I wasn't nearly as excited with the results of this piece as I was with the idea behind it. I had hoped that "p wOBA" could be a really useful tool, but I digress.

However, I think there is one key takeaway from this piece that I'd like the readers to remember.

The differences between these two versions of wOBA can be used as a reinforcement of which statistics (the less volatile ones) should be weighted more when you're attempting to predict rather than describe performance.

All statistics come courtesy of FanGraphs

You can follow Glenn on Twitter @Glenn_DuPaul