This being Beyond the Boxscore and all, I'm sure that many of you are familiar with FanGraphs' WAR (Win Value) stats. However, some of you may find this primer helpful in figuring out how to calculate WAR and why the stat is such a good one. In any case, here's my primer for pitchers, previously posted on Purple Row...
Last week's session of Purple Row Academy focused on the concept of the value statistic Wins Above Replacement (WAR), also known as Win Value. While that article detailed the components of WAR for position players, this one will delve into the calculation of WAR for pitchers. Once again, I'll be using the invaluable work of Dave Cameron at FanGraphs and of our own RMN to help explain these concepts.
Remember that WAR takes counting stats (Ks, RA, BB, etc.), translates them into rate stats (K/9, FIP, etc.), and then combines these factors into value stats. Calculating the value of a pitcher is much more difficult than doing so for a position player, simply because of the number of variables involved and how dependent a pitcher is on the environment in which he plies his craft. A fly-ball pitcher in Citizen's Bank Bandbox will likely do very poorly...and if he has the Nationals' "defense" behind him, he'd be doomed (DOOMED) to fail.
While you won't know everything there is to know about pitching WAR after reading this article (I know that I still don't), you'll at least be able to understand the valuation process a little bit better. The WAR calculation for pitchers has five main components.
The basic purpose of the FIP metric is to remove a team's poor (or good) defense from the calculation of a pitcher's relative value and to therefore figure out the pitcher's role in his team's run prevention. The theory is that a pitcher only truly controls three major categories of outcomes: homeruns, walks, and strikeouts (well, and HBP, but we lump that in with walks). As a result, all hits given up by pitchers that don't leave the yard are nullified in FIP's calculation. RMN covered FIP in a Counting Rocks article a while back.
FIP calculates a pitcher's performance regardless of context--in other words, a strikeout when the bases are empty in a blowout is counted the same as a ninth-inning strikeout with the bases loaded in a one-run game.
The FIP formula is:
FIP = (13*HR + 3*BB - 2*K) / IP
In addition, a league-specific (AL or NL) factor for the year is added to scale the data properly to league average RA. Interpreting the formula's weights (13 for a homerun, 3 for a walk, 2 for a strikeout), a homerun given up by a pitcher is roughly 4.3 times as damaging to his team as a walk. Meanwhile, a strikeout (the only positive outcome for a pitcher) is given the lowest weight.
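For those who like to see the arithmetic spelled out, here is a minimal sketch of the FIP calculation. It assumes the commonly published 13/3/2 weights and lumps HBP in with walks as described above. The stat line and the `league_constant` value are made up for illustration, not real data, and the function name is my own.

```python
def fip(hr, bb, hbp, k, ip, league_constant):
    """Fielding Independent Pitching with the commonly published weights.

    ip is true innings pitched as a decimal (200 1/3 innings -> 200.333...).
    league_constant is the league- and year-specific additive factor; the
    caller must supply it, and no real value is assumed here.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    # HBP is lumped in with walks; strikeouts are the only term that helps.
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + league_constant


# Hypothetical stat line and constant, invented purely for illustration.
example = fip(hr=20, bb=50, hbp=5, k=180, ip=200.0, league_constant=3.20)
print(f"FIP: {example:.3f}")
```

Swapping in the RA-scaled league factor instead of the ERA-scaled one is what puts the result on the scale WAR uses.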
A pitching metric not currently used in WAR calculation but one that has been gaining steam of late is tRA, another Counting Rocks topic.
Takeaway: FIP is a weighted stat that only uses HRs, BBs, and Ks--removing context, hits within the park, and defense from the equation. The FIP used in WAR is scaled to RA, not ERA.
Harkening back to last week's column, the replacement player is NOT league average. The replacement player is a AAAA player, a waiver claim, or a Rule 5 guy that has somehow made his way to the major leagues. He is cheap and eminently replaceable, a player who costs your team two wins when he plays in place of a league average player. He is Matt Belisle.
However, a replacement-level pitcher is different from his position player counterpart in that starting and relief pitchers are also valued differently. A replacement-level starter is a guy who did not start the season in the rotation, did so only due to injury, or was a top prospect.
A replacement starting pitcher can be a serviceable bullpen guy (see Fogg, Josh or Jennings, Jason), but the reverse is usually not true. TangoTiger calculated the difference between a replacement-level starter and reliever to be .090 of winning percentage (nine percentage points of win expectancy).
Here's what Cameron has to say on the subject:
Relievers are, in general, failed starting pitchers who are given an easier task that their skillset will allow them to handle. They are selectively managed to face hitters whom they have the best chance of getting out, and they get to throw at maximum effort on nearly every pitch, giving them greater velocity over their shorter appearances.
Nearly every starting pitcher in baseball could be a useful relief pitcher. Very few relief pitchers could be useful starting pitchers.
The calculation of a replacement pitcher, therefore, is split into a reliever and starter component. For starters, the win expectancy number used is .380, while a reliever's replacement value is .470. These numbers are .370 and .460 in the AL, and .390 and .480 in the NL. Where do these numbers come from? Understanding that takes math that is certainly more complicated than the hitting component.
Remembering that a team composed entirely of replacement players has an expected winning percentage of .300, one assumes that, because the effect is multiplicative, a balanced replacement team will have a .395 win expectancy offense and a .395 win expectancy defense (the combination is less than the sum of its parts). For you math nerds out there, TangoTiger explains the odds ratio method and why WARP1 is wrong.
What this means is that if a team has league-average performance across the board (hitting, relief pitching, defense), then it would expect to win a mere 38% of games started by its replacement-level SP. If the team had average hitting, defense, and starting pitching, it would win 47% of the games in which its poor bullpen pitchers were used. Therefore, a team running out a poor starter is costing itself nine more percentage points of win expectancy than if it used a poor bullpen pitcher. A team with a league average starter wins 50% of its games, so using a replacement starter decreases win expectancy by 12 percentage points.
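To see why two .395 halves make a roughly .300 team, here is a rough sketch of the odds ratio idea. It assumes a .500 league baseline, so the odds of the components simply multiply. The function names are my own, and this is my reading of the method rather than TangoTiger's exact derivation.

```python
def to_odds(win_pct):
    """Convert a winning percentage (0 < p < 1) to odds of winning."""
    return win_pct / (1.0 - win_pct)


def from_odds(odds):
    """Convert odds of winning back to a winning percentage."""
    return odds / (1.0 + odds)


def combine(*win_pcts):
    """Combine component win expectancies by multiplying their odds.

    Assumes every component is measured against a .500 league, so no
    extra baseline term is needed.
    """
    odds = 1.0
    for p in win_pcts:
        odds *= to_odds(p)
    return from_odds(odds)


# A .395 offense paired with a .395 defense.
team = combine(0.395, 0.395)
print(f"{team:.3f}")  # 0.299, i.e. roughly the .300 replacement team
```

Note that simply adding the two shortfalls (.105 + .105) would put the team at .290, which is why the combination has to be done on the odds scale.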
Takeaway: Replacement-level starters can be useful relievers; Matt Belisle is neither.
The replacement level weight of a pitcher and his FIP, while important, don't tell the whole story. Due to both the differences between the AL and NL and the inherent variability of offensive production from year to year, nailing down a consistent replacement level for FIP irrespective of role and year is problematic.
As a result, the run environment of a particular year must be taken into account when calculating a pitcher's value. To find the performance of a league-average pitcher, Cameron works on the RA scale (since the WAR version of FIP is scaled to RA). To put a pitcher's FIP on that scale, you divide it by .92 (basically assuming that only 92% of runs scored are earned runs); going the other way, you multiply league RA by .92. For example, AL teams averaged 4.78 runs/game/team last year...so a league-average FIP performance (his team wins 50% of games started by him) would be 4.40 (4.78*.92). In the NL there were 4.54 runs per game per team, leading to a league-average FIP of 4.18.
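The scaling in both directions is simple enough to sketch in a few lines. The .92 earned-run share and the league run averages come from the paragraph above; the function names are just ones I picked.

```python
EARNED_RUN_SHARE = 0.92  # assumption from the text: 92% of runs are earned


def fip_to_ra_scale(fip):
    """Scale an ERA-scaled FIP up to the RA scale used in WAR."""
    return fip / EARNED_RUN_SHARE


def league_average_fip(league_ra):
    """League-average ERA-scaled FIP implied by league runs per game."""
    return league_ra * EARNED_RUN_SHARE


al_avg = league_average_fip(4.78)
nl_avg = league_average_fip(4.54)
print(f"AL: {al_avg:.2f}, NL: {nl_avg:.2f}")  # AL: 4.40, NL: 4.18
```

Running a league-average FIP back through `fip_to_ra_scale` returns the league RA, which is a quick sanity check that the two conversions are inverses.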
Remembering that a replacement level pitching performance costs a team two wins, the replacement should be worth twenty runs below average using the 10 runs/win rule, right? Unfortunately, this is not necessarily the case. The actual conversion from runs to wins (explained below) varies from year to year.
Using 2008 data, for an AL team a replacement level RP had a RA-scaled FIP of 5.09 while the RL starter produced a 6.12 RA-scaled FIP. In the NL these numbers were 4.84 and 5.84. It must be noted that these numbers are park neutral--a RL pitcher at Petco will have a much lower FIP than one at Coors--this component will be added later in the WAR calculation. The higher numbers in the AL are due to the more potent offensive environment the DH creates there. Therefore, a RL pitcher in the NL will look better than a RL pitcher in the AL due to the less dangerous run environment in the NL.
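One way to get a feel for these numbers is to line them up against the league run averages quoted earlier. The sketch below assumes those league averages (4.78 AL, 4.54 NL) are for the same 2008 season, and the dictionary layout and function name are mine.

```python
# League runs per game per team, as quoted earlier in the article.
LEAGUE_RA = {"AL": 4.78, "NL": 4.54}

# Replacement-level RA-scaled FIP by league and role (2008, park neutral).
REPLACEMENT_FIP = {
    ("AL", "SP"): 6.12,
    ("AL", "RP"): 5.09,
    ("NL", "SP"): 5.84,
    ("NL", "RP"): 4.84,
}


def replacement_gap(league, role):
    """Runs per nine innings by which replacement level trails league average."""
    return REPLACEMENT_FIP[(league, role)] - LEAGUE_RA[league]


for (league, role) in REPLACEMENT_FIP:
    print(f"{league} {role}: {replacement_gap(league, role):.2f} runs per nine")
```

The gap for starters comes out around 1.3 runs per nine in both leagues and around 0.3 for relievers, which is the starter/reliever split from the previous section showing up in run terms.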
Takeaway: A pitcher's replacement level varies by year and by his league's run environment. RL relievers are expected to have much better FIP than RL starters.
Looking at the complexity of this calculation for pitchers makes me yearn for the simplicity of it for position players (10 runs away from average per win). After all, the run environment a pitcher pitches in has a pretty big impact on his numbers. FanGraphs uses a formula by TangoTiger for its run-conversion metric:
(((League RA + Pitcher’s RA)/2)+2)*1.5
Unfortunately, I am unable to decipher exactly why the formula uses the constants of +2 and *1.5...but the fact is that the formula spits out some reasonable numbers.
Applying this to the AL for 2008, the run-conversion (again, number of runs from average per win) was 10.17 while in the NL this number was 9.81. What is important to note though is that a pitcher with a low FIP has created a different run environment by himself. As such, an NL pitcher with, say, a 3.5 FIP would have a run-to-win conversion factor of 9.03--much lower than the league average.
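Here is the run-conversion formula as a quick sketch that reproduces the three numbers above. The function name is my own, and I'm plugging the 3.5 FIP straight in as the pitcher's RA, which is what the 9.03 figure implies.

```python
def runs_per_win(league_ra, pitcher_ra):
    """TangoTiger's run-to-win conversion as quoted in the article.

    Averages the league and pitcher run environments, adds 2, and
    multiplies by 1.5.
    """
    return ((league_ra + pitcher_ra) / 2 + 2) * 1.5


al_average = runs_per_win(4.78, 4.78)  # league-average AL pitcher
nl_average = runs_per_win(4.54, 4.54)  # league-average NL pitcher
nl_ace = runs_per_win(4.54, 3.5)       # NL pitcher with a 3.5 FIP

print(f"{al_average:.2f} {nl_average:.2f} {nl_ace:.2f}")  # 10.17 9.81 9.03
```

For a league-average pitcher the two inputs are the same number, so the formula collapses to (League RA + 2) * 1.5.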
Takeaway: A pitcher's run to win conversion as calculated in WAR is customizable (and dynamic)--a better pitcher creates a run environment that is conducive to low scoring and a poor pitching performance begets more offense.
As far as pitching performances and their value go, the home park is a very important factor. After all, a pitcher in Petco would be expected to put up a better pitching line due to the park's run suppression, while a pitcher at Coors would expect worse numbers.
For the purpose of WAR, the park factor is each field's unique run environment expressed as a decimal relative to 1.00 (which is average). A park with a factor below one suppresses runs while a park above one would be conducive to higher scoring games. Petco has a park factor of 0.92, meaning that the park lowers scoring by 8%, while Coors' 1.09 factor means that 9% more runs than average are scored there.
A pitcher's WAR adjustment is calculated using a half season of his home park factor. In the WAR calculation, a park's factor is determined using a five-year regression method to minimize year to year anomalies.
Takeaway: Park factors take into consideration both the home and visiting teams' performances, thus only half of the home park factor is attributable to the home pitcher. For his visiting innings, use the league average number.
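As a rough sketch of the half-season idea, the snippet below blends a pitcher's home park factor with the league-average 1.00 for his road innings. The even 50/50 split and the function name are my assumptions. This is one reading of the takeaway, not necessarily the exact mechanics FanGraphs uses.

```python
LEAGUE_AVERAGE_FACTOR = 1.00


def effective_park_factor(home_park_factor, home_share=0.5):
    """Blend the home park factor with league average for road innings.

    home_share is the fraction of innings pitched at home; 0.5 is an
    assumed even split, not a measured value.
    """
    road_share = 1.0 - home_share
    return home_share * home_park_factor + road_share * LEAGUE_AVERAGE_FACTOR


petco = effective_park_factor(0.92)
coors = effective_park_factor(1.09)
print(f"Petco: {petco:.3f}, Coors: {coors:.3f}")  # Petco: 0.960, Coors: 1.045
```

So a Padres pitcher's overall environment suppresses runs by about 4%, not the full 8% of his home park.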
See the above link for a sample step-by-step, from scratch calculation of WAR. I would do one myself but unfortunately I'm still not able to understand exactly how the mechanics of the park and run conversion factors are applied. If any of my loyal readers are able to decipher the precise reasoning behind Cameron's calculations feel free to inform me in the comments.
Like I said at the top, pitching WAR is a very complex calculation, one which even with thorough examination I've failed to completely understand. What I do know is that once a pitcher's WAR is calculated, the conversion of his production into dollar values is the same as for a position player ($4.5 million per win for this year).
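The dollar conversion, at least, is easy to sketch. The $4.5 million per win figure is from the paragraph above; the 3.0 WAR input is a made-up example, and the function name is mine.

```python
DOLLARS_PER_WIN = 4_500_000  # this year's figure, per the article


def war_to_dollars(war):
    """Convert a WAR total to a dollar value at a flat rate per win."""
    return war * DOLLARS_PER_WIN


# Hypothetical 3.0-WAR pitcher, invented for illustration.
value = war_to_dollars(3.0)
print(f"${value / 1_000_000:.1f} million")  # $13.5 million
```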