To this day, I still marvel at how Sean Smith gave us (and then gave Baseball-Reference) a complete dataset of historical Wins Above Replacement from 1871 to the present. My one teeny-tiny issue with it is that it's not actually 100% complete.
While the data for the American League, National League, Federal League, American Association, Players' League, and even Union Association is complete, the data for the National Association (1871–1875) is not. It's not that the National Association was intentionally left out, either: position player WAR exists for those seasons, but pitching WAR does not.
As I work on the new version of the Hall of wWAR, I just can't ignore these pitchers. I need that WAR. So I decided to estimate it myself, using the same basic methodology as Baseball-Reference's pitching WAR.
So, for the first time ever, here are the National Association career pitching WAR leaders:
(Complete list after the jump.)
Al Spalding is in the Hall of Fame as an executive/pioneer. Like John McGraw (and, in the future, Joe Torre), he was good enough that he really should be in as a player. Now the problem is that people only know Spalding the pioneer or McGraw the manager or Torre the manager. They were each Hall of Fame players. Who's the pitcher most similar to Spalding (in terms of WAR and IP)? That'd be Roy Halladay and his 61.8 WAR in 2531 IP.
Candy Cummings is often criticized as a poor Hall of Fame choice. He is essentially in the Hall because he supposedly invented the curveball (a claim that is widely disputed). Between that debate and the fact that he's in as a pioneer, it really takes the focus off his pitching career (which just may have been Hall-worthy). Cummings' similar pitchers by WAR and IP include Hall of Famers Lefty Gomez, Hoyt Wilhelm, Addie Joss, and Goose Gossage as well as guys on the bubble like Ron Guidry and Sam McDowell. Cummings also had a flashy amateur career before the National Association was formed.
Bobby Mathews famously has the most wins for a pitcher not in the Hall (297). His post-NA WAR of 16.1 looks better when you add his NA numbers (for a total of 59.7 WAR). His closest WAR and IP comps are Tommy John and Mickey Welch. I couldn't come up with better comps if I tried.
Dick McBride was a workhorse pitcher/manager in his five seasons with the Philadelphia Athletics. From 1871 to 1875, he accounted for 89%, 100%, 81%, 100%, and 78% of his team's innings. McBride was worth 36.7 WAR in the National Association while fellow hurler George Zettlein was worth 36.4 WAR. Both pitchers—whom I'd never heard of until this project—had long amateur careers before joining the National Association (dating back to the Civil War).
I should probably explain how I came up with this. I'm no math whiz, but I think this is a decent estimate.
- For each pitcher, I collected his innings pitched and his total runs allowed (all runs, not just earned runs).
- For his team, I collected the total number of innings pitched and Total Zone runs.
- For his league, I collected the average number of innings and average number of runs allowed.
- I found the league average for runs allowed per inning pitched.
- I then calculated the number of runs each pitcher should have allowed (using his innings pitched) if he were a league average pitcher with a league average defense.
- I then split up each team's Total Zone runs and assigned them to each pitcher on the team based on his innings pitched.
- I adjusted the pitcher's runs allowed based on his defense's contribution. If his defense was worth five runs above average, I added those five runs to his runs allowed total (since he theoretically would have allowed them, given an average defense).
- I subtracted his runs allowed (adjusted for defense) from the number of runs he should have allowed, given average performance with average defense. This is his runs above average.
- (Note: This is the inexact science part.) I consulted with Matt Klaassen about how to convert runs above average to runs above replacement. He said that replacement level runs allowed is typically 120 to 130 percent of league average runs allowed. He also warned me that the difference between average and replacement may have been smaller in the 1870s. I went with the low end (120 percent), but the true figure could conceivably be lower still. Any input here is appreciated (and I could re-run the numbers).
- I then used Matt's post about historical run environments to convert from runs above replacement to wins above replacement.
Big thanks to Matt Klaassen for helping me out with that last part.
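For anyone who wants to play with the numbers, here is a minimal Python sketch of the steps above. It is my reading of the process, not the exact spreadsheet behind the leaderboard. The names (`PitcherSeason`, `estimate_war`) are made up for illustration, every input in the example is invented rather than real National Association data, and `runs_per_win` is a placeholder you would need to fill in from the run environment for the season in question.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    """Hypothetical container for the inputs described in the list above."""
    innings: float        # pitcher's innings pitched
    runs_allowed: float   # total runs allowed (earned and unearned)
    team_innings: float   # team's total innings pitched
    team_tz_runs: float   # team Total Zone runs (positive = above-average defense)


def estimate_war(p, league_runs_per_inning, runs_per_win, replacement_factor=1.20):
    """Estimate pitching WAR from runs allowed, team defense, and league context.

    replacement_factor=1.20 is the "120 percent of average" assumption;
    runs_per_win has no default because it depends on the run environment.
    """
    # Runs a league-average pitcher with an average defense would allow.
    expected_runs = p.innings * league_runs_per_inning

    # Split the team's Total Zone runs by the pitcher's share of team innings.
    defense_share = p.team_tz_runs * (p.innings / p.team_innings)

    # Runs the defense saved are added back, as if the defense were average.
    adjusted_runs = p.runs_allowed + defense_share

    runs_above_average = expected_runs - adjusted_runs

    # A replacement-level pitcher allows replacement_factor times average.
    replacement_runs = replacement_factor * expected_runs
    runs_above_replacement = replacement_runs - adjusted_runs

    return {
        "runs_above_average": runs_above_average,
        "runs_above_replacement": runs_above_replacement,
        "war": runs_above_replacement / runs_per_win,
    }


# Invented example: none of these numbers come from a real pitcher or season.
example = PitcherSeason(innings=400, runs_allowed=300,
                        team_innings=500, team_tz_runs=10)
result = estimate_war(example, league_runs_per_inning=0.9, runs_per_win=15.0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these invented inputs, the pitcher comes out at 52 runs above average and 124 runs above replacement. Dropping `replacement_factor` below 1.20 (say, to 1.10) shows how sensitive the final WAR is to the inexact part of the process.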