# Erector Set

I hope that some of you are familiar with my lineup simulator.  If not, take the time to scan this article.  In that article, I discuss a method to assign appropriate weight to OBP and ISO to create a modified version of OPS I oh-so-cleverly called mOPS:

mOPS = 2.5*OBP + ISO

I got a lot of comments that I was weighting OBP too heavily, to which I had the response: OBP is life.  That's not to say that situational hitting, having the right approach, and being aware of the game situation are not important.  But on a gross scale, the ability to avoid making outs is paramount.

One extremely valid point brought up by several readers was that the model determines a player's "worth" by creating a lineup of nine clones of that player.  The argument goes that this model inherently overvalues OBP because a high OBP player will eventually be driven in by his OBP clones, even if the player has only marginal power.

So I fixed it.

I remedied this problem by changing the simulation so that each player is modeled as batting third in a lineup of otherwise league average players.  I define league average on a yearly basis, so that the final numbers are era-adjusted (277/340/452 for AL 2004), but not positional- or park-adjusted.  I'll discuss possible methods for positional and park adjustments later.

I simulated each lineup 5000 times, but instead of looking at the average number of runs scored per game, I generated a run-scoring distribution.  I then projected how many wins these teams would accumulate in a 162 game season, given an average pitching staff, by weighting the run-scoring distribution by the run-dependent winning percentage.

Run-dependent Winning Percentage
2004 AL, averaged over all teams

Runs Scored    Winning Pct
0               .000
1               .080
2               .190
3               .332
4               .477
5               .607
6               .694
7               .739
8               .831
9               .904
10              .913
11              .965
12              .980+

For example, the average AL 2004 team won 33.2% of its games when it scored exactly 3 runs.  If a simulated lineup scored exactly 3 runs 10% of the time, then those games turn into (162)(0.10)(0.332)=5.4 wins.  For the vectorially inclined, I took the dot product of the run-scoring distribution and the run-dependent winning percentage, then multiplied by 162 games.
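As a rough sketch of that weighting in Python, the table above can be stored as a list indexed by runs scored. The function name `projected_wins` and the sample distribution `example_dist` are made up for illustration; the sample is not simulator output. The code also assumes the final ".980+" row applies to every game of 12 or more runs.

```python
# Run-dependent winning percentage, 2004 AL, indexed by runs scored (0..12).
# Assumption: the last entry (".980+") is used for every game of 12 or more runs.
WIN_PCT = [0.000, 0.080, 0.190, 0.332, 0.477, 0.607, 0.694,
           0.739, 0.831, 0.904, 0.913, 0.965, 0.980]

GAMES = 162


def projected_wins(run_dist, win_pct=WIN_PCT, games=GAMES):
    """Weight a run-scoring distribution by win percentage and scale to a season.

    run_dist[r] is the fraction of simulated games in which the lineup
    scored r runs (the last bucket holds 12 or more).
    """
    if len(run_dist) != len(win_pct):
        raise ValueError("run_dist needs one entry per row of the win table")
    if abs(sum(run_dist) - 1.0) > 1e-6:
        raise ValueError("run_dist must sum to 1")
    return games * sum(p * w for p, w in zip(run_dist, win_pct))


# The single term worked out in the text: exactly 3 runs in 10% of games.
three_run_wins = GAMES * 0.10 * WIN_PCT[3]  # about 5.4 wins

# Hypothetical distribution, invented for illustration only.
example_dist = [0.05, 0.09, 0.12, 0.14, 0.14, 0.12, 0.10,
                0.08, 0.06, 0.04, 0.03, 0.02, 0.01]

print(round(three_run_wins, 1), round(projected_wins(example_dist), 1))
```

Swapping in the distribution from 5000 simulated games for `example_dist` would give the season win projection for that lineup.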

The final result, for those of you whose eyes glazed over in the previous two paragraphs, was the number of wins an otherwise average lineup would accumulate in a 162-game season with a given player batting third and an average pitching staff.  To create an easily readable stat, I subtracted the number of wins produced by a league-average player in an otherwise league-average lineup (in other words: a lineup of 277/340/452 hitters) with an average pitching staff.  The difference is a "Wins Above Average" number (WAA).  The league-average lineup picked up 78.7 wins.  WAA is not positional or park adjusted, but it is league/era adjusted.  Although it looks like a cumulative stat, it is actually a rate stat because I have assumed that each hitter plays all 162 games.

If you wanted to know the WAA for every player, I could run a simulation for every single player.  I could also scoop my eyes out with a spoon.  To make everybody's life easier, I simulated the WAA of 30 random players and ran a multiple regression using OBP and ISO as the independent variables:

WAA = 41.9*OBP + 18.9*ISO - 16.9
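For anyone who would rather not do this by hand, here is a minimal Python sketch of the formula. The helper names `iso` and `waa` are only illustrative, and ISO is taken as slugging percentage minus batting average. The example uses Eric Chavez's 2004 OBP and ISO (.397 and .225) from the Oakland table later in this piece.

```python
def iso(avg, slg):
    """Isolated power: slugging percentage minus batting average."""
    return slg - avg


def waa(obp, iso_value):
    """Regression estimate of Wins Above Average from the formula above."""
    return 41.9 * obp + 18.9 * iso_value - 16.9


# Relative weight of OBP to ISO implied by the two coefficients.
obp_to_iso_weight = 41.9 / 18.9  # about 2.2

# Eric Chavez, 2004: .397 OBP and .225 ISO.
chavez = waa(0.397, 0.225)
print(f"{chavez:+.1f}")
```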

The correlation coefficient between the formula and the simulated results is 0.963.  WAA appears to be linear between the extremely good and the extremely bad players.  Also notice that the relative weight of OBP to ISO is 41.9/18.9 = 2.2, as opposed to 2.5 in mOPS.  Those of you who said that mOPS overrated on-base skills and underrated pure power - you were right.

What this means is that a hitter who NEVER gets on base is worth close to -17 wins over the course of a season (Barry Bonds was worth +17 wins last year).  Ladies and gentlemen, your 2004 AL leaders in WAA:

WAA Leaderboard, 2004 AL
Frank Thomas   +6.8
Manny Ramirez  +5.5
Travis Hafner  +5.4
Melvin Mora    +4.9
David Ortiz    +4.7
Gary Sheffield +4.2
Eric Chavez    +4.0

Remember, this is a rate stat.  I suppose you could prorate by number of games played or plate appearances.
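One possible way to prorate, sketched in Python under the assumption that WAA simply scales with the share of the 162-game season played. The function name `prorated_waa` and the 81-game figure are hypothetical, and prorating by plate appearances would need a full-season plate appearance baseline that I have not defined here.

```python
def prorated_waa(full_season_waa, games_played, season_games=162):
    """Scale a 162-game WAA rate by the fraction of the season actually played.

    Assumption: value accrues linearly with games played.
    """
    if not 0 <= games_played <= season_games:
        raise ValueError("games_played must be between 0 and season_games")
    return full_season_waa * games_played / season_games


# Hypothetical: a +6.8 WAA hitter who appears in only 81 games.
half_season = prorated_waa(6.8, 81)
print(f"{half_season:+.1f}")
```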

Let's do a little test: let's recreate the 2004 lineup of a particular team and run the lineup simulator assuming the players play for 162 games.  Then, we'll compare the number of wins the simulator predicts with a simple sum of WAA.  I'll randomly pick - oh, I don't know - the Oakland Athletics with the following lineup:

NAME              OBP/ ISO/ WAA

Mark Kotsay      .370/.145/+1.3
Eric Byrnes      .347/.184/+1.1
Eric Chavez      .397/.225/+4.0
Jermaine Dye     .329/.199/+0.6
Erubiel Durazo   .396/.202/+3.5
Damian Miller    .339/.131/-0.2
Scott Hatteberg  .367/.136/+1.0
Bobby Crosby     .319/.187/+0.0
Marco Scutaro    .297/.120/-2.2

Recall from comments in my last article that the lineup order does not really matter all that much (my results in this area agree with a series of Baseball Prospectus articles).

The sum of all the WAA for this lineup is 9.24, which translates to roughly 88 wins with average pitching (the 78.7-win league-average baseline plus 9.24).  The simulation results in 86.7 wins, which shows that WAA is linear enough that it is a good way to estimate the strength of an offense, given ONLY common rate stats.  So this hot stove season, if you want to see how your team's offense will be affected by adding and subtracting hitters, use WAA.
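The lineup sum can be reproduced in a few lines of Python. This is only a sketch: it applies the regression formula to the OBP and ISO figures in the Oakland table and adds the result to the 78.7-win league-average baseline. The names `LINEUP`, `waa`, and `LEAGUE_AVERAGE_WINS` are mine.

```python
LEAGUE_AVERAGE_WINS = 78.7  # simulated wins for the all-average lineup

# (name, OBP, ISO) for the 2004 Oakland lineup in the table above.
LINEUP = [
    ("Mark Kotsay", 0.370, 0.145),
    ("Eric Byrnes", 0.347, 0.184),
    ("Eric Chavez", 0.397, 0.225),
    ("Jermaine Dye", 0.329, 0.199),
    ("Erubiel Durazo", 0.396, 0.202),
    ("Damian Miller", 0.339, 0.131),
    ("Scott Hatteberg", 0.367, 0.136),
    ("Bobby Crosby", 0.319, 0.187),
    ("Marco Scutaro", 0.297, 0.120),
]


def waa(obp, iso):
    """Regression estimate of Wins Above Average."""
    return 41.9 * obp + 18.9 * iso - 16.9


# Sum the unrounded WAA values, then add the league-average baseline.
total_waa = sum(waa(obp, iso) for _, obp, iso in LINEUP)
estimated_wins = LEAGUE_AVERAGE_WINS + total_waa

print(f"{total_waa:.2f} {estimated_wins:.1f}")
```

Summing the unrounded values gives the 9.24 quoted above. Adding up the rounded WAA column in the table gives 9.1 instead, which is just rounding. To see what a trade or signing does, replace one row and rerun.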

Like all my favorite stats, it is easy to calculate and easy to interpret.  So if you'd rather not look up EqA, VORP, or another of the myriad alphabet soup stats, WAA is a quick-n-dirty way to estimate offensive value.  After the season, I'll post some historical data for the NL as well as the AL.

Postscript: Given the nature of the simulator, park adjustments can be incorporated by using park-adjusted statistics for each player.  I assume these are available on Davenport Translation cards.  The easiest method for positional adjustments that I can think of is to simply calculate WAAs for the entire league, group them by position, and subtract out the average WAA on a position-by-position basis.
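The positional adjustment described above might look something like this in Python. This is a sketch, not something I have run on real league data: the function name `positional_waa` is hypothetical, and the four players and their WAA values are invented purely to show the mechanics.

```python
from collections import defaultdict


def positional_waa(players):
    """Subtract the average WAA at each position from every player's WAA.

    players is a list of (name, position, waa) tuples.
    Returns a dict mapping name to position-adjusted WAA.
    """
    by_position = defaultdict(list)
    for _, position, value in players:
        by_position[position].append(value)

    # Average WAA for each position across the league.
    position_mean = {pos: sum(vals) / len(vals) for pos, vals in by_position.items()}

    return {name: value - position_mean[pos] for name, pos, value in players}


# Invented sample data, for illustration only.
sample = [
    ("Shortstop A", "SS", 0.0),
    ("Shortstop B", "SS", -2.0),
    ("First Baseman C", "1B", 3.0),
    ("First Baseman D", "1B", 1.0),
]

for name, adjusted in positional_waa(sample).items():
    print(f"{name:16s} {adjusted:+.1f}")
```

In this toy example a shortstop with a raw WAA of 0.0 ends up level with a first baseman at +3.0, because each is one win better than the average at his position.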