Beyond the Box Score: An SB Nation Community


Determining the Best Runs/Win Formula

At some point Marc is going to start yelling at me for posting these technical, boring posts on BtB. Until he does, I'll continue writing about things like the best runs/win estimators, which are the topic of today's post. In a nutshell, the point of these methods (and there are a bunch) is to convert a player's marginal runs (like runs above average or runs above replacement) into wins (above average or above replacement). Runs are a nice measurement, but baseball is all about wins and losses.


The most famous and most often used runs-to-wins converter is Pete Palmer's, published in The Hidden Game of Baseball and used in his Batter-Fielder Wins system. That converter is runs/win = 10*SQRT(RPG/9). RPG is runs per game, the average number of runs scored in a game by both teams combined. For example, in last year's American League, the average team scored 4.76 runs per game (and allowed the same), meaning that the RPG in the 2005 AL was 9.52. Plugging that into Palmer's runs/win converter, we get 10.28, meaning that the average team would need to score 10.28 extra runs to get one extra win, according to Palmer.

Is that right? We'll get to that in a second. First, let's quickly run down the other runs/win converters out there. There are three in particular that I am familiar with: one from BaseRuns creator David Smyth, and two from noted baseball stat guy Tangotiger. Smyth's formula is the simplest: runs/win = RPG. It's not meant to be exactly correct, just easy to use. In fact, if we look at last year's AL, it tells us that we would need 9.52 marginal runs to add an extra win, which isn't all that different from Palmer's formula (though even if the difference between the two won't be more than half a win for any player, half a win is still worth almost $2 million on the free agent market).

Tangotiger's two formulas are pretty similar. The first is runs/win = .8*RPG + 2.4. That gives us a runs/win value of 10.02 for last year's AL, right between Smyth and Palmer, though closer to the latter. The second is runs/win = .7*(RPG + 5), which gives us a runs/win value of 10.16. So both of Tangotiger's formulas land closer to Palmer's than to Smyth's, though both give lower values than Palmer's does.
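For anyone who wants to check the arithmetic, here is a minimal Python sketch of the four converters described so far. The function names are my own labels, not anything the formulas' authors use, and RPG is taken to be the combined runs per game of both teams.

```python
import math

def palmer(rpg):
    # Pete Palmer: runs/win = 10 * SQRT(RPG / 9)
    return 10 * math.sqrt(rpg / 9)

def smyth(rpg):
    # David Smyth's quick rule: runs/win = RPG
    return rpg

def tango_1(rpg):
    # Tangotiger's first formula: runs/win = .8 * RPG + 2.4
    return 0.8 * rpg + 2.4

def tango_2(rpg):
    # Tangotiger's second formula: runs/win = .7 * (RPG + 5)
    return 0.7 * (rpg + 5)

# 2005 AL: the average team scored and allowed 4.76 runs per game.
rpg_2005_al = 2 * 4.76

for name, estimator in [("Palmer", palmer), ("Smyth", smyth),
                        ("Tango 1", tango_1), ("Tango 2", tango_2)]:
    print(f"{name:8s}{estimator(rpg_2005_al):6.2f}")
# Palmer   10.28
# Smyth     9.52
# Tango 1  10.02
# Tango 2  10.16
```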

Now what is the correct value? How about I tell you first and then I explain? The correct value happens to be 9.97, which is best approximated by Tango's first formula. How did I determine that? Simple (or not so simple, depending on how you feel about calculating Pythagorean records with custom exponents).

The Pythagorean record is a team's expected record based on the number of runs it scores and allows. It takes the following form: W% = Runs Scored^Exponent/(Runs Scored^Exponent + Runs Allowed^Exponent). When Bill James developed the Pythagorean formula, he simply used an exponent of two. Since then, however, it has been shown that there are better exponents, and that the exponent in fact depends on the run environment. To determine the correct exponent based on the run environment, Smyth and a stat wonk who goes by US Patriot developed the Pythagenpat formula, which is Exponent = RPG^.287. Powers between .278 and .287 have all been shown to work as well, but I like to stick to .287. It doesn't really change your answer very much, no matter which one you use.
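As a rough illustration, the Pythagorean record with a Pythagenpat exponent can be written in a few lines of Python. This is only a sketch: the function names and the games argument are my own additions, with games there so that season totals can be passed in instead of per-game averages.

```python
def pythagenpat_exponent(runs_scored, runs_allowed, games=1, power=0.287):
    # Exponent = RPG^.287, where RPG is combined runs per game.
    rpg = (runs_scored + runs_allowed) / games
    return rpg ** power

def expected_win_pct(runs_scored, runs_allowed, games=1, power=0.287):
    # W% = RS^x / (RS^x + RA^x), with x taken from the run environment.
    x = pythagenpat_exponent(runs_scored, runs_allowed, games, power)
    rs = runs_scored ** x
    ra = runs_allowed ** x
    return rs / (rs + ra)

# An average 2005 AL team: 4.76 runs scored and allowed per game.
print(pythagenpat_exponent(4.76, 4.76))   # a bit over 1.9
print(expected_win_pct(4.76, 4.76))       # 0.5

# Season totals work too, as long as games is supplied.
print(expected_win_pct(800, 700, games=162))
```

Passing a different power (say .278) is an easy way to see how little the choice moves the answer.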

Anyway, what's cool is that using this formula, we can determine the number of runs it takes to gain a marginal win in any run environment. Here's the simplest way to do so (not quite mathematically exact, but more than close enough): Take an average team in that run environment. For last year's American League, that would be a team that scores 4.76 runs a game and allows the same. Now add a very small number of runs, say .001 per game, to its offense. How many more games will it win? Well, doing the math, we expect the team to have a .5001 W%, rather than .500 (we actually need to use more decimal places for maximum accuracy). So that's .0001 more wins per game than expected. So how many runs would we have to add to get one more win than expected? Well, simply divide 1 by .0001 and you get 10,000. Then multiply that by .001 (because remember, we added .001 runs to the offense, so really what this is saying is that we would need to add .001 runs 10,000 times for an extra win). The answer is 10. A team in last year's American League would need to score 10 extra runs to win one extra game. Remember, the actual answer is 9.97 if we don't do any rounding.
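The same small-nudge calculation is easy to run without rounding. The sketch below assumes the Pythagenpat exponent of .287 and a nudge of .001 runs per game; the function names are mine.

```python
def win_pct(runs_scored, runs_allowed, power=0.287):
    # Pythagorean W% with a Pythagenpat exponent (per-game run rates).
    x = (runs_scored + runs_allowed) ** power
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)

def marginal_runs_per_win(rpg, delta=0.001, power=0.287):
    # Start from an average team, add delta runs per game to its offense,
    # and see how much its expected W% rises above .500.
    r = rpg / 2
    gain = win_pct(r + delta, r, power) - win_pct(r, r, power)
    # delta runs buy `gain` wins, so one win costs delta / gain runs.
    return delta / gain

print(f"{marginal_runs_per_win(9.52):.2f}")  # 9.97
```

Shrinking delta further barely moves the result, which is the sense in which the shortcut is close enough.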

Using this method, and the other estimators mentioned, I've done the math for every run environment between 1 and 20 RPG. Here is a graph of how the estimators compare:

You can see that they are all very close when it comes to run environments that baseball is actually played in, which is why they are all usable. Nevertheless, their weaknesses are obvious if we look at the graph. Every estimator except for Smyth is too high at very low RPG levels, while Smyth is way too low. On the other hand, Smyth's estimator over-predicts once we get past 11 RPG. This is because his formula is linear, while the number of runs it takes to get a marginal win is not. Nevertheless, at least his formula is simple.

Palmer's formula is also terrible for weird RPG ranges. It over-predicts the number of runs needed to add a marginal win at the low RPG ranges, and way under-predicts at high RPG ranges. Essentially, it is only usable in normal ranges (though on the other hand, that really is the only place we ever use these formulas anyways).

Tangotiger's two formulas hold up better, though they have their own problems. His first formula gets very close to the true number at 4 or 5 RPG, but it begins to drift away at around 13. His second formula over-predicts badly at low RPG ranges, but is very close to the truth at high RPG ranges.
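To see the same pattern in numbers, the sketch below tabulates each estimator's error (estimate minus the Pythagenpat-based value) at a handful of run environments. The sample RPG values and the names are my own choices, and the "truth" is the .001-run nudge method with a .287 exponent.

```python
import math

def win_pct(runs_scored, runs_allowed, power=0.287):
    x = (runs_scored + runs_allowed) ** power
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)

def true_rpw(rpg, delta=0.001):
    # Marginal runs per win for an average team at this RPG.
    r = rpg / 2
    return delta / (win_pct(r + delta, r) - 0.5)

ESTIMATORS = {
    "Palmer": lambda rpg: 10 * math.sqrt(rpg / 9),
    "Smyth": lambda rpg: rpg,
    "Tango 1": lambda rpg: 0.8 * rpg + 2.4,
    "Tango 2": lambda rpg: 0.7 * (rpg + 5),
}

def errors_at(rpg):
    # Positive means the estimator over-predicts runs per win.
    truth = true_rpw(rpg)
    return {name: f(rpg) - truth for name, f in ESTIMATORS.items()}

for rpg in (1, 5, 9, 13, 20):
    row = "  ".join(f"{name} {err:+.2f}" for name, err in errors_at(rpg).items())
    print(f"RPG {rpg:>2}: true {true_rpw(rpg):5.2f}  {row}")
```

At 1 RPG only Smyth comes out negative, and at 20 RPG Palmer is the one that falls short.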

However, since you're really only going to use any of these formulas to evaluate players playing in real contexts, let's look at how closely these formulas track the truth in real run environments, between 8 and 11 RPG:

First, it's interesting to note how all the formulas converge at 11 RPG. You can see that Smyth's formula, while the simplest, is also the worst. Palmer's formula isn't as good as I would have thought it to be either. Tango's second formula is better, but his first takes the cake. It tracks the true number almost exactly. So when you want to convert a player's marginal runs into wins, it's best to use runs/win = .8*RPG + 2.4, in lieu of the true number.
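One hedged way to put a number on "tracks the true number" is the average absolute miss between 8 and 11 RPG. The sketch below uses a grid in steps of .1 RPG, which is my own choice, and the same .001-run nudge with a .287 exponent as the yardstick.

```python
import math

def win_pct(runs_scored, runs_allowed, power=0.287):
    x = (runs_scored + runs_allowed) ** power
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)

def true_rpw(rpg, delta=0.001):
    r = rpg / 2
    return delta / (win_pct(r + delta, r) - 0.5)

ESTIMATORS = {
    "Palmer": lambda rpg: 10 * math.sqrt(rpg / 9),
    "Smyth": lambda rpg: rpg,
    "Tango 1": lambda rpg: 0.8 * rpg + 2.4,
    "Tango 2": lambda rpg: 0.7 * (rpg + 5),
}

# Run environments from 8.0 to 11.0 RPG in steps of 0.1.
GRID = [8 + i / 10 for i in range(31)]

def mean_abs_error(estimator):
    return sum(abs(estimator(rpg) - true_rpw(rpg)) for rpg in GRID) / len(GRID)

# Best to worst by average miss, in runs per win.
ranking = sorted(ESTIMATORS, key=lambda name: mean_abs_error(ESTIMATORS[name]))
for name in ranking:
    print(f"{name:8s}{mean_abs_error(ESTIMATORS[name]):.2f}")
```

On this grid the order comes out Tango 1, Tango 2, Palmer, Smyth, the same order described above.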

In reality, however, none of these runs/win formulas is going to give you a very exact answer, and here's why. Each of these formulas (including the correct one) is based on an average context. They answer the question, "how many runs do you need to score to give an average team one extra win?" However, a player affects his own context, and if he's a good player, his effect is large enough to screw up these calculations. We added .001 runs to determine the correct formula; Albert Pujols adds 70 runs. Pedro Martinez takes away 30 runs. By virtue of being themselves, Pujols and Pedro change their teams' runs/win converters. In reality, we need to account for that fact as well. But that's the topic of a whole other article.


Comments


Good work David - as usual
I guess the question is even in environments where a single player dominates a team to what extent does this screw up the calc: how meaningful is it?

I should really work that out myself but I am feeling lazy and watching the Soccer World Cup

by John Beamer on Jun 25, 2006 3:03 PM EDT

Thanks John
Let's take a league where an average team scores 4.5 runs per game, and allows the same. Now let's add an extreme player who is 70 runs above average. So all of a sudden, that team is scoring 4.932 runs a game. Its expected W% will be 0.543537166, making it 7.053 wins above average. So that player's runs/win converter is 70/7.053 = 9.92. The runs/win converter for an average team at 9 RPG is 9.58, so we would end up assigning that player .25 more wins than we should. So for the very, very best players, the impact is about a quarter of a win.

by David Gassko on Jun 26, 2006 12:40 AM EDT
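The numbers in the reply above can be reproduced with a short Python sketch. It assumes a 162-game season and the Pythagenpat exponent of .287; the variable and function names are mine.

```python
GAMES = 162

def win_pct(runs_scored, runs_allowed, power=0.287):
    x = (runs_scored + runs_allowed) ** power
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)

def average_runs_per_win(rpg, delta=0.001):
    # Runs per win for an average team, from a tiny nudge to the offense.
    r = rpg / 2
    return delta / (win_pct(r + delta, r) - 0.5)

base = 4.5          # runs scored and allowed per game by the average team
player_runs = 70    # the extreme player's runs above average over a season

# Spread the player's runs over the season and re-run the Pythagorean record.
boosted = base + player_runs / GAMES
wins_added = (win_pct(boosted, base) - 0.5) * GAMES

player_rpw = player_runs / wins_added           # the player's own converter
average_rpw = average_runs_per_win(2 * base)    # the average-context converter
overcredit = player_runs / average_rpw - wins_added

print(f"wins added:           {wins_added:.3f}")
print(f"player runs/win:      {player_rpw:.2f}")
print(f"average runs/win:     {average_rpw:.2f}")
print(f"extra wins credited:  {overcredit:.2f}")
```

Changing player_runs to a negative number gives the matching picture for a player who takes runs away.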

Interesting ...
OK - that figures given data I have seen elsewhere. It is enough to impact the run/win calculation - which is as expected. After all if an extreme player can't create more wins (and your equation is modeling correctly) what is the point of having him!

by John Beamer on Jun 26, 2006 2:32 PM EDT

Slight Variation
Start with a league average team, then
calculate the increase in runs from
the PPE formula that results in 82 wins.
What you're doing is approximating this
number using differentials.

I get 10.04 for the AL, incidentally.

-Chris

by Eternal on Jun 26, 2006 4:49 PM EDT

Chris,
Here's why that's not quite right. You're changing the context. It is no longer an average context. What if you add not one win, but 50? You'll find that the r/w is 15 (or whatever; I don't know the actual number). If you add one win, the context is altered less, but it's still altered. In fact, even .001 runs alters the context, just not enough to really change the numbers (if we're going to two decimal places). Tangotiger wrote about this best when discussing BaseRuns:

"Throw a pebble in a pond, and you will see some effects. The water will ripple on the surface in a particular pattern, the pebble will hit the bed causing whatever is on the floor to move to some degree. In all though, the equilibrium of the pond system has barely been altered. Drop a watermelon in an aquarium, and there will be much more changes. So much more changes that this aquarium system is no longer an aquarium system, but rather an aquarium system with a watermelon in it, and a new equilibrium will be reached based on the existence of the watermelon."

by David Gassko on Jun 26, 2006 5:37 PM EDT
