Beyond the Box Score: An SB Nation Community


Blez has a question, we have an answer

Our supreme overlord and all-around good guy Blez recently asked the BtB All-Stars:

You know something I was thinking about the other day...people always talk about how in a 162-game season, the cream always rises to the top. I was wondering, 162 seems like such an arbitrary number to say that about. How do we know if we didn't make it 200 or an even 150 or maybe even 125 that the best teams will always win out? I was thinking that could be a good topic for your site. If it went to 200 games, perhaps the Indians win out. Or even less, the A's were in first place in August. I don't know, just a thought and I felt it would be an interesting topic to look at.  The assumption is that the best team will always win out after 162 games.  But why is that assumption made?

The answer to Blez's question, of course, lies in the Pythagorean theorem and the length of the season.  Most work that has investigated the distribution of wins around the Pythagorean expectation is based on empirical data.  In a way, this is the way things ought to be, since baseball is an empirical game.

But sometimes it is fun - and informative - to skip past all of the empiricism and do some thought experiments.  Blez's question immediately brought to mind a couple that we could do together.


Question the First: How do we know that the Pythagorean theorem is right?  

I've talked a lot recently about run distributions, so you should know by now that the distribution that describes the way teams score and allow runs is called the Weibull distribution.  Using some mathematical gymnastics described in Professor Miller's paper, we can show that, on average, a team that scores an average of RS runs and allows an average of RA runs will have a winning percentage described by the now famous Pythagorean theorem:

Win% = RS^2 / (RS^2 + RA^2)

(Bill James's classic version uses the exponent 2; Professor Miller's derivation gives the same form, with the exponent set by the shape of the Weibull curve.)
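For readers who like to poke at numbers, the Pythagorean formula fits in a few lines of Python. This is only a sketch: the function name `pythagorean_win_pct` and the default exponent of 2 (the classic Bill James version) are assumptions made for this example, not something taken from Professor Miller's paper.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from average runs scored and allowed.

    The exponent of 2 is the classic choice; it is an assumption here,
    and other values can be passed in.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# A team that scores and allows 4 runs per game projects to .500.
even_team = pythagorean_win_pct(4.0, 4.0)

# A team that scores 5 and allows 4 projects to 25/41, a bit over .600.
good_team = pythagorean_win_pct(5.0, 4.0)

print(f"{even_team:.3f} {good_team:.3f}")
```

Passing a different `exponent` shows how sensitive the projection is to that one choice.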

Question the Second: But teams don't always have records that line up with their Pythagorean winning percentage.  For a team that allows and scores runs as described by the Weibull distribution, how likely is it that they will actually end up with their Pythagorean record?

Let's do our first thought experiment.  Imagine a team that allows and scores, on average, four runs per game.  The distribution with which they allow or score these runs is described by the Weibull curve (although, as it turns out, that won't matter).  I reach into the runs allowed distribution and pick out a number: I get four.  Now I reach into the runs scored distribution and pick out another number: I get five.  This team has just "won" the first game of our thought experiment, 5-4.  We can repeat this a million times, and since the team, on average, scores 4 runs and, on average, allows 4 runs, they will end up winning 50% of their games as predicted by Ancient Greek and noted baseball fan Pythagoras.
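The reach-into-the-distribution game above can be played by a computer. The following is a minimal sketch, not the method behind any figure in this post: the Weibull shape of 1.8 is an assumed, illustrative value, the function name `simulate_win_pct` is made up for the example, and runs are drawn as continuous values rather than whole numbers, which conveniently rules out ties.

```python
import math
import random


def simulate_win_pct(mean_scored, mean_allowed, games, shape=1.8, seed=0):
    """Play `games` imaginary games by drawing runs from two Weibull curves.

    shape=1.8 is an assumed, illustrative Weibull shape parameter.
    """
    rng = random.Random(seed)
    # The mean of a Weibull is scale * Gamma(1 + 1/shape); solve for scale.
    gamma_term = math.gamma(1.0 + 1.0 / shape)
    scale_scored = mean_scored / gamma_term
    scale_allowed = mean_allowed / gamma_term

    wins = 0
    for _ in range(games):
        scored = rng.weibullvariate(scale_scored, shape)
        allowed = rng.weibullvariate(scale_allowed, shape)
        # Continuous draws essentially never tie, so every game has a winner.
        if scored > allowed:
            wins += 1
    return wins / games


# The more games we play, the closer the 4-runs-each team gets to .500.
for n in (162, 1_000, 10_000, 100_000):
    print(n, round(simulate_win_pct(4.0, 4.0, n), 4))
```

The short runs wander noticeably away from .500 while the long ones settle down, which is the point of the next few paragraphs.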

But what if we repeat this only 10,000 times?  Or 1000?  Or 162?  Teams won't always meet their Pythagorean fate, but most come close.  It's easy to imagine that there is a distribution around a Pythagorean record, and that this distribution has a certain width.  Those with a statistics background will recognize that the distribution around the Pythagorean projection will be normal (shaped like a bell-curve) as a consequence of the Central Limit Theorem.  The width of this curve is defined by the standard deviation, which basic statistics tells us is sqrt[p x (1 - p)/N] for a team with true winning percentage p over N games.  For our .500 team that works out to 1/[2 x sqrt(N)], and it stays very close to that for any realistic team.

[Aside: Yes, we are assuming that runs scored and runs allowed are independent, which is not strictly the case, but is true enough for our experiment to illustrate a point.]
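The 1/[2 x sqrt(N)] width is easy to check by brute force. The sketch below simulates many seasons for a coin-flip (.500) team and compares the observed spread of winning percentage with the formula; the function names, the 2,000-season count and the seed are all arbitrary choices for illustration.

```python
import math
import random
import statistics


def season_win_pct(games, rng, true_win_prob=0.5):
    """Winning percentage of one simulated season of independent games."""
    wins = sum(1 for _ in range(games) if rng.random() < true_win_prob)
    return wins / games


def observed_spread(games, seasons=2000, seed=1):
    """Standard deviation of winning percentage across simulated seasons."""
    rng = random.Random(seed)
    results = [season_win_pct(games, rng) for _ in range(seasons)]
    return statistics.pstdev(results)


# Compare the formula with the simulated spread for a few season lengths.
for n in (81, 162, 1000):
    predicted = 1.0 / (2.0 * math.sqrt(n))
    print(n, round(predicted, 4), round(observed_spread(n), 4))
```

The two columns should agree to within simulation noise, and both shrink as the season gets longer.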

Let's look at a graph.

The peak at x = 0.500 shows that the most probable situation is that, after 81 games, a team will have a .500 record.  But the shaded area (the area under the curve, or the integral) shows that there is an 11.1% probability that a team whose "true" ability lies at .500 can go 46-35 (or better) after 81 games, on pace for 92 (or more) wins and a possible playoff berth. That's not insignificant.  On the flip side, there is an 11.1% chance that this team will stumble to a 35-46 (or worse) record after half a season and the GM will start trading off expensive veterans.

After 162 games, however, there is only a 4.2% chance that a "true" .500 team will pull off a 92-win season or greater (once again, the shaded area).  

Notice how the peak is sharper and the shaded area accounts for less of the overall area under the curve.  As the peak becomes sharper and sharper, the shaded area will account for fewer and fewer of the possible outcomes.
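The shaded areas can be approximated without drawing anything. Here is a rough sketch using the bell-curve approximation with a width of 1/[2 x sqrt(N)] and no continuity correction; both of those are assumptions about how the areas were computed, and the function name `chance_of_record_or_better` is invented for the example.

```python
import math


def chance_of_record_or_better(wins, games, true_win_pct=0.5):
    """Normal-approximation chance of at least `wins` wins in `games` games.

    Assumes a width of 1/[2 x sqrt(N)] (a team near .500) and applies
    no continuity correction.
    """
    sigma = 1.0 / (2.0 * math.sqrt(games))
    z = (wins / games - true_win_pct) / sigma
    # Upper-tail area of the standard normal curve.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


half_season = chance_of_record_or_better(46, 81)
full_season = chance_of_record_or_better(92, 162)
print(f"{half_season:.1%} {full_season:.1%}")
```

Under these assumptions the two values come out at about 11.1% and 4.2%, in line with the shaded areas described above. An exact binomial count would give somewhat different numbers.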

To begin to answer Blez's question, the reason why people say that the cream rises to the top is because, in a sense, it does over the long haul.  In the year 2020, when Bud Selig's sentient toupee has extended the regular season to 1,000 mega-games per hyper-season, we will find that there is only about a .00086% chance (roughly 1 in 116,000) that a "true" .500 team plays the equivalent of 92-win ball (568 uber-wins in the future).  The longer the season, the narrower the distributions get, since, as I said earlier, the standard deviation of winning percentage is 1/[2 x sqrt(N)] for a .500 team (and very nearly that for any realistic team).  That's why baseball has a long regular season - to reduce the chance that a truly mediocre team can come out looking like a playoff contender.

Question the Third: How well does a 162-game season do in separating the good teams from the bad teams?  The excellent teams from the good teams?

Let's do our second thought experiment.  Imagine a good team (a 90 Pythagorean wins team) and a terrible team (a 90 Pythagorean loss team).  Neglect, if you will, things like in-season injuries, trades, and desperate Jose Lima signings in the middle of the year. We'll have to neglect the fact that one team's wins are not independent from another team's wins since they actually play each other.

What is the likelihood that the cream does not rise to the top - that is, how likely is it for the terrible team to finish with a better record than the good team?  If the two teams' winning percentages are independent of each other, then we can compute the probability that the good team will finish with a worse record than the terrible team over 162 games.

For example, we can compute the probability that the good team will have a winning percentage of exactly .500 (1.3%) and that the poor team will have a winning percentage of exactly .510 (1.1%).  The probability that both of these events occur simultaneously is the product of these two probabilities (1.1% x 1.3% = .014%). We can do this for all the different situations in which the bad team finishes with a better record than the good one - all infinity of them - and add them up. The end result is the (nerd alert!) integral:

P(terrible >= good) = Integral[x = 0 to 1] f1(x) * ( Integral[y = x to 1] f2(y) dy ) dx

where f1(x) is the probability distribution that the good team has winning percentage x and f2(y) is the probability distribution that the terrible team has winning percentage y.  (For those of you who aren't familiar with integrals, well, good luck in your love life.  Us Sabermetric Spocks get unbelievable booty.)

Using this equation we can compute the probability that the terrible team has an equal or better record after 162 games as 2.3%. Now let's take Blez's hypotheticals: what if the season were only 100 games long?  The probability that the terrible team has an equal or better record goes to 5.6%.  And 200 games?  1.3%.  Once again, we can see the value in playing a long season.
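The add-them-all-up step can be sketched numerically. The version below treats each team's winning percentage as a bell curve with the same width, 1/[2 x sqrt(N)], centered on its Pythagorean pace, and integrates with a simple midpoint rule over a window ten widths either side of the good team's pace rather than exactly 0 to 1. Those modelling choices and the function names are assumptions for illustration, so the output may differ slightly in the last digit from the figures quoted above.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of the bell curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_upper_tail(x, mean, sd):
    """Chance that a normal variable is at least x."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def upset_chance(good_wins, bad_wins, games, pace_games=162, steps=20_000):
    """Integrate f1(x) * P(bad team's winning percentage >= x) over x."""
    good_mean = good_wins / pace_games
    bad_mean = bad_wins / pace_games
    sd = 1.0 / (2.0 * math.sqrt(games))  # assumed equal for both teams
    low, high = good_mean - 10.0 * sd, good_mean + 10.0 * sd
    dx = (high - low) / steps
    total = 0.0
    for i in range(steps):
        x = low + (i + 0.5) * dx  # midpoint rule
        total += normal_pdf(x, good_mean, sd) * normal_upper_tail(x, bad_mean, sd) * dx
    return total


# A 90-win team against a 90-loss team over different season lengths.
for n in (100, 162, 200):
    print(n, f"{upset_chance(90, 72, n):.1%}")
```

Because both curves are assumed normal, the same answer also falls out of a single tail area for the difference of the two winning percentages, which makes a handy cross-check on the integration.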

Another thought experiment - imagine a very good team, one that plays at a Pythagorean level of 95 wins, in a pennant race with another very good team that plays at a Pythagorean level of 98 wins.  Over a 162-game season, the better team only finishes with a better record 63% of the time.  That is a huge chunk of hypothetical seasons - more than 1 in 3 - in which the better team does not finish with the better record, by mere chance alone!  We can plot the probability that the lesser team finishes with an equal or better record than the better team as a function of season length.

Yes, even after 1000 games, the "true 95 win team" still stands better than a 1 in 5 chance of finishing with a better record.  In this case, to be 99% positive that the better team finishes with a better record, you would have to play 8100 games.
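The curve described above, and the season length needed for a given level of confidence, can be sketched from the same bell-curve picture. This assumes both teams share a width of 1/[2 x sqrt(N)], so the difference between their winning percentages has standard deviation sqrt(1/(2N)); the function names are made up for the example.

```python
import math
from statistics import NormalDist


def lesser_team_chance(better_wins, lesser_wins, games, pace_games=162):
    """Chance the lesser team finishes with an equal or better record."""
    gap = (better_wins - lesser_wins) / pace_games
    # Standard deviation of the difference between two winning percentages,
    # each assumed to have width 1/[2 x sqrt(N)].
    sd_of_gap = math.sqrt(1.0 / (2.0 * games))
    return 1.0 - NormalDist().cdf(gap / sd_of_gap)


def games_needed(better_wins, lesser_wins, confidence, pace_games=162):
    """Smallest season length at which the better team wins out that often."""
    gap = (better_wins - lesser_wins) / pace_games
    z = NormalDist().inv_cdf(confidence)
    return math.ceil(0.5 * (z / gap) ** 2)


for n in (162, 1000):
    print(n, f"{lesser_team_chance(98, 95, n):.1%}")
print(games_needed(98, 95, 0.99))
```

Under these assumptions the lesser team's chance is around 37% at 162 games and a little over 20% at 1000, and the 99% threshold lands at just under 8,000 games, the same neighborhood as the figure quoted above. The exact number depends on the cutoff and rounding used.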

The answer to Blez's question is - and I know this will irritate a lot of you - "Because of the sample size."  By using some binomial statistics, we can show that the best team, Pythagorically speaking, does not always finish with the best record, and that the probability that they do not is non-trivial.  We know that some teams escape their Pythagorean fate, and we have some evidence that leveraging relievers is one way to outperform your Pythagorean projection.  Combined with random happenstance, it's not unusual at all that a good team can slip into the playoffs while a slightly better team stays home in October.  The more games teams play, the less frequently this will occur.

(Side note: The Indians had 96 Pythagorean wins, and the White Sox had 91.  In about 30% of 162-game seasons, a 91 Pythagorean win team will come out ahead of a 96 Pythagorean win team.  Of course, the White Sox were the ones hoisting the trophy and flashing the bling last week, and I don't think they give a whit about Pythagoras.)

Question the Last: You smell bad.
Yeah, but I can integrate like a motherf---er.


Comments


How about what happens
when you compare 154 game seasons to 162 game seasons? Just sayin'.

Looks like we have some strong evidence to not only support the wild card playoff format, but to expand it. How about we get cyber-selig to add 1 playoff team per league, the team with the best pythagorean record that isn't in the playoffs!

by cephyn on Apr 10, 2006 6:37 PM EDT

154 vs 162
It is a fair question, but it doesn't change the answer by much.

How about we get cyber-selig to add 1 playoff team per league, the team with the best pythagorean record that isn't in the playoffs!

I know the idea is part joke, but I do think that one of the exciting things in baseball, for me, is to see unusual things happen.  Since I happen to be a stat-fan, I actually like to see things like good Pythagorean teams get beat out (unless they're the A's!) by lesser teams.

by salb918 on Apr 11, 2006 11:12 AM EDT

I forgot to tell you
by how much the answer changes.

For the 98 Pyth win team versus the 95 Pyth win team, the lesser team will finish with an equal or better record 37.1% of the time in a 162 game season and 37.4% of the time in a 154 game season.

By way of comparison, over 6 games (i.e., the season's first week), a 90 win team will be matched or outperformed (in independent contests) by a 90 loss team 34% of the time.

by salb918 on Apr 11, 2006 11:16 AM EDT

that last line
scares the cubs fan in me!

by cephyn on Apr 11, 2006 12:33 PM EDT

born and raised.
tragic really. i may never forgive my father and grandparents.

by cephyn on Apr 11, 2006 1:53 PM EDT

That's okay.
My dad raised me a Buffalo Bills fan in the mid to late eighties.  That didn't work out so hot, either.

by salb918 on Apr 11, 2006 2:01 PM EDT

Acknowledgements
I do need to acknowledge my fiancee, who used her superior math skills to help me when I got stuck (I'm not nearly so smart).

by salb918 on Apr 11, 2006 4:53 PM EDT
