The beauty of small samples

Ah, the joys of early April. The baseball season is not even a dozen games old and already the beacon of statistical randomness is shining brightly. Would even the most ardent Milwaukeean have predicted that the Brewers would start 5-0, or that as of April 12th those perennial NL Central underachievers, the Cincinnati Reds, would have ratcheted up a .726 win percentage? No. But that is the beauty of baseball. Analyze only a few games and a whole range of bizarre conclusions can be formed.

Sabermetricians usually despise small sample sizes, but for the next few hundred words at least, we'll make a virtue of them. We'll look through our small-sample lens at how the season will shake out if the trends that we have witnessed over the last couple of weeks continue for the next 140 games. Will the Yankees continue to dominate the AL East? Can the Braves rack up division title number 15? Will the Royals burst through the magical 63-win barrier? Read on and find out.

Take a look at the AL standings as they are today (April 12th):

OK, calculating the Pythagorean projections the standings look like this:

There are no surprises in the AL East: the Yankees and the Red Sox come out on top. Even though the Yankees are only at a .429 win%, on a Pythagorean basis they will win yet another divisional title, largely because they scored a bunch of runs against the A's on opening day. Take pity on the Red Sox again. Despite rocketing to a 6-1 start they only win 107 games, come second, and miss out on the wild card! Perhaps the famous Curse is poised to rear its head again. At the bottom of the table the mighty Devil Rays slump to 110 losses. Hopefully a change of name will help next year!

In the AL Central the surging Tigers are on schedule to win an astonishing 130 games, having scored twice as many runs as they have allowed! Don't you just love small samples? The Indians also look in good shape and should win over 110 games which, on current form, is good enough to secure a trip to the postseason via the wild card. Incidentally, the Tribe was my preseason pick to win the World Series, so from a purely selfish point of view it is good to see them start strongly. Who is bringing up the rear? Hey-ho, it is the hapless Royals. They look to be coasting to another 100-loss season, which will disappoint Ron at breaking-100 - a blog which tracks Kansas City's almost doomed attempt to win 63 games this season. Come on, it should be easy; they are only 2 games back from .500!
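For anyone who wants to check the Tigers' number, here is a minimal sketch of the classic Pythagorean calculation. It assumes the standard exponent of 2 and a straight projection over a full 162-game schedule, which may not be exactly how the tables above were built. The function names and the run totals (60 scored, 30 allowed) are hypothetical, chosen only to match the "twice as many runs" ratio.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected win percentage: RS^x / (RS^x + RA^x).

    exponent=2.0 is the classic value; it is an assumption here,
    not necessarily the one used for the projections in this article.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the expected win percentage to a full season (162 games assumed)."""
    return round(games * pythagorean_win_pct(runs_scored, runs_allowed, exponent))


# Hypothetical totals for a team that has scored twice what it has allowed.
print(pythagorean_win_pct(60, 30))  # 0.8
print(projected_wins(60, 30))       # 130
```

Only the ratio of runs scored to runs allowed matters, so a single blowout in the first week can move the projection by dozens of wins.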

Before the season began many analysts had pegged the AL West as one of the toughest divisions in baseball. Both the A's and the Angels were valid World Series picks, the Rangers had bolstered their rotation and the Mariners had upgraded at several positions. So who would have thought that, a mere 8 games in, the A's would project to win the division with a .476 winning percentage? That's right, every team will post a losing record. And those poor folk from Texas are going to be commiserating with each other 110 times. Ouch.

Now check out the current NL standings (again, as of April 12th):

And the Pythagorean projections:

Will the Braves rack up division title number 15 in the NL East? Not according to our trusty friend Pythagoras. They are projected to finish over 30 games behind the Mets, who look set for the World Series with the best record in the NL! Nah, I don't believe this one. Until the Braves actually lose a title I know where my money is - they have been written off too many times already. Incidentally, Braves followers have definitely had a roller-coaster start to the season; they are both the best offensive and the worst defensive team (OK, excluding the horrid D-Rays) in all of baseball. Leo Mazzone is quickly proving he was the most underpaid man in the game.

Can the Brewers win 118 games this season? No, but two games ago that is what their Pythagorean projection predicted. Now it is a far more sane 81 games! Despite the Brewers, Reds and 'Stros motoring to a 5-2 start, the Cubs will win the division with 101 victories. The Cardinals will have to settle for second spot while winning a very respectable 91 games, and our current leaders will be battling it out for the valueless 3rd spot.

The NL West was the sick man of baseball in 2005. Only the Padres had a winning record, and this season they are on course to storm home with a whopping 107 losses. What hope does the rest of the division have? Actually, and surprisingly, the NL West projects to return to winning ways in 2006. Both the Rockies and the D-Backs will cruise to over 100 wins, both securing playoff spots! The Giants will also post a respectable record and could be in a position to surge if and when Bonds comes out of his early-season slump.

So what have we learnt from this exercise? Not a whole heap, I'd suggest. But the next time you are watching baseball and your favourite TV analyst makes some preposterous comment based on a tiny sample size, just think back to this article and how many games the Rockies were projected to win after just one and a half weeks! Amen.

As a postscript, this article dovetails very nicely with an excellent piece which Sal wrote on Pythagorean projections and the Weibull distribution. You would almost think that we co-ordinate things at this blog!