Which hot and cold starts are for real?


We use Rany Jazayerli's method to regress early-season results and find which hot and cold starts are most believable.

May 1 is an important date for baseball fans. As the book closes on April, the Small Sample Size Brigade starts to recede, the weather in the Northeast becomes less hostile, and certain fan bases start panicking about their underperforming team. As a Red Sox fan, this is a tradition that dates back to my childhood, and one that I assume I will pass on to my children, like my father and his father before him.

Of course, the more rational baseball fan will realize that there are still plenty of games remaining for teams to regress to their true talent level, and Scott Lindholm's recent series showed that even the hottest starts don't necessarily guarantee a playoff berth. But how to balance the results from previous seasons with the handful of games already played?

When he was still writing for Baseball Prospectus, Rany Jazayerli developed a simple system to make rest-of-season predictions. The system is a weighted average of a team's results over the past few seasons and their results over the first few games. As the season progresses, more weight is assigned to this season's games, and less to the assumed "true talent level" derived from previous seasons.

The system consists of two equations. First, a team's projected winning percentage P is given as

P = .1557 + (.4517 * X1) + (.1401 * X2) + (.0968 * X3),

where X1 is a team's winning percentage from last season, X2 is a team's winning percentage from two years ago, and X3 is a team's winning percentage from three years ago. Jazayerli determined that seasons further back than three years had no predictive power for the current season.
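If you want to play along at home, here's what that looks like in Python (a minimal sketch; the function name is mine, and the example uses the Brewers' last three seasons from the table at the end of this piece):

    def preseason_projection(wp_1, wp_2, wp_3):
        """Projected winning percentage P from the last three seasons.

        wp_1, wp_2, wp_3: the team's winning percentages one, two, and
        three seasons ago. The constant term pulls everything toward .500.
        """
        return 0.1557 + 0.4517 * wp_1 + 0.1401 * wp_2 + 0.0968 * wp_3

    # Example: the 2014 Brewers, whose last three seasons were .457, .512,
    # and .593 (see the table at the bottom of this piece).
    print(round(preseason_projection(0.457, 0.512, 0.593), 3))  # 0.491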

If we think of the constant term as the weight given to regression towards .500, our projection is made up of the following components:

[Chart: the projection's component weights -- roughly 45 percent on last season, 14 percent on two seasons ago, 10 percent on three seasons ago, and the remaining 31 percent on regression towards .500.]

This estimates a team's record using previous seasons alone. To incorporate this season's results into a final prediction Y, we use the following formula.

Y = P + ((S - P) * (.0423 + (.0095 * G)))

Here, our team has a winning percentage of S in its first G games.
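And the in-season update, in the same spirit (again, just a sketch with names of my choosing):

    def rest_of_season_projection(p, s, g):
        """Blend the preseason projection P with the season to date.

        p: preseason projected winning percentage (P from the formula above)
        s: winning percentage through the first g games
        g: games played so far
        The weight on the current season grows as more games are played.
        """
        return p + (s - p) * (0.0423 + 0.0095 * g)

    # The 2014 Brewers again: P of about .491 from the previous sketch,
    # and a 20-8 (.714) start through 28 games.
    print(round(rest_of_season_projection(0.491, 20 / 28, 28), 3))  # ~0.560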

Now that we've established our method, let's see if it's still accurate by using some out-of-sample data. I compiled the standings from the past three seasons on May 1, June 1, and July 1, used Jazayerli's formula to estimate each team's final winning percentage, and compared the estimate to their final results. First, the mean error, in terms of wins:

Date Avg Delta
May 1 0.003120673
June 1 0.017173118
July 1 0.0208955

This means that, over the past three years, this method overestimates a team's performance by an average of three-thousandths of a win, though as we will see, there are plenty of missed calls. Since the means are all near zero, there is no obvious bias: Jazayerli's method is neither consistently overestimating nor underestimating team performance over the rest of the season.

The scatter plots below show the distribution of errors over time. We can see that, as the season progresses, the expected winning percentages fall more in line with the final results. The method explains a respectable 42 percent of the variation in final records as of May 1, a figure that rises to 55 percent as of June 1 and 69 percent as of July 1.

[Scatter plots: predicted vs. final winning percentage as of May 1, June 1, and July 1.]
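If you'd like to rerun the evaluation yourself, it boils down to something like the sketch below. The CSV and its column names are hypothetical stand-ins for the Baseball-Reference standings I compiled; the "variation explained" figures correspond to the squared correlation between the predicted and final winning percentages at each checkpoint.

    import pandas as pd

    # Hypothetical layout: one row per team per checkpoint date, with the
    # predicted and actual final winning percentages already filled in.
    df = pd.read_csv("projections.csv")  # columns: date, team, pred_wp, final_wp

    for date, group in df.groupby("date"):
        error = group["pred_wp"] - group["final_wp"]        # signed error
        r_squared = group["pred_wp"].corr(group["final_wp"]) ** 2
        print(date, round(error.mean(), 4), round(r_squared, 2))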

This method will never be perfect, of course. The following table shows the worst predictions at the end of the last three Aprils. We can see that, even though the system has no overall bias, it is still capable of producing large errors for individual teams. The teams on this list provide good insight into some of the problems with a regression-only approach: this system does not account for personnel moves, it does not try to predict breakout performances, and -- being regression-heavy -- it will tend to force very good and very bad teams back towards .500 regardless of actual talent.

Worst May 1 Predictions, 2011-2013

Year/Team Games May 1 WP Pred WP Final WP Gms +/- Pred
2011 Diamondbacks 27 .444 .446 .580 21.8
2011 Tigers 28 .429 .478 .586 17.5
2013 Indians 25 .480 .463 .568 17.1
2011 Brewers 27 .481 .490 .593 16.8
2012 A's 25 .480 .478 .580 16.5
2012 Reds 22 .500 .501 .599 15.9
2012 Nationals 23 .609 .510 .605 15.4
2012 Orioles 24 .625 .493 .574 13.2
2013 Pirates 28 .571 .503 .580 12.5
2012 Angels 24 .375 .483 .549 10.8
2012 Indians 21 .524 .489 .420 -11.1
2012 Astros 24 .417 .421 .340 -13.1
2013 Giants 28 .571 .554 .469 -13.8
2012 Rockies 23 .478 .484 .395 -14.4
2011 Twins 27 .333 .481 .389 -14.9
2012 Red Sox 23 .478 .524 .426 -15.9
2013 White Sox 26 .423 .487 .389 -15.9
2011 Rockies 26 .654 .552 .451 -16.4
2011 Marlins 26 .654 .548 .444 -16.9
2011 Astros 28 .393 .455 .346 -17.7

To better appreciate these pitfalls, consider the 2012 Red Sox, whose final results were 16 games worse than what was predicted on May 1. The chart below tracks their expected and actual winning percentage by date. We see that, even as the team went 12-19 in their first 31 games, Jazayerli's method still expected the team to play .500 ball based on the strength of their past few seasons. And they actually did, hovering right around the break-even point until early August. Of course, once August hit, half the team got injured, the other half got traded to the Dodgers, and the front office started trying to get Bobby Valentine to say his name backwards in the hopes of sending him back to his home planet.

[Chart: 2012 Red Sox expected and actual winning percentage by date.]

Next, let's look at last season's Pirates, a much happier example. On May 1, 2013, the model predicted just over 81 wins for the Pirates, 13 under their final total. Here, the graph shows a season-long skepticism that melted as the Pirates continued to win. When the team was just under .600 at the end of April, the model projected the Pirates to finish at .500. But as the team continued to win, the model gradually warmed to the team, until by the All-Star Break, it too estimated the Pirates as a .600 ballclub*.

[Chart: 2013 Pirates expected and actual winning percentage by date.]

* - We can also see from both of these graphs that, around mid-July, the model falls apart, as too much weight is placed on the difference between the projected and actual record: the coefficient on that difference, .0423 + (.0095 * G), approaches and then (around game 101) exceeds 1, so the model stops regressing the current record and starts chasing it. But if you're still hoping your team is a regression candidate in August, it's probably time to start thinking about next year.

Now that we trust the model, and understand its flaws, let's see what it thinks of this year's hot and cold starts. Here, the Delta column is the difference between a team's predicted winning percentage and its current winning percentage (as of this morning). Note that the projected record is rounded to the nearest game.

Team W L '14 WP '13 WP '12 WP '11 WP P Y Proj W-L Delta
MIL 20 8 .714 .457 .512 .593 .491 .560 91-71 -.154
ATL 17 9 .654 .593 .580 .549 .558 .586 95-67 -.068
OAK 18 10 .643 .593 .580 .457 .549 .578 94-68 -.065
DET 14 9 .609 .574 .543 .586 .548 .564 91-71 -.045
SFG 17 11 .607 .469 .580 .531 .500 .533 86-76 -.074
NYY 15 11 .577 .525 .586 .599 .533 .546 88-74 -.031
NYM 15 11 .577 .457 .457 .475 .472 .503 81-81 -.074
WSN 16 12 .571 .531 .605 .497 .528 .542 88-74 -.029
LAD 15 12 .556 .568 .531 .509 .536 .542 88-74 -.014
COL 16 13 .552 .457 .395 .451 .461 .490 79-83 -.062
KCR 14 12 .538 .531 .444 .438 .500 .511 83-79 -.027
TEX 15 13 .536 .558 .574 .593 .546 .543 88-74 .007
LAA 14 13 .519 .481 .549 .531 .501 .507 82-80 -.012
STL 15 14 .517 .599 .543 .556 .556 .544 88-74 .027
BAL 12 12 .500 .525 .574 .426 .514 .511 83-79 .011
MIN 12 12 .500 .407 .407 .389 .434 .452 73-89 -.048
PHI 13 13 .500 .451 .500 .630 .490 .493 80-82 -.007
CHW 14 15 .483 .389 .525 .488 .452 .462 75-87 -.021
BOS 13 14 .481 .599 .426 .556 .540 .522 85-77 .041
MIA 13 14 .481 .383 .426 .444 .431 .446 72-90 -.035
SDP 13 16 .448 .469 .469 .438 .476 .467 76-86 .019
TOR 12 15 .444 .457 .451 .500 .474 .465 75-87 .021
CIN 12 15 .444 .556 .599 .488 .538 .510 83-79 .066
SEA 11 14 .440 .438 .463 .414 .458 .453 73-89 .013
TBR 11 16 .407 .564 .556 .562 .543 .502 81-81 .095
CLE 11 17 .393 .568 .420 .494 .519 .480 78-84 .087
PIT 10 16 .385 .580 .488 .444 .529 .487 79-83 .102
CHC 9 17 .346 .407 .377 .438 .435 .409 66-96 .063
HOU 9 19 .321 .315 .340 .346 .379 .361 58-104 .040
ARI 9 22 .290 .500 .500 .580 .508 .434 70-92 .144
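Every row in that table comes from the same two formulas; here's a minimal sketch for one club (the helper is mine, and the inputs are straight from the Braves' row above):

    def project_team(wins, losses, wp_last3):
        """Return P, Y, a projected final record, and the Delta column from
        a team's current record and its last three seasons."""
        wp_1, wp_2, wp_3 = wp_last3
        p = 0.1557 + 0.4517 * wp_1 + 0.1401 * wp_2 + 0.0968 * wp_3
        g = wins + losses
        s = wins / g
        y = p + (s - p) * (0.0423 + 0.0095 * g)
        return p, y, round(y * 162), y - s

    # Atlanta's row: 17-9 so far; .593, .580, and .549 over the last three seasons.
    p, y, w, delta = project_team(17, 9, (0.593, 0.580, 0.549))
    print(round(p, 3), round(y, 3), f"{w}-{162 - w}", round(delta, 3))
    # 0.558 0.586 95-67 -0.068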

Not surprisingly, the model expects the teams off to incredibly slow and incredibly fast starts to regress back towards .500, although even after the regression, the model still expects the Braves to win around 95 games and the Diamondbacks to lose over 90.

. . .

All statistics courtesy of Baseball-Reference.

Bryan Cole is a featured writer at Beyond the Box Score and is just now thawing out from watching a game at Wrigley Field last week. You can follow him on Twitter at @Doctor_Bryan.
