April: the opening month of baseball and a clean slate for each player to start the season anew. With everyone being scrutinized so closely, it's hard to remember that there will be more than 130 games remaining on the schedule once the month concludes. What's the value of an April stat line? A .150 AVG or a 36 wRC+ might not look so good now, but there is a good chance that things will even out by the end of the year. Same goes for a hot start.
While perusing player profiles, I couldn't help but think: "Is John Buck's .448 wOBA a mirage or a snapshot of a great season?" Well, this poses an interesting question:
Just how much predictive value is in a player's April performance?
Are those who get off to hot starts likely to exceed the expectations set out for them before the season began? Are those who get off to slow starts inversely affected?
It is important to remember that we cannot expect John Buck to keep hitting to a .448 wOBA, but it is reasonable to think that he may land somewhere between his April production and his ZiPS projection of .310. The answer is often somewhere in the middle. But let's not leave it at that.
Let's first look at the relationship between April wOBA and end-of-season wOBA:
While the relationship may look strongly positive, the R^2 is only 6%, so April wOBA explains very little of the variance in final wOBA.
In theory, a hot start could raise the probability of a strong season, but too much happens over the rest of the year for April stats to be a significant predictor. For instance, d(wOBA), the differential wOBA across all other months, had an R^2 of 80%. That is not saying a lot, since the bulk of the sample will naturally account for most of the variance in the end totals.
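For anyone who wants to check an R^2 figure like the ones above, here is a minimal sketch of the calculation as the squared Pearson correlation. The function name `r_squared` and the sample numbers are my own illustration, not the data behind this article.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the sequences has no variance")
    return (sxy * sxy) / (sxx * syy)


# Made-up values for illustration only, not real player data.
april_woba = [0.250, 0.310, 0.400, 0.290, 0.350]
final_woba = [0.320, 0.300, 0.340, 0.335, 0.310]
print(round(r_squared(april_woba, final_woba), 3))
```

An R^2 near 0 means April tells you almost nothing about the final number; an R^2 near 1 means it tells you almost everything.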
So if we can't trust early wOBA to hold up, what can we actually say with confidence?
Just because wOBA, and most everything else early on, has little predictive value, that does not mean it is valueless.
What if we project the different potential outcomes for one-thousand different simulated seasons?
Using a statistical method called bootstrapping, we can draw one thousand different outcomes for each of PA, AB, HR, 1B, 2B, 3B, BB, IBB, SF and HBP, which are all the wOBA inputs. From those draws we can calculate the percentiles of the outcomes, mapping the range from best to worst and the most probable destination.
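To make the bootstrapping idea concrete, here is a rough sketch of how one thousand simulated seasons could be drawn from a single counting line. It is not the exact code behind this article. The linear weights are assumed values (roughly the 2013 FanGraphs ones), every plate appearance is counted in the denominator (so IBB and sacrifices are ignored), and the sample line, including its walk and out totals, is invented.

```python
import random

# Assumed linear weights, roughly the 2013 FanGraphs values; swap in your own.
WEIGHTS = {"BB": 0.690, "HBP": 0.722, "1B": 0.888,
           "2B": 1.271, "3B": 1.616, "HR": 2.101, "OUT": 0.0}


def woba(counts):
    """Simplified wOBA: weighted events divided by total plate appearances."""
    pa = sum(counts.values())
    return sum(WEIGHTS[event] * n for event, n in counts.items()) / pa


def bootstrap_woba(counts, n_sims=1000, seed=0):
    """Resample the line's plate appearances with replacement, n_sims times.

    Returns the simulated wOBA values sorted from worst to best.
    """
    rng = random.Random(seed)
    events = list(counts)
    freqs = [counts[event] for event in events]
    pa = sum(freqs)
    sims = []
    for _ in range(n_sims):
        # One simulated season: pa draws from the line's own event mix.
        season = rng.choices(events, weights=freqs, k=pa)
        sims.append(sum(WEIGHTS[event] for event in season) / pa)
    return sorted(sims)


def percentile(sorted_vals, q):
    """Value at fraction q of the sorted list (q=0 is the min, q=1 the max)."""
    return sorted_vals[round(q * (len(sorted_vals) - 1))]


# Invented counting line for illustration only (383 PA in total).
line = {"BB": 30, "HBP": 3, "1B": 49, "2B": 15, "3B": 1, "HR": 15, "OUT": 270}
sims = bootstrap_woba(line)
for q in (0, 0.25, 0.5, 0.75, 1.0):
    print(f"{q:>4.0%}  {percentile(sims, q):.3f}")
```

The 0% and 100% rows are simply the worst and best of the simulated seasons, which is why they sit so far from the middle of the range.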
For our baseline sample we take each player's ZiPS projection, because it is independent of his hot start; this is our control. We then find the projected WAR total at each percentile.
Then we add the player's April statistics to the sample and find the percentiles of the WAR projections again. This is where we find the value of a hot start: just how much does a surprising start add to the value the player will probably provide when it is all said and done?
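The before-and-after comparison can be sketched in a few lines. This assumes the April numbers are simply added to the projected counting line event by event, which is my reading of the method rather than a confirmed detail; the helper names `merge_lines` and `percentile_table` and all the numbers are hypothetical.

```python
def merge_lines(projection, april):
    """Add April counting stats to a projected line, event by event."""
    events = sorted(set(projection) | set(april))
    return {e: projection.get(e, 0) + april.get(e, 0) for e in events}


def percentile_table(before, after, qs=(0, 0.25, 0.5, 0.75, 1.0)):
    """Rows of (percentile, before, after, diff) from two lists of simulated values."""
    b, a = sorted(before), sorted(after)
    rows = []
    for q in qs:
        value_before = b[round(q * (len(b) - 1))]
        value_after = a[round(q * (len(a) - 1))]
        rows.append((q, value_before, value_after, value_after - value_before))
    return rows


# Invented numbers for illustration only.
combined = merge_lines({"HR": 15, "1B": 49}, {"HR": 6, "BB": 4})
print(combined)
for row in percentile_table([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 7.0]):
    print(row)
```

In practice the two input lists would be the simulated WAR totals from the ZiPS-only line and from the merged line.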
How much more likely is John Buck to post a three-WAR season, and Bryce Harper a ten-WAR season, based solely on what we know now rather than what we knew before the season started?
Well, let's jump in and find out:
Simulation and Prediction
John Buck came into the season with ZiPS projecting a measly .310 wOBA with a wRC+ of 96. Using his ZiPS projections, we have created a matrix of the counting statistics that act as the inputs of wOBA. This way we are mapping the range of outcomes possible for Buck based on the opinion of the baseball community. "wOBA before" represents the ZiPS bootstrapping totals in the chart below. We then input Buck's current April counting statistics into the matrix, introducing the new range of outcomes given his surprisingly good start. Doing so creates a balance between actual and expected that serves as the basis for our thousand-season simulation.
[NOTE: our WAR calculation is a crude one: we convert wOBA into wRAA and wRAR, then factor in replacement level and a positional adjustment. This is solely an offensive evaluation.]
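A crude conversion along those lines might look like the sketch below. Every constant in it is an assumption on my part (league wOBA of .314, wOBA scale of 1.277, 20 replacement runs per 600 PA, 10 runs per win, and whatever positional adjustment you pass in), so do not expect it to reproduce the tables exactly.

```python
def offensive_war(woba, pa, games, pos_adj_per_162=0.0,
                  lg_woba=0.314, woba_scale=1.277,
                  repl_runs_per_600=20.0, runs_per_win=10.0):
    """Crude offense-only WAR from wOBA. All default constants are assumed values."""
    # Runs above average from hitting.
    wraa = (woba - lg_woba) / woba_scale * pa
    # Credit for playing time relative to a replacement-level hitter.
    replacement = repl_runs_per_600 * pa / 600
    # Positional adjustment, prorated by games played.
    positional = pos_adj_per_162 * games / 162
    return (wraa + replacement + positional) / runs_per_win


# A league-average hitter over 600 PA with no positional adjustment.
print(round(offensive_war(0.314, 600, 162), 2))
```

Because defense and baserunning are left out, two players with the same wOBA and playing time only differ here through the positional adjustment.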
JOHN BUCK:
ZiPS:
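Name | G | PA | AB | H | 2B | 3B | HR | HBP | SB | CS | wOBA | WAR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
John Buck | 102 | 383 | 343 | 80 | 15 | 1 | 15 | 3 | 0 | 0 | 0.311 | 1.7 |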
SIMULATION w/ APRIL DATA:
Percentile | wOBA before | WAR before | wOBA after | WAR after | DIFF (WAR) |
---|---|---|---|---|---|
0% | 0.248 | 0.34 | 0.26 | 0.68 | 0.34 |
25% | 0.302 | 1.91 | 0.316 | 2.55 | 0.64 |
50% | 0.317 | 2.35 | 0.331 | 3.05 | 0.70 |
75% | 0.332 | 2.78 | 0.348 | 3.62 | 0.83 |
100% | 0.391 | 4.50 | 0.406 | 5.55 | 1.05 |
At the median, John Buck's hot April raises his expected end-of-season WAR by roughly 0.7 wins. The best-case scenario for Buck is about 5.5 WAR, but keep in mind that the 100th percentile is rarely reached; the most realistic outcome is the 50th percentile, at about 3 WAR.
JUSTIN UPTON:
ZiPS:
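Name | G | PA | AB | H | 2B | 3B | HR | HBP | SB | CS | wOBA | WAR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Justin Upton | 140 | 592 | 519 | 138 | 27 | 3 | 24 | 7 | 17 | 7 | 0.353 | 3.3 |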
SIMULATION w/ APRIL DATA:
Percentile | wOBA before | WAR before | wOBA after | WAR after | DIFF (WAR) |
---|---|---|---|---|---|
0% | 0.282 | 0.01 | 0.295 | 0.74 | 0.73 |
25% | 0.343 | 2.76 | 0.357 | 3.83 | 1.07 |
50% | 0.359 | 3.48 | 0.375 | 4.73 | 1.25 |
75% | 0.376 | 4.25 | 0.393 | 5.63 | 1.38 |
100% | 0.451 | 7.62 | 0.458 | 8.87 | 1.24 |
No one expected Justin Upton to get off to such a hot start; he has arguably been the best player in baseball so far this season. The range of his ZiPS projections seems modest compared to what he is expected to do now based on his recent production. On average, Upton comes out as around a 4.40 WAR player with his new April data in the mix, and at most percentiles he appears to have added at least a win to his preseason projection.
CHRIS DAVIS:
ZiPS:
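Name | G | PA | AB | H | 2B | 3B | HR | HBP | SB | CS | wOBA | WAR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Chris Davis | 128 | 507 | 463 | 121 | 24 | 1 | 25 | 5 | 2 | 2 | 0.341 | 1.1 |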
SIMULATION w/ APRIL DATA:
Percentile | wOBA before | WAR before | wOBA after | WAR after | DIFF (WAR) |
---|---|---|---|---|---|
0% | 0.279 | -0.66 | 0.288 | -0.26 | 0.401 |
25% | 0.333 | 1.43 | 0.349 | 2.39 | 0.956 |
50% | 0.349 | 2.05 | 0.367 | 3.18 | 1.126 |
75% | 0.367 | 2.75 | 0.384 | 3.92 | 1.169 |
100% | 0.439 | 5.53 | 0.448 | 6.70 | 1.172 |
Chris Davis arguably had the best first four games anyone has had in the history of the game. That alone is a good sign that he can blow his original ZiPS projections out of the water. Roughly a full win is expected to be added on top of his original range of values.
JED LOWRIE:
ZiPS:
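Name | G | PA | AB | H | 2B | 3B | HR | HBP | SB | CS | wOBA | WAR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Jed Lowrie | 82 | 317 | 280 | 75 | 18 | 1 | 10 | 2 | 2 | 1 | 0.346 | 1.9 |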
SIMULATION w/ APRIL DATA:
Percentile | wOBA before | WAR before | wOBA after | WAR after | DIFF (WAR) |
---|---|---|---|---|---|
0% | 0.262 | 0.71 | 0.314 | 1.19 | 0.48 |
25% | 0.335 | 2.01 | 0.364 | 2.67 | 0.66 |
50% | 0.351 | 2.39 | 0.379 | 3.10 | 0.71 |
75% | 0.368 | 2.83 | 0.394 | 3.57 | 0.74 |
100% | 0.448 | 4.55 | 0.452 | 5.27 | 0.72 |
Jed Lowrie looks like a steal for the Athletics early on. The only problem with Lowrie is his inability to stay on the field. It would be great to assume he stays healthy all year, but these ranges were computed with Lowrie's expected PA totals of 317 and then 390. In the end, Lowrie seems to be on track for at least a 3 WAR season.
CARL CRAWFORD:
ZiPS:
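Name | G | PA | AB | H | 2B | 3B | HR | HBP | SB | CS | wOBA | WAR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Carl Crawford | 96 | 399 | 369 | 102 | 19 | 6 | 10 | 3 | 19 | 6 | 0.331 | 2.2 |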
SIMULATION w/ APRIL DATA:
Percentile | wOBA before | WAR before | wOBA after | WAR after | DIFF (WAR) |
---|---|---|---|---|---|
0% | 0.274 | 0.60 | 0.284 | 0.62 | 0.02 |
25% | 0.323 | 2.08 | 0.335 | 2.32 | 0.24 |
50% | 0.339 | 2.59 | 0.349 | 2.91 | 0.32 |
75% | 0.355 | 3.07 | 0.364 | 3.46 | 0.39 |
100% | 0.415 | 4.90 | 0.433 | 5.57 | 0.67 |
Coming into the season, no one knew what to expect from Carl Crawford. His play on the field has been encouraging to say the least, prompting scouts to share that his times to first are as fast as in his early years with the Rays. Coupled with his hot start, we should expect a nostalgic performance from Crawford this year. The uptick from projection to simulation is minuscule, but the simulation thinks that Crawford is around a 3 WAR player in 2013.
BRYCE HARPER:
ZiPS:
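Name | G | PA | AB | H | 2B | 3B | HR | HBP | SB | CS | wOBA | WAR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Bryce Harper | 136 | 575 | 512 | 144 | 26 | 6 | 26 | 2 | 18 | 7 | 0.370 | 4.6 |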
SIMULATION w/ APRIL DATA:
Percentile | wOBA before | WAR before | wOBA after | WAR after | DIFF (WAR) |
---|---|---|---|---|---|
0% | 0.288 | 1.24 | 0.299 | 1.90 | 0.66 |
25% | 0.355 | 4.19 | 0.355 | 4.62 | 0.43 |
50% | 0.371 | 4.90 | 0.373 | 5.48 | 0.58 |
75% | 0.386 | 5.53 | 0.390 | 6.29 | 0.76 |
100% | 0.454 | 8.51 | 0.466 | 10.01 | 1.50 |
Oh boy. Who knows, could we be seeing Mike Trout-like rookie production in Harper's sophomore year? While Harper would not like that comparison, Mike Trout will remain the benchmark for 10 WAR players for a long time. Harper's best-case scenario is about 10 WAR, but our simulation predicts he will most likely land in the 5-6 WAR range. Pretty good for a 20-year-old. Either way, there is no mistaking that Harper is in for a monster 2013.
Big thanks to Stephen Loftus for his Bootstrapping expertise and Fangraphs for their wonderful ZiPS data.
You can contact Max Weinstein at @MaxWeinstein21