Just after Easter, the Atlanta Braves won two games of a three-game set against the Miami Marlins at Turner Field, led by strong performances from Alex Wood and Aaron Harang (14 combined IP, 2 ER, 22 K). Less than a week later, Braves pitching was bruised to the tune of nearly eight runs per game, and Wood and Harang were victimized for 16 ER and 20 hits in fewer than 10 innings of work.
After the Braves left Miami, Carroll Rogers of the Atlanta Journal-Constitution raised the possibility that the Marlins were getting an unfair home-field advantage by stealing signs. Fredi Gonzalez said in the article that he changed signs five times during one game, and had his charges scouring the bleachers for evidence of cheating. Rogers added,
Gonzalez said he wondered if [catcher Evan] Gattis was giving away signs, but then figured that wouldn't account for the disparity in the Marlins home/road splits in general. Entering play Thursday, the Marlins were hitting .307 while scoring 6.3 runs per game at Marlins Park. They were hitting .215 with 2.7 runs per game on the road.
At the time of the article, the Marlins had played 16 games at home and a mere 12 games on the road. For an individual player, we know that 12 games is far too small a sample to judge that player's performance reliably. So when do team statistics stabilize?
Russell Carleton at Baseball Prospectus has done the heavy lifting on this topic, using split-half reliability to show when a player's true talent level rises above the noise level in a number of hitting and pitching rate statistics. For example, Carleton showed that a player's batting average (BA) only stabilizes towards his true talent level after 910 at bats (AB).
But a team can get that many AB in a month. Intuitively, if a player can suffer through a bad month at the plate, why can't a team? The Marlins may have approached the Mendoza line on the road in April, but that includes a 10-for-53 stretch by Giancarlo Stanton. Odds are, they're not as good as they looked in Miami to open the season, and they're not as bad as they looked away from Miami either.
Carleton's articles depended on the Kuder-Richardson reliability formula to determine when a player's BA became a reliable indicator of his true talent level. It's not clear to us, however, how to adapt this method for a situation where Adeiny Hechavarria hits in front of Stanton. Players with vastly different true talent levels will have different variances and different numbers of AB. We decided instead to approximate the Marlins' performance by a very rough Monte Carlo simulation.
A player with a true-talent BA of b can be modeled as a binomial random variable: each AB is a hit with probability b, so after n AB we expect this player to have bn hits. In programs like Excel, this can be easily simulated by selecting a uniformly distributed random number using the rand() function. If the random number is less than b, we count that as a hit; if not, we count that as an out. This can be repeated any number of times. If different values of b are selected, we can simulate players with different true-talent BA and create a lineup.
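The same draw-and-compare step can be sketched in Python rather than Excel. This is only an illustration of the idea described above, not the spreadsheet actually used for this article; the function name `simulate_hits` and the seed are made up for the example.

```python
import random

def simulate_hits(true_ba, at_bats, rng):
    """Count hits over `at_bats` independent trials with hit probability `true_ba`."""
    hits = 0
    for _ in range(at_bats):
        # rng.random() is uniform on [0, 1); a draw below true_ba counts as a hit,
        # anything else counts as an out.
        if rng.random() < true_ba:
            hits += 1
    return hits

rng = random.Random(2014)  # arbitrary fixed seed so the run is repeatable
hits = simulate_hits(0.246, 910, rng)
print(hits, round(hits / 910, 3))
```

Calling `simulate_hits` with a different `true_ba` for each lineup spot gives the simulated players with different true-talent BA.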
For this article, we made a rough approximation of the Marlins' lineup, including their platoons, as shown in the table below. The lineups and playing time were based on how the Marlins have actually used these players so far in 2014. Each player's career BA was assumed to be their true talent level.
| POS | STARTER | BA | PLAYING TIME | BACKUP | BA | PLAYING TIME |
|---|---|---|---|---|---|---|
| C | Jarrod Saltalamacchia | .246 | 75% | Jeff Mathis | .195 | 25% |
| 2B | Derek Dietrich | .214 | 67% | Jeff Baker | .267 | 33% |
Note that the playing times were assumed to be independent random variables; that is, choosing Jeff Baker to play second had no effect on whether Jeff Mathis played catcher. We then created four different lineups, one for each combination of players, as detailed below.
| No. | Lineup #1 (P = 50%) | POS | Lineup #2 (P = 25%) | POS | Lineup #3 (P = 16.7%) | POS | Lineup #4 (P = 8.3%) | POS |
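The four lineup probabilities follow from treating the two platoons as independent: each lineup's probability is the product of the playing-time shares involved. A short sketch, assuming the catcher and second-base platoons in the table are the only ones that vary, and reading 67% and 33% as 2/3 and 1/3:

```python
from itertools import product

# Playing-time shares from the platoon table (2/3 and 1/3 stand in for 67% and 33%).
catcher = {"Saltalamacchia": 0.75, "Mathis": 0.25}
second_base = {"Dietrich": 2 / 3, "Baker": 1 / 3}

# Independence means each combination's probability is a simple product.
lineup_probs = {
    (c, s): catcher[c] * second_base[s]
    for c, s in product(catcher, second_base)
}

for pair, p in sorted(lineup_probs.items(), key=lambda kv: -kv[1]):
    print(pair, f"{p:.1%}")
```

Under those assumptions the products come out to 50%, 25%, 16.7% and 8.3%, matching the lineup headers.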
For each game, we randomly selected one of the four lineups, and drew random numbers to simulate AB until 27 "outs" were recorded. At this point, the game was completed, a new lineup was selected, the batting order was reset, and we started again. In total, we completed 25,000 "seasons" of 300 games each. (The extra games were selected to ensure we would have enough AB for our statistic to stabilize, since Carleton's work showed that individual BA needed more than one season to stabilize.) These boxplots show the simulated BA and number of AB each player accumulated across all 162-game samples.
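The game loop described here can be sketched roughly as follows. This is a hedged illustration, not the code behind the article: the function names are invented, the batting order is arbitrary, and `OTHER_BA = 0.250` is a placeholder for the seven lineup spots whose values are not listed in the table.

```python
import random

def simulate_game(lineup, rng, outs_per_game=27):
    """Cycle through the batting order until 27 outs; return (hits, at_bats)."""
    hits = at_bats = outs = 0
    slot = 0  # the batting order resets at the start of every game
    while outs < outs_per_game:
        at_bats += 1
        if rng.random() < lineup[slot]:
            hits += 1
        else:
            outs += 1
        slot = (slot + 1) % len(lineup)
    return hits, at_bats

def simulate_season(lineups, weights, games, rng):
    """Return the cumulative team BA after each game of one simulated season."""
    hits = at_bats = 0
    running_ba = []
    for _ in range(games):
        # Pick one of the lineups at random, weighted by its probability.
        lineup = rng.choices(lineups, weights=weights, k=1)[0]
        h, ab = simulate_game(lineup, rng)
        hits += h
        at_bats += ab
        running_ba.append(hits / at_bats)
    return running_ba

OTHER_BA = 0.250  # placeholder, NOT a real Marlins figure

def make_lineup(catcher_ba, second_base_ba):
    return [catcher_ba, second_base_ba] + [OTHER_BA] * 7

lineups = [make_lineup(0.246, 0.214), make_lineup(0.246, 0.267),
           make_lineup(0.195, 0.214), make_lineup(0.195, 0.267)]
weights = [0.50, 0.25, 1 / 6, 1 / 12]

rng = random.Random(1)  # arbitrary seed
season = simulate_season(lineups, weights, 300, rng)
print(round(season[15], 3), round(season[-1], 3))  # BA after 16 and 300 games
```

Repeating `simulate_season` many times (the article used 25,000 seasons) and collecting the running BA at each game number gives the spread of team BA by games played.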
But how will we know when the team average has stabilized? First, we need to estimate the team's true average. This can be done by multiplying each player's playing time by his BA, summing across all nine lineup spots, and dividing by nine:
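$$\mathrm{BA}_{\text{Marlins}} = \frac{(\%\ \text{playing time}_{\text{pitchers}})(\mathrm{BA}_{\text{pitchers}}) + (\%\ \text{playing time}_{\text{Saltalamacchia}})(\mathrm{BA}_{\text{Saltalamacchia}}) + \cdots + (\%\ \text{playing time}_{\text{Stanton}})(\mathrm{BA}_{\text{Stanton}})}{9}$$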
The "true" average by this method is .243.
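As a small worked illustration of that weighting, here is the per-position piece of the calculation for the two platoons listed in the table. The helper names are hypothetical, and since only those two rows are shown, the sketch does not attempt to reproduce the full nine-spot figure.

```python
def position_ba(platoon):
    """Expected BA at one lineup spot: sum of (playing-time share x career BA)."""
    return sum(share * ba for share, ba in platoon)

def team_ba(platoons):
    """Average the per-spot figures with equal weight (dividing by nine for a full lineup)."""
    return sum(position_ba(p) for p in platoons) / len(platoons)

# (share, career BA) pairs for the two platoons in the table.
catcher = [(0.75, 0.246), (0.25, 0.195)]
second_base = [(0.67, 0.214), (0.33, 0.267)]

print(round(position_ba(catcher), 3))
print(round(position_ba(second_base), 3))
```

Passing all nine lineup spots to `team_ba` would carry out the division by nine in the formula.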
Now, we know that an individual's BA stabilizes after 910 AB. Here we have the results of our simulation for Saltalamacchia by AB, since his average of .246 is closest to the team's average of .243. We can see that, at 910 AB, Saltalamacchia's extreme values are around 50 points away from his true-talent average, and 68 percent of his simulation results are in a 29-point range around his true average.
Here we have the same result for the entire Marlins team. Surprisingly, we see that the team BA reaches the same spread as Saltalamacchia after a mere 23 games! If we assume an average of 40 AB per game, the Marlins will have collected around 920 AB after 23 games, right at the mark Carleton found significant for individuals. At this point, the extreme simulated BA are 107 points away from the true team average, and 68 percent of the results are in a 29-point range around the true average.
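As a rough cross-check on those 68 percent ranges, a binomial standard error can be computed directly. This is a back-of-the-envelope sketch and not part of the simulation: it treats the whole team as a single hitter with the team's true average, which ignores the mix of different hitters and lineups, and the function name is invented for the example.

```python
import math

def one_sigma_range_points(true_ba, at_bats):
    """Width, in BA points, of the band one standard error either side of true_ba."""
    se = math.sqrt(true_ba * (1 - true_ba) / at_bats)
    return 2 * se * 1000

salty = one_sigma_range_points(0.246, 910)     # Saltalamacchia at 910 AB
team = one_sigma_range_points(0.243, 23 * 40)  # team after 23 games at ~40 AB each
print(round(salty, 1), round(team, 1))
```

Both widths land a little under 29 points, in line with the 29-point ranges seen in the simulation.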
Note that we did not estimate this statistic in terms of AB. This is because the best-hitting team in our sample and the worst-hitting team completed drastically different numbers of AB, producing discontinuities and a noisy sawtooth pattern.
You may have noticed that, after 16 games, our team BA has a maximum value of .315, not far above the .307 Miami hit in their first 16 home games. This does not suggest the Marlins are actually cheating. We have made so many simplifying assumptions that our simulation is in no way representative of the Marlins' expected 2014 performance. Here are just a few of the huge things we have hand-waved away to simplify our approach:
- Home-field advantage
- Pitcher quality
- Pinch hitters
- Baserunning mistakes
- Double plays
- Home teams leading after 8-1/2 innings don't bat in the bottom of the ninth
And this is not an exhaustive list; the reader is invited to pick out their own favorite overlooked wrinkle*.
* - But, y'know, keep it to yourself.
The purpose of this was just to estimate how quickly this specific statistic stabilized. Future research can focus on estimating how quickly other statistics become reliable indicators of a team's performance using a similar Monte Carlo approach, or it can use a more mathematically rigorous approach to improve on these estimates.
. . .
All statistics courtesy of Baseball-Reference. Special thanks to Russell Carleton for answering my questions about his methodology.
Bryan Cole thinks a team that wears bright orange as regularly as the Marlins can never "look good at home." He is a featured writer for Beyond the Box Score. You can follow him on Twitter at @Doctor_Bryan.