Foolish Consistencies
No matter what anybody tells you, baseball is a game that is based in numbers. The only thing that separates the performance analysis community from mainstream analysis is which numbers we look at. If you're James Click, you're poring over VORP reports and WXRL spreadsheets. If you're the average fan, you're looking at AVG and HR. And if you're John Kruk, you're still paying attention to pitcher wins and losses (has Randy Johnson gotten to 30 wins yet?).
The only problem baseball has with numbers is that there are too many of them. In order to get a better handle on things, analysts often talk in terms of gross averages; sometimes it is the only way to illustrate a point without burying oneself in information.
And yet, as baseball fans, we often talk about consistency and balance, as if we recognize, even if dimly, that there is some sort of value in predictability. This manifests itself in a lot of ways, but there's one idea in particular that I am going to investigate today.
Let me ask you a question: would you rather have an offense of one premier offensive player and a bunch of average hitters or a deep offense of above-average hitters?
I'll be more specific. You can choose between these two lineups:
   STAR AND SCRUBS        EVEN STEVENS
1  Scrub  270/333/433     Steven  277/340/452
2  Scrub  270/333/433     Steven  277/340/452
3  Star   333/400/600     Steven  277/340/452
4  Scrub  270/333/433     Steven  277/340/452
5  Scrub  270/333/433     Steven  277/340/452
6  Scrub  270/333/433     Steven  277/340/452
7  Scrub  270/333/433     Steven  277/340/452
8  Scrub  270/333/433     Steven  277/340/452
9  Scrub  270/333/433     Steven  277/340/452
Team averages (both lineups): 277/340/452
(Stats are AVG/OBP/SLG)
Star and Scrubs has Vladimir Guerrero or Manny Ramirez batting third in a lineup filled with otherwise league-average hitters (league average defined as 2004 AL). Even Stevens hitters are all exactly the same - no superstars, but all solid. The mean AVG, OBP, and SLG are the same for each team. Which team will score more runs? Which team will win more games?
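As a quick sanity check on the claim that both lineups share the same mean line, here is a small sketch. It assumes an unweighted average across the nine slots (ignoring that the top of the order bats more often), and the names `lineup_mean`, `SCRUB`, `STAR`, and `STEVEN` are just labels made up for this example.

```python
# Slash lines from the table above, written as integers (AVG/OBP/SLG x 1000).
SCRUB = (270, 333, 433)
STAR = (333, 400, 600)
STEVEN = (277, 340, 452)


def lineup_mean(lineup):
    """Unweighted mean of each slash stat across the lineup slots, rounded."""
    n = len(lineup)
    return tuple(round(sum(player[i] for player in lineup) / n) for i in range(3))


# Star bats third; everyone else is a league-average scrub.
star_and_scrubs = [SCRUB, SCRUB, STAR] + [SCRUB] * 6
even_stevens = [STEVEN] * 9

print(lineup_mean(star_and_scrubs))
print(lineup_mean(even_stevens))
```

Both calls should land on the same rounded line, which is the whole point of the setup: any difference in results has to come from how the production is spread across the lineup, not from how much of it there is.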
Which offense would you want?
To answer this question, I've employed my lineup simulator, described in this article (by the way, my program needs a cool name, so please leave suggestions in the comments). In a nutshell, it is a decent predictor of offensive output that does not take into account secondary effects such as baserunning, opposing pitcher, matchups, and "clutchness." I'm not totally sure, but based on what I've seen, my program tends to underestimate offense.
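For readers who want to tinker, here is a minimal sketch of what a simulator of this general kind might look like. It is not the program used for this article. The conversion from AVG/OBP/SLG to event probabilities rests on assumptions made up for the example (OBP counts only hits and walks, no triples, two doubles for every homer), and the baserunning is deliberately crude, so it will likely run low on offense. `event_probs` and `simulate_game` are hypothetical names.

```python
import random


def event_probs(avg, obp, slg):
    """Turn an AVG/OBP/SLG line into per-plate-appearance event probabilities.

    Assumptions (not from the article): OBP counts only hits and walks,
    there are no triples, and doubles outnumber homers 2 to 1.
    """
    walk = (obp - avg) / (1.0 - avg)   # walks per plate appearance
    ab = 1.0 - walk                    # at-bats per plate appearance
    hr = (slg - avg) / 5.0 * ab        # ISO = 1*doubles + 3*HR = 5*HR per AB
    double = 2.0 * hr
    single = avg * ab - double - hr
    out = 1.0 - walk - single - double - hr
    return {"out": out, "walk": walk, "single": single, "double": double, "hr": hr}


def simulate_game(lineup, rng, innings=9):
    """Play one game and return runs scored.

    Runners advance exactly as many bases as the batter; a walk only forces
    runners. No steals, errors, double plays, or sacrifices are modeled.
    """
    probs = [event_probs(*line) for line in lineup]
    runs, batter = 0, 0
    for _ in range(innings):
        outs, bases = 0, [False, False, False]   # first, second, third
        while outs < 3:
            p = probs[batter % len(probs)]
            batter += 1
            event = rng.choices(list(p), weights=list(p.values()))[0]
            if event == "out":
                outs += 1
            elif event == "walk":
                if bases[0] and bases[1] and bases[2]:
                    runs += 1                    # bases loaded: force in a run
                elif bases[0] and bases[1]:
                    bases[2] = True
                elif bases[0]:
                    bases[1] = True
                bases[0] = True
            else:
                n = {"single": 1, "double": 2, "hr": 4}[event]
                new = [False, False, False]
                for i, occupied in enumerate(bases):
                    if occupied:
                        if i + n >= 3:
                            runs += 1            # runner crosses the plate
                        else:
                            new[i + n] = True
                if n >= 4:
                    runs += 1                    # batter scores on a homer
                else:
                    new[n - 1] = True
                bases = new
    return runs


rng = random.Random(0)
scrub, star = (0.270, 0.333, 0.433), (0.333, 0.400, 0.600)
lineup = [scrub, scrub, star] + [scrub] * 6
games = 2000
avg_runs = sum(simulate_game(lineup, rng) for _ in range(games)) / games
print(avg_runs)
```

Swapping in nine copies of `(0.277, 0.340, 0.452)` for `lineup` gives the Even Stevens side of the comparison. With only 2,000 games the averages will bounce around, so bump `games` up before reading anything into a small gap.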
I used the program to simulate 50,000 games and distilled the results into several forms. Let's see what my computer spit out:
STAR AND SCRUBS: 5.043 runs per game average
EVEN STEVENS: 5.042 runs per game average
For reference, a team of all scrubs scored an average of 4.713 runs per game.
Hmmmm...not surprising. Two teams with exactly the same AVG, OBP, and SLG will, on average, score the same number of runs. I can hear Marc Normandin now: "You said you would go beyond averages. You're off the team!" Don't fire me just yet, Marc. Let's look at the run scoring distribution for these two teams:


(I was naughty and forgot to label the axes. The x-axis is the number of runs scored in a game and the y-axis is the frequency of that event.)
You can see that the Star and Scrubs team has almost the exact same run distribution as the Even Stevens team. It would appear that having an elite offensive player does not change the distribution of run-scoring in games. This surprises me, as I would have guessed that a premier offensive player having a good game can help you pile up runs very quickly and avoid shutouts. But the balanced attack is just as effective at putting up a crooked number or avoiding a shutout. The Even Stevens are as consistent as the Star and Scrubs: both score the same number of runs on average, and both distribute their run scoring in the same way.
What does it all mean? For one thing, it is important to remember that a team's run scoring distribution can affect its winning percentage. In the 2004 AL:
Runs Scored Winning Pct
0 .000
1 .080
2 .190
3 .332
4 .477
5 .607
6 .694
7 .739
8 .831
9 .904
10 .913
11 .965
12 .980+
That is, a team that scored exactly seven runs in a game won almost 74% of the time in the 2004 AL, averaged over all pitching staffs. Two teams can score the same number of runs on average, but have different records based on their particular run distributions. When I originally started this study, I figured that one type of team would have a different run distribution than the other and that it would result in a different winning percentage. That appears not to be the case.
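To make the "same average, different record" idea concrete, here is a rough sketch that weights the table above by a run distribution. The two distributions are made up purely for illustration, 12 or more runs is treated as a flat .980 (the table only says ".980+"), and `WIN_PCT` and `expected_win_pct` are names invented for the example.

```python
# Winning percentage by runs scored, 2004 AL, copied from the table above.
# Index = runs scored; the last entry stands in for "12 or more".
WIN_PCT = [.000, .080, .190, .332, .477, .607, .694,
           .739, .831, .904, .913, .965, .980]


def expected_win_pct(run_distribution):
    """Weight each run total's winning percentage by how often it occurs.

    run_distribution maps runs scored in a game -> probability of that total.
    """
    return sum(prob * WIN_PCT[min(runs, 12)]
               for runs, prob in run_distribution.items())


# Two hypothetical teams that both average exactly 4 runs per game.
steady = {4: 1.0}             # scores 4 every single night
streaky = {0: 0.5, 8: 0.5}    # shut out half the time, 8 runs the other half

print(expected_win_pct(steady))
print(expected_win_pct(streaky))
```

The steady team comes out ahead here because the table flattens out at the high end: the eighth run adds less than the fourth one does. Feeding in simulated distributions instead of these toy ones is the same calculation with more entries.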
Talking heads will often tell you that an offense needs to be deep in order to be consistent. I used to agree - now I am not so sure that I do. Maybe, just maybe, using gross averages to describe team offense does not hide as much information as we thought.
(I am in no way claiming that run distributions are not important - I have a lot of upcoming work on the topic. I am saying that lineup construction may not have a huge effect on run distribution.)
It also makes me wonder whether I've been too hard on the Anaheim Angels. I've said several times that having only one offensive threat - Vladimir Guerrero - would hurt their offense. And yet for all my tooth-gnashing, they score nearly as much as the Oakland Athletics, a team without an offensive standout. (I apologize for using the Angels and A's as examples so often - it's just that I'm a fan of one of the teams, and for two teams that have such disparate philosophies, they sure have similar records.)
From a risk-management point of view, the deeper offensive team is still the better way to construct an offense; with a stars-and-scrubs lineup, an injury to your star leaves you out of luck. Salaries also seem to increase non-linearly with offensive output, and the money saved by getting out from under a Manny Ramirez contract and replacing him with a couple of Grady Sizemores and Coco Crisps may allow you to upgrade that pesky middle relief or the back end of the rotation. I don't think Theo Epstein is nuts to attempt to dump Manny Ramirez, despite Manny's cartoonish offensive stats. In fact, if the Red Sox ever did dump Manny's contract, I think they could be quite dangerous with that free money (if spent correctly).
Expect more from me on run distributions, especially as the season ends and league data is made available through Retrosheet.
Comments
The fact that the run distributions are so...
by Richard Wade on Sep 22, 2005 1:10 PM EDT
Good question
by salb918 on Sep 22, 2005 1:26 PM EDT
Just checked it out
Position Avg Runs Wins, weighted by run-distribution
1 5.06 84.0
2 5.05 83.9
3 5.03 83.6
4 5.07 83.9
5 5.04 84.0
6 5.04 83.6
7 5.04 83.6
8 5.04 83.6
9 5.04 83.7
The deltas are probably within the error of my simulations, but if you want a take-home lesson, it's to bat your star in the first five positions.
by salb918 on Sep 22, 2005 5:17 PM EDT
Lineup Construction Baffles Me
I dunno. I'm not convinced by the numbers on lineup construction. I've read all the work done and it seems to make sense, and yet, doesn't. Lineup construction doesn't make a difference.....really? Are we sure?
by cephyn on Sep 22, 2005 1:37 PM EDT
Next week
It baffles me, too - how can it not matter? I'll also move the star player around in the lineup and add/subtract scrubs and stars to see what happens.
by salb918 on Sep 22, 2005 2:00 PM EDT




























