I'll be honest: I struggled to buy into some of the ideas behind the lineup analysis tool that went up on Baseball Musings today, which is based on Cyril Morong's work with coefficients here at BTB and Ken Arneson's programming skills from Catfish Stew.
After thinking about it and looking at the coefficients again and playing around with a LOT of different lineups, I think I've come up with a "handy-dandy lineup tip sheet." If I inadvertently crashed the Baseball Musings server, I deny all charges. And that program kept me from finishing up two papers I have due next week. But I think it's worth it.
First off, imagine you have a National League team that looks like this, and your projections for their production look exactly like their lines from last year. Here's your team.
POS - Player (OBP/SLG)
C - Gregg Zaun (.355/.373)
1B - Paul Konerko (.375/.534)
2B - Chase Utley (.376/.540)
SS - Derek Jeter (.389/.450)
3B - Rob Mackowiak (.337/.389)
LF - Jay Gibbons (.317/.515)
CF - Andruw Jones (.347/.575)
RF - Vernon Wells (.320/.463)
P - Pitcher (.200/.250)
There's a little bit of suspension of disbelief involved, because we're going to assume that all of these players played in the same park. But the specifics are less important than the concepts here. The players themselves just serve to make this easier to write.
To start, what would you expect the standard lineup, from a "conventional" line of thought, to look like?
- A. Jones
Now, for the "stathead's" lineup.
The conventional lineup gets 4.852 runs/game, in the simulator. My "stathead" lineup does a little better, at 4.885 runs/game.
But the best lineup, the optimal one, gets 5.082 runs/game.
So, what the heck is going on here?
When you think about it, the real goal of a lineup is to fit together well, so that each player's strongest skill is most suitable to his position in the lineup. Somewhere along the line, strange roles were assigned to each spot. Then a statistically minded person comes along and says, "these roles are arbitrary! Let's stack the best hitters and do this more intelligently!"
Both sides are partially right.
A lineup IS at its most efficient when the pieces fit well together, and the conventional wisdom was right on a couple of counts: the leadoff hitter should get on base, and the 4-hitter should hit for power. But a lot of those roles were arbitrary. And even though the simulator that I mentioned earlier does not account for speed or "in-game tactics," getting the fit right is a big deal.
I'm posting this because I picked these 8 guys + pitcher and managed to order them in the optimal way without the use of the simulator. I used the coefficients, and trial and error, to work out a rough strategy for ordering your lineup. Here's the lineup:
- The leadoff spot is the most OBP-centric spot in the lineup. Your hitter here might very well be your best hitter, IF his best attribute is his OBP. A hitter with a .425 OBP and a .500 SLG would fit in here well, provided that there's not a better OBP threat elsewhere on the roster. When I looked at it, I decided that Derek Jeter is really the optimal leadoff hitter. He has a good OBP and acceptable power, and he's generally a solid hitter.
- The 2-hitter should be the lineup's most balanced hitter, a good combination of OBP and SLG. David Wright fits the bill here, as does the player I chose, Chase Utley. When I looked at the results and coefficients, the first guy I thought of was Mike Lowell in his prime.
- This was the biggest surprise: the 3 hitter should be the player that doesn't fit into any of the other spots. Every other spot has some significance, but if I were building a lineup, I would just put the leftover player in the 3 hole. This seemed very counterintuitive to me when I first heard it, but David Pinto noted, "Part of what it's telling us is that you need to spread out your easy outs." I struggled to get this at first, but I'm starting to now. Marc said something to the effect of "the worst players have to go somewhere." I guess this is really it; the other spots just have greater needs. If you can get a good hitter here, it means that your lineup is very deep.
- The 4 hitter is the bopper. This guy's best attribute should be his power, with OBP being of secondary importance. He should be the foil to the leadoff hitter, in a way; the two could look similar if they're both very complete hitters. Andruw Jones, though, is an ideal #4 hitter: slightly above average OBP, and "phenomenal cosmic power," to quote Aladdin.
- Picking the 5 hitter is simple: it's the second choice for the two slot. Paul Konerko, who I picked for this spot, had a very similar line to our #2 hitter, Chase Utley.
- The 6 hitter should be the player with the biggest gap between SLG and OBP on the roster. This is because you want someone here who can drive in whatever runners are left over. The 6 hitter is the most exclusively power-dependent hitter of the bunch; his OBP is VERY unimportant. Alfonso Soriano and Jay Gibbons are good picks for this slot.
- The 7 hitter is a less extreme version of the 6 hitter, with less of a need for power and more use for OBP. I picked Vernon Wells here.
- The 8 hitter is the worst hitter in the lineup. If it's the pitcher, he goes here, unless it's Dontrelle Willis or Jason Marquis or someone similar. This is because you'd rather not put the pitcher close to two of the best hitters in the lineup: the 1 and 2.
- The 9 hitter should be a "punchless wonder," of sorts. Scott Podsednik, Gregg Zaun, and Brad Ausmus fit into this role nicely: guys with acceptable OBPs and absolutely no power. This is the "stereotypical leadoff hitter" taken to the extreme. He's not actually leading off because, I think, you don't necessarily want these guys soaking up plate appearances.
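The tip sheet above can be turned into a rough greedy procedure. What follows is only a sketch under my own assumptions, not the simulator's method or the actual coefficients: the function name `tip_sheet_lineup`, the numeric stand-ins for each role (OBP + SLG for "balanced," SLG minus OBP for "power-dependent," lowest SLG for "punchless"), and the order in which slots get filled are all hypothetical choices made for illustration.

```python
# Roster from the post: name -> (OBP, SLG)
ROSTER = {
    "Zaun": (0.355, 0.373),
    "Konerko": (0.375, 0.534),
    "Utley": (0.376, 0.540),
    "Jeter": (0.389, 0.450),
    "Mackowiak": (0.337, 0.389),
    "Gibbons": (0.317, 0.515),
    "A. Jones": (0.347, 0.575),
    "Wells": (0.320, 0.463),
    "Pitcher": (0.200, 0.250),
}


def tip_sheet_lineup(roster):
    """Fill the nine slots greedily using the tip-sheet roles.

    The scoring rules and the fill order are assumptions made for this
    sketch; the post does not spell them out numerically.
    """
    pool = dict(roster)  # players not yet assigned
    lineup = {}          # slot number -> player name

    def take(slot, score, pick=max):
        # Choose the remaining player with the best (or worst) score.
        name = pick(pool, key=lambda n: score(*pool[n]))
        lineup[slot] = name
        del pool[name]

    take(1, lambda obp, slg: obp)                  # leadoff: best OBP
    take(4, lambda obp, slg: slg)                  # bopper: most power
    take(2, lambda obp, slg: obp + slg)            # most balanced hitter
    take(5, lambda obp, slg: obp + slg)            # second choice for the 2 slot
    take(6, lambda obp, slg: slg - obp)            # biggest SLG-over-OBP gap
    take(7, lambda obp, slg: slg - obp)            # milder version of the 6 hitter
    take(8, lambda obp, slg: obp + slg, pick=min)  # worst hitter
    take(9, lambda obp, slg: slg, pick=min)        # "punchless wonder"
    (leftover,) = pool                             # exactly one player remains
    lineup[3] = leftover                           # the 3 hole gets the leftover
    return [lineup[slot] for slot in range(1, 10)]


if __name__ == "__main__":
    for slot, name in enumerate(tip_sheet_lineup(ROSTER), start=1):
        print(slot, name)
```

On this particular roster the sketch lands on Jeter, Utley, Mackowiak, A. Jones, Konerko, Gibbons, Wells, Pitcher, Zaun, which agrees with the players named slot by slot above, with Mackowiak ending up in the 3 hole by elimination. Other rosters could easily break these crude rules, so treat it as a starting point for trial and error rather than a replacement for the simulator.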
Lineup order is not vital, but I've seen simulated teams jump up 3-4 wins with this optimization. I'm not an economist or an Econ major or minor, even, but this looks a great deal like the problem of maximizing utility in economics. Players have strengths and weaknesses, and ordering them in a certain way can help you to maximize those strengths and minimize those weaknesses. I guess the next step in this whole model is speed. A caveat I gave to the leadoff man is that I would try to avoid him being a "slow, fat slob" of a baseball player, but that's just out of personal intuition and what seems logical rather than anything proven mathematically. But otherwise, this data looks very interesting. Definitely try out the simulator if you haven't yet.
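As a rough sanity check on that 3-4 win figure, the runs/game numbers quoted earlier can be converted into wins over a season. This is a back-of-the-envelope sketch, not anything the simulator reports: the 162-game schedule is standard, but the conversion of about 10 runs per win is a common rule of thumb that I am assuming here, and `wins_gained` is just a hypothetical helper name.

```python
GAMES = 162          # regular-season schedule length
RUNS_PER_WIN = 10.0  # assumption: rule-of-thumb conversion, not from the simulator

# Runs per game reported in the post for each lineup.
RUNS_PER_GAME = {
    "conventional": 4.852,
    "stathead": 4.885,
    "optimal": 5.082,
}


def wins_gained(baseline, improved, games=GAMES, runs_per_win=RUNS_PER_WIN):
    """Approximate extra wins per season from a runs/game improvement."""
    extra_runs = (improved - baseline) * games
    return extra_runs / runs_per_win


if __name__ == "__main__":
    best = RUNS_PER_GAME["optimal"]
    for label in ("conventional", "stathead"):
        gain = wins_gained(RUNS_PER_GAME[label], best)
        print(f"optimal vs. {label}: about {gain:.1f} wins")
```

Under those assumptions, the optimal lineup is worth roughly 37 extra runs over the conventional one and roughly 32 over the "stathead" one, or about 3.7 and 3.2 wins respectively, which lines up with the 3-4 win range.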
Please feel free to leave feedback, to elaborate on rationales for putting a player in a certain spot, or with general criticisms.