Lineup optimization on a macro scale by using composite types

What do the composite batting orders for the American and National Leagues look like, and what does that tell us?

Scott Rovak-USA TODAY Sports

Here’s some insider information about Beyond the Box Score: an article first published in March 2009 still regularly gets 100 page views a day. It’s about lineup optimization. Indeed, I first discovered Beyond the Box Score by searching for information about lineups so as to mock my favorite team’s manager for being a dunce—a tried and true pastime.

I found what I was looking for. But recently, it has seemed that more and more MLB managers are adhering to the principles suggested in the article and the book (The Book) the article summarizes. In particular, more great batters are hitting second, while slappier bats have been moved down the order. Additionally, Joe Maddon and Walt Weiss, among others, have regularly hit their pitchers eighth in 2015.

To get a sense of lineup optimization on a macro scale, I looked at FanGraphs’ batting order splits for the American and National Leagues to create two ideal-type lineups. By ideal type, I mean a lineup that doesn’t actually exist but embodies the typical characteristics drawn from a larger body of information. Then, using the composite wRC+, BB%, and K% for each split, I identified a player proxy for each spot in the order, drawing each proxy from the same league as the split. On the whole, production by lineup spot more closely resembles a conventional lineup than an optimized one.
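
The proxy matching described above can be sketched as a nearest-neighbor search over the three stats. This is only an illustration, not the method actually used for the article: the candidate names and stat lines below are hypothetical placeholders, and the `SCALES` weights are an assumption added so that wRC+ (which runs on a much wider scale than the two percentages) does not swamp the comparison.

```python
from math import sqrt

# Composite line for one batting-order slot (the NL third spot from the article).
slot = {"wRC+": 131, "BB%": 10.4, "K%": 18.8}

# HYPOTHETICAL candidates: placeholder names and made-up stat lines,
# not real player data.
candidates = {
    "Player A": {"wRC+": 129, "BB%": 10.0, "K%": 19.5},
    "Player B": {"wRC+": 133, "BB%": 5.1, "K%": 27.0},
    "Player C": {"wRC+": 104, "BB%": 10.5, "K%": 18.7},
}

# ASSUMED scales: each stat difference is divided by its scale before
# the stats are combined, so no single stat dominates the distance.
SCALES = {"wRC+": 10.0, "BB%": 2.0, "K%": 4.0}


def distance(a, b):
    """Scaled Euclidean distance between two stat lines."""
    return sqrt(sum(((a[k] - b[k]) / SCALES[k]) ** 2 for k in SCALES))


def closest_proxy(slot_line, pool):
    """Return the name of the candidate whose stat line is nearest the slot's."""
    return min(pool, key=lambda name: distance(slot_line, pool[name]))


print(closest_proxy(slot, candidates))
```

With these placeholder numbers the search picks Player A, who is close on all three stats, over Player B (similar wRC+ but a very different walk and strikeout profile) and Player C (similar plate discipline but far less production).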

Through Saturday’s games, the NL’s wRC+ is 93. The top five lineup spots for the NL are above that mark, while the bottom four are below. The composite lineup follows convention more than optimization. The two most productive spots are three and four. The first two slots have essentially the same wRC+. Interestingly, the lineup slot with the highest walk rate, a trait usually associated with the leadoff spot, is third. NL leadoff batters do have the lowest strikeout rate, though the gap to the next-lowest spot is less than a percentage point. There’s a steep drop at the six spot. In fact, aside from the part of the order dominated by pitchers, sixth has been the soft spot of NL lineups.

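| Lineup position | wRC+ | BB% | K% | Player proxy |
|---|---|---|---|---|
| 1 | 102 | 7.5 | 17.3 | Marcell Ozuna |
| 2 | 103 | 7.4 | 18.2 | Zack Cozart |
| 3 | 131 | 10.4 | 18.8 | Adam Lind |
| 4 | 116 | 8.8 | 19.6 | Howie Kendrick |
| 5 | 98 | 7.1 | 18.8 | Neil Walker |
| 6 | 81 | 6.8 | 20.8 | Matt Kemp |
| 7 | 91 | 6.9 | 19.8 | Jason Heyward |
| 8 | 83 | 7.7 | 22.8 | Ian Desmond |
| 9 | 26 | 4.8 | 30.6 | Jason Hammel |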
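
The comparison of each NL spot against the league-wide mark can be reproduced with a few lines. This is a minimal sketch using only the wRC+ figures from the NL table; treating the ninth spot as "the part of the order dominated by pitchers" is a simplifying assumption, since some managers have hit their pitchers eighth.

```python
# NL wRC+ through Saturday's games, as given in the article.
NL_LEAGUE_WRC_PLUS = 93

# Composite wRC+ by lineup position, copied from the NL table.
nl_wrc_plus = {1: 102, 2: 103, 3: 131, 4: 116, 5: 98, 6: 81, 7: 91, 8: 83, 9: 26}

# Split the order into spots above and below the league-wide mark.
above = [spot for spot, value in nl_wrc_plus.items() if value > NL_LEAGUE_WRC_PLUS]
below = [spot for spot, value in nl_wrc_plus.items() if value < NL_LEAGUE_WRC_PLUS]

# ASSUMPTION: the ninth spot stands in for the pitcher-dominated part of the order.
non_pitcher = {spot: value for spot, value in nl_wrc_plus.items() if spot != 9}
weakest_non_pitcher = min(non_pitcher, key=non_pitcher.get)

print("Above league wRC+:", above)
print("Below league wRC+:", below)
print("Weakest non-pitcher spot:", weakest_non_pitcher)
```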

My initial reaction is that the NL lineup is full of middle infielders. Another lesson is that the player whose production is typical of a given lineup spot doesn’t necessarily occupy that spot in actual games. The primary reason is likely context. Why would Marcell Ozuna, who has Dee Gordon as a teammate, hit leadoff? He doesn’t; Ozuna hasn’t hit leadoff once this season. Zack Cozart, somewhat infamously, does hit second fairly often, but he also hits eighth when he’s not hitting second, a paradox that embodies much of the debate about lineup optimization. Adam Lind mostly hits cleanup rather than third, while Neil Walker usually hits second or cleanup. Matt Kemp is having a rough year and is hitting like the weak spot in an NL lineup, but he actually hits third most of the time. Jason Heyward and Ian Desmond, bringing up the final two non-pitcher spots of the lineup, are usually deployed as the second batter for the Cardinals and Nationals, respectively.

The AL’s wRC+ through Saturday is 99. The composite hitting production for the AL is more evenly distributed. There is no spot with a wRC+ in the 120s, let alone the 131 wRC+ found in the NL’s third spot (and it’s not due to Bryce Harper; he usually hits fourth). The top six spots in the AL lineup have a wRC+ over 100. As in the NL, the most productive spot is third. Also as in the NL, walks tend to come with the best hitters, at the third and fourth spots, rather than at leadoff. The second, fourth, and fifth spots are almost interchangeable in terms of production. The drop from slots seven through nine is pretty dramatic and more consistent than in the NL. The ninth spot, in particular, is pretty bad. In a way, the 60 wRC+ produced from the nine-hole is an argument against the designated hitter rule. Pitchers at least have the chance to be comically bad at hitting, whereas the typical ninth hitter on an AL team has enough ability to give the impression of batting competence without adding much offense.

| Lineup position | wRC+ | BB% | K% | Player proxy |
|---|---|---|---|---|
| 1 | 103 | 6.8 | 18.4 | Kevin Kiermaier |
| 2 | 113 | 8.6 | 17.7 | Trevor Plouffe |
| 3 | 118 | 9.5 | 16.8 | Brian McCann |
| 4 | 113 | 9.6 | 19.6 | Kole Calhoun |
| 5 | 113 | 7.5 | 20.8 | Manny Machado |
| 6 | 107 | 8.2 | 20.3 | David Freese |
| 7 | 78 | 6.5 | 21.9 | Adam Eaton |
| 8 | 82 | 7.3 | 21.7 | Luis Valbuena |
| 9 | 60 | 5.5 | 22 | Asdrubal Cabrera |
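
One rough way to check the claim that AL production is more evenly distributed is to compare the spread of wRC+ across the top of each composite lineup. The sketch below uses only the wRC+ figures from the two tables; measuring spread as the simple range (highest minus lowest) over the first six spots is an illustrative choice, not a measure the article itself uses.

```python
# Composite wRC+ by lineup position, copied from the two tables.
nl_wrc_plus = {1: 102, 2: 103, 3: 131, 4: 116, 5: 98, 6: 81, 7: 91, 8: 83, 9: 26}
al_wrc_plus = {1: 103, 2: 113, 3: 118, 4: 113, 5: 113, 6: 107, 7: 78, 8: 82, 9: 60}


def wrc_range(lineup, spots):
    """Gap between the best and worst wRC+ among the given lineup spots."""
    values = [lineup[spot] for spot in spots]
    return max(values) - min(values)


def spots_over(lineup, threshold=100):
    """Lineup spots whose composite wRC+ exceeds the threshold."""
    return [spot for spot, value in lineup.items() if value > threshold]


# ASSUMPTION: the first six spots are used as "the top of the order".
top_six = range(1, 7)

print("NL top-six range:", wrc_range(nl_wrc_plus, top_six))
print("AL top-six range:", wrc_range(al_wrc_plus, top_six))
print("NL spots over 100:", spots_over(nl_wrc_plus))
print("AL spots over 100:", spots_over(al_wrc_plus))
```

A smaller range for the AL's top six, together with more spots above 100, is consistent with the flatter distribution described above.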

Of the player proxies, only Kevin Kiermaier and Kole Calhoun are regularly slotted in the positions given, and Calhoun hits first as much as he hits fourth. Trevor Plouffe usually hits fourth and hasn’t hit in the second spot once this season. Brian McCann usually hits lower in the order, fifth, as does David Freese. Manny Machado and Adam Eaton are down in the order here, but they both hit first most of the time, and Eaton does so exclusively. Luis Valbuena and Asdrubal Cabrera also generally hit high in the order, second and third, respectively, even though they produce like bottom-of-the-order bats.

The American and National League splits by batting order position indicate that, on the whole, managers still adhere more closely to conventional lineup construction than to optimization. That’s not necessarily a blanket critique, as an optimized lineup makes a difference of only a few runs a season. While the composite lineups don’t reflect optimization, the fact that the weakest production in the NL comes from spots six through nine, while slots seven through nine are the least productive in the AL, at least illustrates that the most vulnerable hitters are where they should be.

The player proxies that accompany the composite numbers point to something else. Namely, the players are kind of boring. Adam Lind is the best hitter of the bunch, Machado is probably the biggest name, and Jason Heyward still has the most potential, but there are no true superstars listed above. The production, however, almost has to be close to middling, and the names that go with it almost have to be boring, because they are composites of a large number of outcomes from players of varying ability.

The ideal type production from each slot of the batting order, and the proxy players, are what make the best players in baseball so compelling. It’s the teammates of the players listed above, and not the players themselves, that draw us to the game. The lineups above are not bad, but they aren’t that desirable either. But they provide the context for what is desirable—players that break the mold.


Eric Garcia McKinley is a contributor to Beyond the Box Score. He writes about the Rockies for Purple Row, where he is also an editor. You can follow him on Twitter @garcia_mckinley.