Some believe that the influence of the sabermetric community outside the game faces an unprecedented challenge, owing to a growing divide between the data available to those on the inside and the data available to those restricted to public resources. Jack Moore recently penned an essay on the history of sabermetrics, touching on this topic along the way.
In my mind, two areas that are still largely underdeveloped, even with the resources that we already have, are injury prevention and pitch sequencing. Certainly some important work has been done on each of these topics, but it does not feel like we have as complete an understanding of them, or metrics as widely accepted for discussing them, as we do for most other areas of baseball in 2013.
On the pitch sequencing front, I feel that two obstacles stand out. The first is the definition itself of what constitutes pitch sequencing. What aspects of a pitch are important to consider? How many pitches are relevant? Can we limit a sequence to a plate appearance, or does it extend across a game or even a series? Once that is tackled, a second problem is waiting right around the corner: capturing all the relevant information in a form that is digestible. A single pitch is already a multi-dimensional event, with a velocity, movement, 2D location and outcome. When pitches are lined up in a sequence, the amount of data to represent grows rapidly.
With all that being said, I'm not going to do more than scratch the surface of this topic, as a proper treatment would require a great deal of thought and much more work. I simply wanted to answer a question that I've wondered about for a while now: which sequences of pitches are most commonly used to strike out a hitter?
To do so I've taken all plate appearances that have ended in strikeouts in 2013 (up until just before the All-Star Game), and have broken down the pitch sequence, based on raw GameDay pitch classifications, that finished off the batter. Of course as usual I should remind you that pitch classification is also a challenging problem, so we are taking these tags at face value. I'm also examining this in a cumulative way, so it would make sense for more commonly thrown pitch types to float to the top.
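The counting step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code behind the tables: the `PlateAppearance` record, its field names, the `"Strikeout"` event label, and the sample data are all hypothetical, and the pitch-type tags (`FF`, `SL`, `CU`, and so on) are taken at face value.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PlateAppearance:
    """Hypothetical record for one plate appearance (not a real PITCHf/x schema)."""
    pitcher: str
    pitch_types: list[str]  # pitch-type tags in the order thrown
    event: str  # outcome label, e.g. "Strikeout"


def final_sequences(plate_appearances, n=2):
    """Count the last n pitch types of every plate appearance ending in a strikeout."""
    counts = Counter()
    for pa in plate_appearances:
        if pa.event != "Strikeout":
            continue
        if len(pa.pitch_types) < n:
            continue  # not enough pitches to form an n-pitch sequence
        counts[tuple(pa.pitch_types[-n:])] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    PlateAppearance("A", ["FF", "SL", "FF", "FF"], "Strikeout"),
    PlateAppearance("A", ["FF", "FF", "SL"], "Strikeout"),
    PlateAppearance("B", ["FT", "FF", "FF"], "Strikeout"),
    PlateAppearance("B", ["FF", "CU"], "Groundout"),
]

two_pitch = final_sequences(sample, n=2)
for sequence, count in two_pitch.most_common():
    print("-".join(sequence), count)
```

Because the tally is cumulative across all pitchers, pitch types that are thrown more often will naturally dominate the counts; changing `n` widens or narrows the window at the end of the plate appearance.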
Here are the most common final two pitches to strike out hitters.
|Pitch 1||Pitch 2||Count|
|Four-seam Fastball||Four-seam Fastball||2859|
|Two-seam Fastball||Two-seam Fastball||606|
Most common 2-pitch sequences ending in strikeouts, 2013
Back-to-back fastballs are the most common, which is not surprising given that four-seam fastballs make up a plurality of pitches around the league. I am somewhat surprised that sliders have distanced themselves as much as they have from curveballs and changeups as the secondary pitch of choice, at least as they pertain to the strikeout. Slider usage is in fact substantially higher, with 60970 sliders (15.6% of all pitches) having been thrown during this period of study, as opposed to 41718 changeups (10.7% of all pitches) and 36252 curveballs (9.3% of all pitches).
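As a quick check on these usage figures, the counts and percentages can be compared directly. The snippet below uses only the numbers quoted in the preceding paragraph; the implied total pitch count is derived from percentages rounded to one decimal place, so it is approximate.

```python
# Pitch counts and usage shares quoted in the text (2013, up to the All-Star break).
usage = {
    "slider": (60970, 0.156),
    "changeup": (41718, 0.107),
    "curveball": (36252, 0.093),
}

# Total pitches implied by each (count, share) pair; approximate because
# the shares are rounded.
implied_totals = {pitch: count / share for pitch, (count, share) in usage.items()}

# How many sliders were thrown for every one of each pitch type.
slider_count = usage["slider"][0]
slider_ratio = {pitch: slider_count / count for pitch, (count, _) in usage.items()}

for pitch in usage:
    print(f"{pitch:10s} implied total ~{implied_totals[pitch]:,.0f}  "
          f"sliders per {pitch}: {slider_ratio[pitch]:.2f}")
```

All three implied totals land near 390,000 pitches, so the quoted shares are mutually consistent, and sliders outnumber curveballs by roughly 1.7 to 1.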
Drilling down to the individual pitcher level, here is the leaderboard for strikeouts driven by a particular couplet of pitches.
|Pitcher||Pitch 1||Pitch 2||Count|
Most common 2-pitch sequences ending in strikeouts by individual pitcher, 2013
One observation here is how many of these sequences feature the same pitch type back-to-back. Justin Masterson has the top pair made up of two different pitches, with his sinker-slider combination. It is impressive that some relievers made this list, given that it is a counting stat. Despite facing far fewer batters, I suppose they have the advantage in this study of having fewer pitches in their arsenal across which to split their strikeout pitch pairs. Ervin Santana is the only pitcher to have made the list twice, with each entry culminating in his nasty slider.
Notably missing from the list is New York Mets phenom Matt Harvey, who leads the National League in strikeouts at this juncture. Harvey has a diverse strikeout repertoire, topping out at 27 strikeouts with FF-FF sequences, followed closely by 23 by way of FF-SL and another 23 with FF-CU. We can say that he likes to set up his knockout pitch with a fastball!
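A per-pitcher leaderboard of this kind can be sketched by keying the tally on the pitcher as well as the sequence. The input shape (one `(pitcher, pitch_types)` pair per strikeout), the `top` parameter, and the choice to keep ties at the cutoff are assumptions made for this example, and the sample data is invented.

```python
from collections import Counter


def pitcher_leaderboard(strikeouts, n=2, top=15):
    """Rank (pitcher, final n-pitch sequence) pairs by strikeout count, keeping ties.

    `strikeouts` is an iterable of (pitcher, pitch_types) pairs, one per
    strikeout; this input shape is an assumption made for the example.
    """
    counts = Counter()
    for pitcher, pitch_types in strikeouts:
        if len(pitch_types) >= n:
            counts[(pitcher, tuple(pitch_types[-n:]))] += 1

    ranked = counts.most_common()  # sorted by count, highest first
    if len(ranked) <= top:
        return ranked
    cutoff = ranked[top - 1][1]  # count of the last entry inside the top group
    return [row for row in ranked if row[1] >= cutoff]


# Made-up strikeouts, for illustration only.
sample_strikeouts = [
    ("Pitcher A", ["FF", "FF", "FF"]),
    ("Pitcher A", ["SL", "FF", "FF"]),
    ("Pitcher B", ["FF", "SI", "SL"]),
    ("Pitcher B", ["SI", "SI", "SL"]),
    ("Pitcher C", ["FF", "CU", "CH"]),
]

board = pitcher_leaderboard(sample_strikeouts, n=2, top=1)
for (pitcher, sequence), count in board:
    print(pitcher, "-".join(sequence), count)
```

Since this is a counting stat, a pitcher who spreads his strikeouts across many different sequences will rank lower on any single one of them than a pitcher with a narrow arsenal and the same strikeout total.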
We saw above that when considering just the last two pitches of strikeouts, we got a lot of cases where the same pitch type was thrown for each pitch. Perhaps this is too zoomed in, so let's examine the most common final three pitches of a strikeout for comparison.
|Pitch 1||Pitch 2||Pitch 3||Count|
|Four-seam Fastball||Four-seam Fastball||Four-seam Fastball||1656|
|Four-seam Fastball||Four-seam Fastball||Slider||737|
|Four-seam Fastball||Slider||Four-seam Fastball||475|
|Slider||Four-seam Fastball||Four-seam Fastball||414|
|Four-seam Fastball||Four-seam Fastball||Curveball||397|
|Two-seam Fastball||Two-seam Fastball||Two-seam Fastball||305|
Most common 3-pitch sequences ending in strikeouts, 2013
All heat is still far and away the most common strikeout sequence. The slider is back again as well, as would be expected from the two-pitch results.
Let's look at the leaderboard for individual pitchers and see if there are more "interesting" sequences now that we've expanded to include a third offering.
|Pitcher||Pitch 1||Pitch 2||Pitch 3||Count|
Most common 3-pitch sequences ending in strikeouts by individual pitcher, 2013
Only three of the top 17 sequences (top 15 and ties) had pitches classified differently within the three-pitch sequence. It would appear that at the level of the individual pitcher, strikeouts come most often from repeating the same kind of pitch rather than mixing up offerings. On one hand, this makes some sense: Shelby Miller's fastball, Mariano Rivera's cutter and Sergio Romo's slider are their best pitches, so one would expect their "best" sequence of pitches to consist of merely their best pitch thrown over and over. On the other hand, I had expected to see more "complicated" pitch sequences show up here as well, particularly when looking at three pitches, in the belief that throwing the same type of pitch repeatedly may not be the best way to fool big league hitters, especially for starting pitchers.
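The split between repeated and mixed sequences can be tallied mechanically from a leaderboard. In this small sketch, the rows are placeholders with invented pitchers and counts, not the actual leaderboard, and a sequence counts as "repeated" only when all of its pitch-type tags are identical.

```python
def is_repeated(sequence):
    """True when every pitch in the sequence carries the same pitch-type tag."""
    return len(set(sequence)) == 1


# Made-up leaderboard rows: (pitcher, sequence, strikeout count).
rows = [
    ("Pitcher A", ("FF", "FF", "FF"), 20),
    ("Pitcher B", ("FC", "FC", "FC"), 15),
    ("Pitcher C", ("SI", "SI", "SL"), 12),
    ("Pitcher D", ("SL", "SL", "SL"), 11),
]

repeated = [row for row in rows if is_repeated(row[1])]
mixed = [row for row in rows if not is_repeated(row[1])]
print(f"{len(mixed)} of {len(rows)} sequences mix pitch types")
```

Note that this test depends entirely on the pitch classifications: two pitches with the same tag may still differ in location and velocity, which the tag alone does not capture.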
Of course pitch location is not considered here, nor velocity for that matter. I think this is precisely what these results are telling us indirectly: pitch location is a vital part of pitch sequencing. This is not surprising, as hitters surely know they will be seeing cutter after cutter from Mariano Rivera, yet have never figured him out. This means the pitch type on its own is not solely responsible for baffling hitters at the plate. Of course it would not be that simple! That location makes an impact on whiffs has also been shown by BtBS Hall of Famer Glenn DuPaul in two previous articles.
Clearly the study of pitch sequencing is much more involved than this. I looked at one aspect (pitch type) of pitches within one plate appearance (and only up to the last 3 pitches at that) that produced one outcome (strikeout) in one half-season. Even with these variables fixed, I could have extended to look at the sequences as a rate statistic relative to the frequency with which the individual pitchers use these pitches overall.
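One possible form for such a rate statistic, offered only as a sketch and not something computed in this study, is to compare a sequence's share of a pitcher's strikeouts with the share expected if each pitch were drawn independently according to his overall usage. The function name `sequence_lift`, the independence baseline, and the example numbers are all assumptions for illustration.

```python
from math import prod


def sequence_lift(sequence_count, total_strikeouts, usage_share, sequence):
    """Ratio of a sequence's observed share of strikeouts to an independence baseline.

    The baseline assumes each pitch in the sequence is drawn independently
    according to the pitcher's overall usage shares (a simplifying assumption).
    Every pitch type in `sequence` must have a nonzero share in `usage_share`.
    """
    observed = sequence_count / total_strikeouts
    expected = prod(usage_share[pitch] for pitch in sequence)
    return observed / expected


# Made-up pitcher: 60% four-seamers, 40% sliders, 100 strikeouts in total.
usage_share = {"FF": 0.6, "SL": 0.4}
lift_ff_ff = sequence_lift(30, 100, usage_share, ("FF", "FF"))
lift_ff_sl = sequence_lift(30, 100, usage_share, ("FF", "SL"))
print(f"FF-FF lift: {lift_ff_ff:.2f}")
print(f"FF-SL lift: {lift_ff_sl:.2f}")
```

In this invented example both sequences produce 30 strikeouts, but FF-SL ends up above 1 and FF-FF below 1, because back-to-back fastballs would be expected more often from usage alone. A value above 1 would suggest the sequence finishes strikeouts more often than its raw pitch frequencies predict.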
While the results of this study may have been predictable to you, I have to say that I had not expected the three-pitch sequence filled with the same pitch type to be by far the most common, nor had I realized how much the slider had taken over the league compared to the curveball and changeup!
Here is a Google doc that has the full list of 2-pitch and 3-pitch sequences for individual pitchers from 2013, if you're interested in finding your favorite pitcher. Only those sequences that occurred at least four times are included in the tables.
You can follow me on Twitter at @MLBPlayerAnalys.
Credit and thanks to Baseball Heat Maps for PITCHf/x data upon which this analysis was based.