Part of the development of sabermetrics has been spurred by taking old-school "truisms" and putting them under the microscope to see if they hold up against close, data-driven scrutiny. The sacrifice bunt and lineup construction are two examples.
While those tactical actions involve a decision from a coach, manager, or player, rhythm is not something that can be decided upon consciously. Getting into a rhythm is, more or less, something that just happens. Despite its fleeting nature, rhythm is often described as being of paramount importance for good performance. It's time to put this to the test.
What does "getting into a rhythm" mean?
In order to test "getting into a rhythm", we have to define what it means. The phrase is generally used by broadcasters during games and by managers in post-game interviews. Loosely defined, "getting into a rhythm" means...what, exactly? To be honest, I don't really know. It's one of those concepts that you simply absorb over time until it becomes accepted truth in your own mind. For the purposes of this article, I think "getting into a rhythm" can be defined as a player settling into his baseball routine and becoming comfortable at an appropriate interval. For a batter, this might mean his pre-pitch ritual; for a pitcher, it might be his actions during the time between pitches. It's a vague term. A batter's rhythm might be disrupted by a pitcher taking a long time to choose a pitch, while a pitcher's rhythm might be disrupted by a batter stepping out of the box, calling time, adjusting his gloves, etc.
This loose definition is not suitable for research; a more focused definition is needed. For the purposes of this research, pitcher rhythm will be defined as the pitcher's pace. I will not focus on a batter's rhythm. The average pace of a pitcher can be found on FanGraphs, but that is only a single number per pitcher. To look at correlations with outcomes, a pitch-by-pitch pace is needed. That's what I'll use: pitch-by-pitch pace = "rhythm". Each pitcher has his own rhythm. In order to compare pitchers on the same baseline, I'll use the standard deviation of a pitcher's pace to represent "getting into a rhythm". The more consistently a pitcher works at his preferred pace, the lower the standard deviation will be. This seems like an acceptable definition.
Manipulating the data
The basic idea is simple. Grab a sample of pitches, calculate the pace for each pitch, and correlate with outcomes to see if there's anything worth investigating more deeply. The execution of this basic idea is not so simple.
Here's what I ended up doing. I used Baseball Savant. In order to limit the sample to a reasonable number of rows, I grabbed all pitches from only June 1st through June 8th of this year. This was more than 20,000 pitches. At that point, there were a few things that had to be done. The particular field I used to calculate the seconds between pitches was the TFS Zulu field, which is a date/time stamp. I brought the Baseball Savant export into Tableau for data manipulation because the TFS Zulu field does not come in a form Excel can recognize as a date (as far as I know - I'm not an Excel wizard). Using Tableau, I was able to accomplish these things:
- Calculate the number of seconds between pitches
- Eliminate the first pitch of each plate appearance for each pitcher
- Eliminate the first pitch of each inning (accomplished automatically by the previous step)
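The steps above can be sketched in a few lines of Python. This is only an illustration, not the Tableau workflow I actually used: the rows are made up, the pitcher label and plate appearance ids are hypothetical, and the ISO-style timestamp format with a trailing "Z" is an assumption about how the date/time stamp looks in the export.

```python
from datetime import datetime

# Hypothetical rows: (pitcher, plate_appearance_id, timestamp).
# The timestamp format is an assumption about the export, not a documented fact.
pitches = [
    ("pitcher_a", 1, "2015-06-01T23:10:00Z"),
    ("pitcher_a", 1, "2015-06-01T23:10:21Z"),
    ("pitcher_a", 1, "2015-06-01T23:10:45Z"),
    ("pitcher_a", 2, "2015-06-01T23:12:05Z"),
    ("pitcher_a", 2, "2015-06-01T23:12:25Z"),
]

def parse_zulu(stamp):
    """Parse a timestamp such as 2015-06-01T23:10:00Z into a datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")

def pace_per_pitch(rows):
    """Return (pitcher, pa_id, seconds since the previous pitch).

    The first pitch of each plate appearance has no previous pitch in
    that plate appearance, so it is dropped. That also drops the first
    pitch of every inning.
    """
    paces = []
    previous = {}  # (pitcher, pa_id) -> time of the last pitch seen
    ordered = sorted(rows, key=lambda r: (r[0], parse_zulu(r[2])))
    for pitcher, pa_id, stamp in ordered:
        now = parse_zulu(stamp)
        key = (pitcher, pa_id)
        if key in previous:
            seconds = (now - previous[key]).total_seconds()
            paces.append((pitcher, pa_id, seconds))
        previous[key] = now
    return paces

for row in pace_per_pitch(pitches):
    print(row)
```

Keying the previous timestamp on both pitcher and plate appearance is what keeps the long gap between batters (or between innings) out of the pace numbers.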
I then exported the data back to Excel, did a little more minor manipulation, and then brought the data back into Tableau. I was left with 19,700 pitches from 382 pitchers. Chris Sale was the leader with 153 pitches; his two starts within the time period examined here were in the thick of his double-digit strikeout masterpiece run. Matt Barnes had the fewest pitches with three.
Right around pitch 44 is when that downward trend at the beginning stops. At pitch 40, Burnett had the same average pace as at pitch 120 (24.88 seconds). I'll use 40 pitches as my cutoff, then. This left a sample of 15,298 pitches.
The average pace was 23.4 seconds. The median pace was 21 seconds. Unfortunately, there are a few pace records that skew the data; the largest pace record in the sample was from Lance McCullers, who had to wait 38 minutes between pitches to Alexei Ramirez due to a rain delay on June 8th. The next largest pace record was from Jimmy Nelson, who had to wait 28 minutes between pitches to Starling Marte due to a rain delay on June 8th. With those two pitches eliminated, the average pace goes down only to 23.2 seconds. Interestingly, both those pitchers had decent games despite being interrupted by a rain delay.
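To see why a rain delay pulls on the average far more than on the median, here is a toy example in Python. The paces are invented, and the 600-second cutoff in `drop_long_waits` is a hypothetical convenience; in the actual data I simply removed the two rain-delay records by hand.

```python
from statistics import mean, median

# Made-up paces in seconds; the last value stands in for a 38-minute rain delay.
paces = [20, 21, 19, 22, 23, 38 * 60]

def drop_long_waits(values, max_seconds=600):
    """Keep only paces at or below max_seconds (a hypothetical cutoff)."""
    return [v for v in values if v <= max_seconds]

trimmed = drop_long_waits(paces)

# The mean moves a lot when the long wait is removed; the median barely moves.
print("with delay:   ", mean(paces), median(paces))
print("without delay:", mean(trimmed), median(trimmed))
```

In a sample of more than 15,000 pitches, two long waits have a much smaller effect on the average than in this six-pitch toy set, which is why the real average only dropped by a couple tenths of a second.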
After eliminating those two pitches, the standard deviation for the whole sample was 11.2 seconds. Our friend A.J. Burnett's standard deviation was 11.3 seconds. The lowest standard deviation belonged to Joe Blanton, whose 4.5-second standard deviation over 44 pitches signifies a solid routine. Noted quick-pace pitcher Mark Buehrle had a standard deviation of 8.2 seconds over 57 pitches; that ranked 41st out of 196 pitchers. Bud Norris had the highest standard deviation, 32.4 seconds over 61 pitches (not surprising to anyone who has endured a Bud Norris start).
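For anyone who wants to reproduce the per-pitcher numbers outside of Tableau, a rough Python sketch follows. It assumes the input is a list of (pitcher, pace in seconds) pairs, treats the 40-pitch cutoff as a minimum number of pitches per pitcher, and uses the sample standard deviation; all three are assumptions made for illustration, and the data is invented.

```python
from collections import defaultdict
from statistics import stdev

def pace_spread(paces, min_pitches=40):
    """Return {pitcher: sample standard deviation of pace in seconds}.

    Pitchers with fewer than min_pitches recorded paces are left out.
    """
    by_pitcher = defaultdict(list)
    for pitcher, seconds in paces:
        by_pitcher[pitcher].append(seconds)
    return {
        pitcher: stdev(values)
        for pitcher, values in by_pitcher.items()
        if len(values) >= min_pitches
    }

# Invented data; a tiny cutoff is used so the toy sample is not filtered away.
sample = [
    ("pitcher_a", 20), ("pitcher_a", 22), ("pitcher_a", 24),
    ("pitcher_b", 20), ("pitcher_b", 30),
]
print(pace_spread(sample, min_pitches=3))
```

A lower value means the pitcher's time between pitches was more consistent, which is the working definition of "rhythm" here.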
To correlate the standard deviation of pace with an outcome, I chose whiff rate. Theoretically, when a pitcher is "in his rhythm", command is better, and getting more whiffs should be a result of that. This is arguable. There are other outcomes that could have been chosen: strike percentage, linear weights, release point variation, and others. Below, I show a scatterplot of the standard deviation of pace with whiff rate with a trend line added using Tableau Public 9.0's "Trend Line" functionality.
Though there is a slight downward trend of decreasing whiff rate as the standard deviation of pace increases, the trend is not significant. The confidence interval around the trend gets pretty big as the standard deviation of pace gets away from the cluster around 10 seconds. There is also a large amount of variation in whiff rate - from Alexi Ogando at 0 percent to Carter Capps at 28.3 percent. At this point, with this sample of data, I'm not willing to draw any conclusions.
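A straight-line trend like this one can be approximated with an ordinary least-squares fit. The Python sketch below is my stand-in for that kind of fit, not a reproduction of Tableau's internals, and the four data points are invented purely to show the mechanics.

```python
from math import sqrt

def linear_trend(xs, ys):
    """Return (slope, intercept, r) for a least-squares line through the points.

    r is the Pearson correlation coefficient between xs and ys.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Invented values: standard deviation of pace (seconds) and whiff rate.
pace_sd = [5.0, 8.0, 11.0, 14.0]
whiff_rate = [0.12, 0.10, 0.11, 0.08]

slope, intercept, r = linear_trend(pace_sd, whiff_rate)
print(slope, intercept, r)
```

A negative slope means whiff rate falls as pace gets less consistent, but the slope alone says nothing about significance; that still depends on the sample size and the scatter around the line.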
I don't think this is a reason to stop looking into pace. This was a small sample of data that wasn't really random, but I needed a smaller sample to establish the methodology quickly. In the future, I will look at a larger sample of data while also looking at other outcomes. Now that the methodology is established, it should be easier to manipulate a larger sample of data.
. . .
Kevin Ruprecht is the Managing Editor of Beyond the Box Score. He also writes at Royals Review. You can follow him on Twitter at @KevinRuprecht.