Kevin Dame recently debuted his "Paintomatic" system of pitch selection visualization at Hardball Times. One of the findings that he reported as surprising was that Tim Wakefield's fastball--all 72 mph of it--rates as "filthy" according to FanGraphs' pitch run values. The explanation, of course, is context: a pitch's value depends on how often the pitcher throws it and on what it is paired with. The question this raises is whether these pitch run values have any real meaning at all in an absolute sense.
They might not. This semester I'm teaching a baseball class, and one of the assignments a few weeks back was to read MGL's treatise on pitch selection that aired at FanGraphs last fall. His argument was essentially that the value of a pitch depends on how often you throw it. Throw it more often, the hitter comes to expect it and be able to react to it, and so the value of the pitch declines. Throw it less often, the pitch becomes more valuable. Overall, MGL argues, you should optimize your pitch selection such that the overall value of each pitch is equal.
As an exercise, I had my class go through the process of setting up a simple game theory model that lets us test the logic of this idea. Since pitch selection and game theory are topical right now, I thought I'd post what we came up with and see what folks think of it.
Because we were basing this on MGL's article, we used Brad Lidge as our case study. I told them that Brad Lidge's slider rates as a "75 pitch" on the scouting 20-80 scale, whereas his fastball was a "50 pitch." I then asked them to fill in a table comparing outcomes based on batter expectation vs. pitcher pitch selection. Here's what they came up with:
Batter Expects | Lidge Throws Fastball | Lidge Throws Slider |
Fastball | -3 | 3 |
Slider | 1.5 | -1.5 |
The values are unitless, and are just chosen (by the students in this case) to give us something to work with. Think of them as a fraction of a run if you wish. Also, we're ignoring situations in which batters "react" to a pitch, assuming that those outcomes will fall somewhere in between these extreme situations.
So, if Lidge throws a slider, but the hitter guesses fastball, the hitter is essentially hopeless against it. Therefore, that combination gets a +3 value for Lidge, which we defined as the best possible value. On the other hand, if a hitter guesses slider and gets a slider, we said that he'd still be able to do something effective with this pitch: hence the -1.5 value. Fastball mirrored this effect, except that the fastball is a less good pitch, and therefore the best- and worst-case scenarios for the fastball are less good than the best- and worst-case scenarios for the slider, respectively.
Fair enough?
Relating the table to pitch frequency
Assuming random pitch expectation on the part of the batter, it makes sense for Lidge to throw nothing but sliders: the slider has a better good-case outcome (+3) and a better bad-case outcome (-1.5) than the fastball. On average, we'd expect the value of the slider to be (3-1.5)/2=+0.75.
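The averaging in the paragraph above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name `value_vs_random_guess` are my own choices, not anything from the class exercise, and "random" is assumed to mean the batter guesses each pitch half the time.

```python
# Payoffs to Lidge from the students' table, keyed by (batter_expects, pitch_thrown).
PAYOFF = {
    ("fastball", "fastball"): -3.0,
    ("fastball", "slider"): 3.0,
    ("slider", "fastball"): 1.5,
    ("slider", "slider"): -1.5,
}


def value_vs_random_guess(pitch, payoff=PAYOFF):
    """Average payoff of `pitch` when the batter guesses each pitch half the time.

    ASSUMPTION: "random expectation" means a 50/50 guess between the two pitches.
    """
    guesses = ("fastball", "slider")
    return sum(payoff[(guess, pitch)] for guess in guesses) / len(guesses)


print(value_vs_random_guess("slider"))    # 0.75
print(value_vs_random_guess("fastball"))  # -0.75
```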
However, if Lidge throws nothing but sliders, the batter is not going to keep guessing randomly. He's going to expect a slider every time. This shifts the value of the slider from +0.75 or so down toward -1.5 units. The same thing would happen if he tried to throw fastballs all the time. The relationship between pitch selection and pitch value can therefore be summarized in this figure:
The fastball's line is shifted downward because it's not as good of a pitch (in a vacuum). But as before, if he's throwing a slider all the time, its value drops. And, if he's only throwing two pitches, as slider frequency increases (and its value decreases), the value of the fastball will increase because he is throwing it less.
Note on a big assumption here: I have no justification for assuming a linear change in pitch value. It very well could be extremely nonlinear, and may vary from pitcher to pitcher or pitch type to pitch type. I have no idea how to figure this out, short of doing an experiment in which I dictate pitch selection to a pitcher and monitor run values of different pitches. ... obviously not happening any time soon.
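For anyone who wants to play with the linear assumption, here is a minimal sketch of it in Python. The function name `pitch_value` and its parameter names are hypothetical; the only thing taken from the model is the idea that a pitch's value slides in a straight line from its "batter fooled" value (never thrown) to its "batter guessed it" value (always thrown).

```python
def pitch_value(freq, value_when_guessed, value_when_fooled):
    """Linear model of a pitch's value when it is thrown with frequency `freq` (0 to 1).

    ASSUMPTION (the article's own, unverified): the batter expects a pitch
    exactly as often as it is thrown, so value moves linearly from
    `value_when_fooled` at freq = 0 to `value_when_guessed` at freq = 1.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    return value_when_fooled + (value_when_guessed - value_when_fooled) * freq


# Lidge's slider: +3 when the batter expects a fastball, -1.5 when he expects a slider.
print(pitch_value(0.5, -1.5, 3.0))  # 0.75
print(pitch_value(1.0, -1.5, 3.0))  # -1.5

# Lidge's fastball: +1.5 when the batter expects a slider, -3 when he expects a fastball.
print(pitch_value(0.5, -3.0, 1.5))  # -0.75
```

Swapping in a curved function here would be the natural way to test how much the results depend on linearity.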
Optimizing pitch selection
So, how do you optimize pitch selection? There is probably an elegant mathematical solution to this, but we took the brute force method: for any given combination of Fastball vs. Slider pitch frequencies, we multiplied the number of pitches of a particular type by its value (we did everything per 100 pitches). So:
Overall Pitch Selection Strategy Value = [#Fastballs * FastballValue] + [#Sliders * SliderValue]
Fastball and Slider values change with the frequency that the pitches are thrown per the graph above. This will allow us to find the strategy in which we minimize our negative consequences and maximize our positive consequences, an approach (as I understand it) known as MiniMax.
If we do this for each possible strategy, based on the (arbitrary) values and assumptions with which we set up our problem, we get the following distribution:
Pitch value is the horizontal axis here, while # fastballs thrown is the vertical (it matches the table I'll show below). Number of sliders would be [100 - #Fastballs]. What you can see is that the best overall value, based on all of our assumptions, comes when you throw the fastball 40 out of 100 times (40%). This means that you're throwing the slider 60% of the time. At that pitch selection strategy, the overall value of your 100 pitches is +6 units.
Here's a table showing all the pitch selection strategies we ran, the run value of the pitches involved, and the total values of the selection strategy:
% Pitches Fastballs | Value Per Fastball | Value Per Slider | Fastball Value per 100 Pitches | Slider Value per 100 Pitches | Total Value of Pitch Selection Strategy |
100 | -3.00 | 3.00 | -300 | 0 | -300 |
95 | -2.78 | 2.78 | -264 | 14 | -250 |
90 | -2.55 | 2.55 | -230 | 26 | -204 |
85 | -2.33 | 2.33 | -198 | 35 | -163 |
80 | -2.10 | 2.10 | -168 | 42 | -126 |
75 | -1.88 | 1.88 | -141 | 47 | -94 |
70 | -1.65 | 1.65 | -116 | 50 | -66 |
65 | -1.43 | 1.43 | -93 | 50 | -43 |
60 | -1.20 | 1.20 | -72 | 48 | -24 |
55 | -0.98 | 0.98 | -54 | 44 | -10 |
50 | -0.75 | 0.75 | -38 | 38 | 0 |
45 | -0.53 | 0.53 | -24 | 29 | 5 |
40 | -0.30 | 0.30 | -12 | 18 | 6 |
35 | -0.08 | 0.08 | -3 | 5 | 2 |
30 | 0.15 | -0.15 | 5 | -11 | -6 |
25 | 0.38 | -0.38 | 9 | -28 | -19 |
20 | 0.60 | -0.60 | 12 | -48 | -36 |
15 | 0.83 | -0.83 | 12 | -70 | -58 |
10 | 1.05 | -1.05 | 11 | -95 | -84 |
5 | 1.28 | -1.28 | 6 | -121 | -115 |
0 | 1.50 | -1.50 | 0 | -150 | -150 |
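The table above can be regenerated with a short brute-force loop. This is a sketch of the calculation as described in the text, not the spreadsheet itself; the function name `strategy_row` is hypothetical, and the two linear formulas are just the straight lines through the table's endpoints (fastball from +1.5 to -3, slider from -1.5 to +3).

```python
def strategy_row(fastball_pct):
    """Return (value per fastball, value per slider, fastball total,
    slider total, strategy total) for a given fastball percentage.

    ASSUMPTION: pitch values change linearly with pitch frequency.
    """
    f = fastball_pct / 100.0
    fastball_value = 1.5 - 4.5 * f    # +1.5 at 0% fastballs, -3 at 100%
    slider_value = -1.5 + 4.5 * f     # -1.5 at 0% fastballs, +3 at 100%
    fastball_total = fastball_pct * fastball_value
    slider_total = (100 - fastball_pct) * slider_value
    return (fastball_value, slider_value, fastball_total, slider_total,
            fastball_total + slider_total)


# Same 5% increments as the table, from 100% fastballs down to 0%.
rows = {pct: strategy_row(pct) for pct in range(100, -1, -5)}

for pct, (fb, sl, fb_total, sl_total, total) in rows.items():
    print(f"{pct:3d} | {fb:6.2f} | {sl:6.2f} | {fb_total:7.1f} | {sl_total:7.1f} | {total:7.1f}")

best_pct = max(rows, key=lambda pct: rows[pct][4])
print("Best fastball %:", best_pct)  # 40
```

The printed rows should agree with the table up to rounding in the last digit.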
Here's something interesting: based on this model and all of the assumptions going into it, the optimal pitch selection for Lidge (40% fastballs, 60% sliders) is NOT one that would have the pitch value of each pitch be exactly equal. That point (or the closest to it, as I just did 5% increments) is 35% fastballs, 65% sliders. The optimal strategy is very close to that point, but not exactly there. The reason is that, because in this case you're throwing sliders more frequently than fastballs, the extra boost to slider value that you get by throwing it slightly less often is worth more than the penalty you get by throwing your fastball slightly more often.
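For what it's worth, the same result falls out of a little calculus under the model's linear assumption. The notation here is my own, not from the class exercise: f is the fraction of fastballs, v_F and v_S are the per-pitch values of the fastball and slider, and V is the total value per 100 pitches.

```latex
\begin{aligned}
v_F(f) &= 1.5 - 4.5f, \qquad v_S(f) = -1.5 + 4.5f \\
V(f) &= 100\,\bigl[\,f\,v_F(f) + (1-f)\,v_S(f)\,\bigr] = 100\,(1.5 - 4.5f)(2f - 1) \\
\frac{dV}{df} &= 100\,(7.5 - 18f) = 0 \;\Rightarrow\; f^{*} = \tfrac{5}{12} \approx 0.417, \qquad V(f^{*}) = 6.25 \\
v_F(f) &= v_S(f) \;\Rightarrow\; f = \tfrac{1}{3}, \qquad V\!\left(\tfrac{1}{3}\right) = 0
\end{aligned}
```

So within this model the peak sits at about 41.7% fastballs, which the 5% grid rounds to 40%, while the equal-value point sits at 33.3% fastballs.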
Obviously this conclusion is very dependent on our assumptions. The aforementioned issue of whether there's a linear change in pitch value with pitch frequency could be a big deal. And, of course, if you change the values in our initial table, the optimal pitch strategy changes.
Other scenarios
Actually, that last issue is sort of the point, right? What if I change the table to this:
Batter Expects | Lidge Throws Fastball | Lidge Throws Slider |
Fastball | -3 | 3 |
Slider | 3 | -3 |
This is essentially giving a situation in which the two pitches are exactly equal in quality.
If you do that, here's the distribution:
This indicates that if both pitches are of equal value (in best- and worst- case scenario conditions), that he should throw each exactly 50% of the time. Nice check, right? And yes, in this case, pitch values in the optimal strategy are exactly equal.
Let's do one more case, returning to Tim Wakefield and his 72 mph fastball:
Batter Expects | Wakefield Throws Fastball | Wakefield Throws Knuckleball |
Fastball | -5 | 2 |
Knuckleball | -2 | -1 |
The way I've set this up is that the fastball is ALWAYS a poorer pitch than the knuckleball (this may or may not be true, but it makes for an interesting example, so bear with me). Even if a hitter is guessing knuckleball in this case, the fastball is so bad that you STILL get a better outcome by throwing a knuckleball instead of a fastball. The fastball's just that easy to hit. The value of the two pitches can NEVER be equal. Based on this, I expected that it would always be correct to throw the knuckleball, because you wouldn't want to expose your horrible "batting practice" style fastball to the hitter.
Well, let's plug it into the model:
This really surprised me. While the model certainly doesn't think that Wakefield should throw many fastballs, you still get a better overall outcome by throwing them ~15%-20% of the time! The reason? The increase in value to your knuckleball offsets the damage done by exposing your horrible fastball. At 15% fastballs/85% knuckleballs, the value of the fastball is -2.45 units, while the value of the knuckleball is -0.55 units. Definitely not equal. But this is where you get the best overall payoff.
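Both alternative tables above can be run through the same brute-force sweep. The sketch below is a generalized version under the same linear assumption; the names `linear_value` and `best_mix` and the argument order are hypothetical, and it sweeps in 1% steps rather than the 5% steps used in the spreadsheet.

```python
def linear_value(freq, guessed, fooled):
    """Value of a pitch thrown with frequency `freq`, sliding linearly from
    `fooled` (never thrown) to `guessed` (always thrown). ASSUMPTION: linearity."""
    return fooled + (guessed - fooled) * freq


def best_mix(fb_guessed, fb_fooled, other_guessed, other_fooled, step_pct=1):
    """Brute-force sweep over fastball percentages.

    Returns (best fastball %, total value per 100 pitches at that %).
    """
    best = None
    for pct in range(0, 101, step_pct):
        f = pct / 100.0
        total = (pct * linear_value(f, fb_guessed, fb_fooled)
                 + (100 - pct) * linear_value(1 - f, other_guessed, other_fooled))
        if best is None or total > best[1]:
            best = (pct, total)
    return best


# Equal-quality pitches: best mix is 50% fastballs, total value 0.
print(best_mix(-3, 3, -3, 3))

# Wakefield: fastball is -5 when guessed, -2 when not; knuckleball is -1 when guessed, +2 when not.
# The sweep peaks at 17% fastballs, at roughly -83.3 per 100 pitches,
# compared with -100 for throwing nothing but knuckleballs.
print(best_mix(-5, -2, -1, 2))
```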
Where do we go from here?
This is obviously not a very sophisticated model. It's basically a classroom exercise, designed primarily just to help students understand MGL's article. But based on the above work, I'll put forth the following conclusions (acknowledging that they are contingent on all kinds of assumptions that may break down upon closer examination):
- The fact that pitch value declines the more often you throw a pitch is key to understanding optimal pitch selection strategy. This, I think, was MGL's main point, and it absolutely stands.
- Pitch run values reported at FanGraphs are highly context dependent, and probably should not be used to judge the actual "quality" of a pitch. At least, not in an absolute "scouting" sense, without knowledge of pitch frequency and other pitches.
- It may not be the case that your best outcome occurs when pitch values are exactly equal. It will probably be close to this point, but it may be that you will adjust so that good pitches are thrown a bit less often than the "all pitches equal" point to bump up their value. This may explain why, for many pitchers, their "best pitches" (according to scouts) do often tend to have slightly positive run values while poorer pitches have lower run values. Maybe.
- Even if your best pitch's worst case scenario (hitter guesses it) is better than the best-case scenario of your secondary pitch (hitter guesses wrong), it still is probably worth it to mix in the occasional bad pitch to keep hitters from sitting on your best pitch too much. Even Mariano Rivera mixes in the occasional four-seamer (there's misclassification at that link, but see his horizontal vs. vertical movement plot--still some four-seam fastballs in the upper-left quadrant).
Can this model be made more realistic? Probably. We could certainly extend it to more than two pitch types. And we could try to get the inputs such that it is actually predictive. But I'm not sure how one would do that, as I'm relying on data that I don't think are measurable from game data. You could probably assume optimality on the part of the players and try to "work backwards" to get decent input values for pitches. But even then I'm not sure it would work. As a test of logic, though, I think this simple little model works pretty well. I'll be happy to hear your feedback and would welcome attempts to further this line of work, as I may not have time.
Also, if you're interested in playing around with it, here is the spreadsheet. Please just drop me a line and let me know if you use it to do something interesting. :)
Update: Based on discussion with David and Matt below, it looks like I missed a key point in not allowing batter and pitcher optima to vary independently--I more or less assumed that the batter would just latch onto whatever the pitcher was doing. That's ok, as the main points of this article hold. I think it still should be a nice way to help folks understand the relationship between pitch frequency and pitch value, and how that plays into pitch selection. But it is the case that conclusion #3 does not hold, because the way I was calculating optimum pitch strategy is flawed. Thanks to those guys for helping explain this stuff, as I understand it better now than when I started! :)
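For readers curious what letting the batter and pitcher optimize independently might look like, here is a minimal sketch of the textbook solution to a 2x2 zero-sum game, applied to the original Lidge table. This is my own illustration of the standard construction, not necessarily what David and Matt proposed, and the function name `solve_2x2` and its argument order are hypothetical.

```python
def solve_2x2(a, b, c, d):
    """Pitcher's optimal mix in a 2x2 zero-sum game (payoffs are to the pitcher).

        a: batter expects pitch 1, pitcher throws pitch 1
        b: batter expects pitch 1, pitcher throws pitch 2
        c: batter expects pitch 2, pitcher throws pitch 1
        d: batter expects pitch 2, pitcher throws pitch 2

    Returns (probability of throwing pitch 1, value of the game to the pitcher).
    """
    # Check pure strategies first. The pitcher maximizes his worst case over
    # pitches; the batter minimizes the pitcher's best case over guesses.
    pitch1_worst, pitch2_worst = min(a, c), min(b, d)
    maximin = max(pitch1_worst, pitch2_worst)
    minimax = min(max(a, b), max(c, d))
    if maximin == minimax:
        # Saddle point: a single pitch is optimal and no mixing is needed.
        return (1.0 if pitch1_worst >= pitch2_worst else 0.0), float(maximin)
    # Otherwise mix so the batter gains nothing by changing his guess:
    #   p*a + (1-p)*b == p*c + (1-p)*d
    p = (d - b) / (a - b - c + d)
    return p, p * a + (1 - p) * b


# Lidge table: pitch 1 = fastball, pitch 2 = slider.
print(solve_2x2(-3, 3, 1.5, -1.5))  # (0.5, 0.0)
```

The design choice here is that the batter's guessing rate is its own variable rather than being tied to the pitcher's frequency, which is the piece the sweep above left out.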