Sometimes, sabermetrics is complicated. There's a good reason for that: baseball is complicated. How could we possibly expect to measure player performance and talent without complicated statistics and analysis?
On the other hand, baseball is simple. It is made up of discrete plays. There are only eight different base states. There are only three different out states. To win you have to score runs, and to score runs, you have to not make outs. Simple.
Because baseball is, in some ways, simple, baseball statistics can also be, in some ways, simple. Sometimes the most interesting statistics are the ones that don't require complex formulas or advanced statistical techniques.
One aspect of baseball that I think deserves both complex and simple analysis is starting pitching. On the one hand, we have FIP and xFIP and SIERA and WAR and what-have-you. These statistics tell us fantastic and interesting things about a pitcher's performance, and what we can expect from him going forward.
On the other hand, we have innings pitched, runs allowed, and (team) wins. These may not be the best predictors of future performance, and they may not be the best ways to measure past individual performance, but they are, in the end, all that really matters for a starting pitcher.
So I'm going to look at starts today. It won't be groundbreaking research. It will just be some graphs and words about the very core aspects of a start.
Let's start with distributions. The word may carry a connotation of fancy stats, but a distribution really just measures how often things happen. I think this is interesting. I'm not quite sure why, but something about seeing how often things happen on a league-wide basis intrigues me.
First, here's what the distribution of runs allowed per start looks like since 2000:
Interestingly, you can see that the most common number of runs allowed in a start is two. And since this is based on all starts since 2000, it is probably even more common in recent seasons. At first, I found this somewhat surprising, but since many of these starts went five or six innings, and since there are almost as many 3-run starts, I probably should have expected it.
Of course, there are some major flaws in the above graph. Namely, it doesn't differentiate between a 5-inning, 3-run start and a 9-inning, 3-run start. Before we try to fix that, let's look at the distribution of innings pitched per start for reference (I now realize that these labels are confusing; "5-6" means 5 innings and zero, one, or two outs, but 6 innings is binned in "6-7"):
As I expected, the most common length of a start is six innings, followed by seven innings and five innings. But again, we want to combine the above two graphs in order to look at the distribution of actual run prevention in starts.
To do so, let's look at RA9, or runs allowed per nine innings, per start, since 2000:
The results look a little strange here, but that's primarily due to the limited number of possible RA9 values, rounding, and the craziness of RA9 numbers when there are very few innings pitched. Clearly, even though the above graph addresses the problem of innings pitched, it doesn't give us a great idea of the distribution of the actual run prevention of each start.
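To make the "craziness" concrete, here is a minimal sketch of single-start RA9 in Python. The function name `ra9` and the choice to count innings as outs recorded are my own assumptions for illustration, not something taken from the Retrosheet data.

```python
def ra9(runs: int, outs: int) -> float:
    """Runs allowed per nine innings for one start.

    Innings are passed as outs recorded (an assumption made here to
    avoid the ambiguity of "6.1 innings"), so RA9 = 9 * R / (outs / 3).
    """
    if outs == 0:
        # RA9 is undefined when the starter records no outs.
        raise ValueError("RA9 is undefined for a start with zero outs")
    return 27 * runs / outs


# A normal start versus two very short ones with few runs allowed.
for runs, outs in [(3, 18), (3, 1), (1, 2)]:
    print(f"{runs} R in {outs} outs -> RA9 of {ra9(runs, outs):.1f}")
```

Three runs in six innings is an RA9 of 4.5, but the same three runs while recording a single out is an RA9 of 81.0, which is why the short starts stretch the graph so badly.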
If we can't use runs allowed or runs allowed per nine innings, how could we measure start effectiveness with regard to run prevention? Well, one option is a variation on Game Score. Game Score, as you may know, rates each start based on a number of criteria, including strikeouts, walks, runs, hits, etc.
However, since we only care about runs, we can adjust Game Score to only take runs and innings as input. Luckily, Tom Tango did just that a few years ago, using this formula:
RA9 Game Score = (6.4 * IP) - (10 * R) + 40
This is a nifty little metric, because it allows us to properly account for both innings pitched and runs allowed in order to evaluate starts, without weird values at the high extreme. Let's look at the distribution of RA9 Game Score since 2000:
Now doesn't that look nice! With that, we can see how often various types of pitching performances occur. For example, the most common RA9 Game Score is between 50 and 60, which is about 7 innings pitched and 3 runs allowed, or 6 innings pitched and 2 runs allowed.
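If you want to check those two examples yourself, the formula is easy to write down in Python. This is just a sketch: the function name is mine, and I pass innings as outs recorded (an assumption) so that partial innings are unambiguous.

```python
def ra9_game_score(outs: int, runs: int) -> float:
    """RA9 Game Score = 6.4 * IP - 10 * R + 40, with IP given as outs / 3."""
    innings = outs / 3
    return 6.4 * innings - 10 * runs + 40


# The two starts mentioned above: 7 IP / 3 R and 6 IP / 2 R.
for label, outs, runs in [("7 IP, 3 R", 21, 3), ("6 IP, 2 R", 18, 2)]:
    print(f"{label}: {ra9_game_score(outs, runs):.1f}")
```

Those come out to 54.8 and 58.4, both inside the 50-to-60 bin.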
But we don't just care about distributions; we care about winning! Well, there's a pretty simple way to visualize winning – and keep in mind I mean team winning, not pitcher winning – with this data. In fact, it's so simple, I don't think I even need to explain it!
Told you it was simple! Again, however, we see the same issues with not taking into account innings pitched. Let's do that. But first, for reference, here is the team winning percentage based on number of innings pitched by the starter (rounded down):
I find that jump between four and five innings interesting. Does that mean the pitcher win is actually on to something with its seemingly arbitrary five-inning minimum? Or is it just a matter of sample size? I'll leave that for another post.
Now, let's skip RA9 and just jump straight to RA9 Game Score to see the winning percentage based on the effectiveness of various starts.
As expected, we get a nice curve, reaching a perfect team win-loss record when the game score is over 100. Notice that according to this graph, a starter needs to have an RA9 Game Score of at least 50 or so in order to ensure a greater chance of his team winning than losing.
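For anyone who wants to build a curve like this from their own game logs, here is a rough sketch of the bookkeeping in Python. Everything about the input is an assumption on my part: I pretend each start is a `(game_score, team_won)` pair, the function name `win_pct_by_bin` is hypothetical, and the sample rows are made up purely to show the mechanics (they are not Retrosheet numbers).

```python
import math
from collections import defaultdict


def win_pct_by_bin(starts, width=10):
    """Group starts into score bins of the given width and return
    {bin_lower_edge: team winning percentage} in ascending bin order."""
    tallies = defaultdict(lambda: [0, 0])  # bin -> [team wins, starts]
    for score, team_won in starts:
        lower_edge = math.floor(score / width) * width
        tallies[lower_edge][0] += int(team_won)
        tallies[lower_edge][1] += 1
    return {edge: wins / games for edge, (wins, games) in sorted(tallies.items())}


# Made-up rows for illustration only.
sample_starts = [
    (58.4, True), (54.8, False), (52.8, True),
    (31.2, False), (97.6, True), (-3.5, False),
]

for edge, pct in win_pct_by_bin(sample_starts).items():
    print(f"{edge:>4} to {edge + 10:>4}: {pct:.3f}")
```

Using `math.floor` rather than integer truncation keeps negative game scores in the right bin, which matters for the really ugly starts.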
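Based on the formula above, and some trial and error, we can determine that if a pitcher gives up three runs, he must pitch at least 6.1 innings (six innings plus one out) in order to reach an RA9 Game Score of 50: six innings and three runs works out to 48.4, while 6.1 innings and three runs works out to about 50.5.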
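The trial and error can be automated. Here is a small sketch in Python that walks up one out at a time until the formula reaches the target. The helper names, the 27-out cap (a regulation nine innings), and the use of "at least 50" as the cutoff are my own assumptions for the example.

```python
def ra9_game_score(outs: int, runs: int) -> float:
    """RA9 Game Score = 6.4 * IP - 10 * R + 40, with IP given as outs / 3."""
    return 6.4 * outs / 3 - 10 * runs + 40


def min_outs_for_score(runs: int, target: float = 50.0, max_outs: int = 27):
    """Fewest outs at which a start with `runs` allowed reaches `target`,
    or None if it cannot be reached within `max_outs`."""
    for outs in range(max_outs + 1):
        if ra9_game_score(outs, runs) >= target:
            return outs
    return None


def as_innings(outs: int) -> str:
    """Format outs in box-score notation, e.g. 19 outs -> '6.1'."""
    return f"{outs // 3}.{outs % 3}"


for runs in range(6):
    outs = min_outs_for_score(runs)
    needed = as_innings(outs) if outs is not None else "more than 9.0"
    print(f"{runs} R: needs {needed} IP")
```

For three runs this lands on 19 outs, or 6.1 innings, and for four runs it lands on a full eight innings.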
Interestingly, that almost exactly matches the minimum threshold for a quality start. However, it's also worth noting that pitching four or five (or even two) scoreless innings leads to an RA9 Game Score of greater than 50. On the other side of the coin, pitching eight innings and giving up four runs also leads to a >50 RA9 Game Score, contrary to the definition of a quality start.
Perhaps, then, it would be more accurate to have a more fluid threshold for innings and runs in the quality start statistic. A threshold, perhaps, like having an RA9 Game Score of greater than 50!
We can adjust the RA9 Game Score formula a bit, such that a positive number is a "quality start" and a negative number isn't. When we do that, we get this:
6.4*IP - 10*R - 10
That's pretty simple, right? If that number is positive, we give the pitcher credit for giving his team a >50% chance of winning. If it's negative, we don't. And since we probably don't want to say that two scoreless innings is a quality start, we can make the minimum five innings pitched, since that's where the winning percentage jumped in the chart above.
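Put together, the adjusted formula and the five-inning minimum amount to a two-line check. Here is a sketch in Python; the function names are hypothetical, innings are passed as outs recorded (my assumption), and I treat a score of exactly zero as not positive.

```python
def adjusted_score(outs: int, runs: int) -> float:
    """The adjusted formula: 6.4 * IP - 10 * R - 10, with IP given as outs / 3."""
    return 6.4 * outs / 3 - 10 * runs - 10


def is_quality_start(outs: int, runs: int, min_outs: int = 15) -> bool:
    """True when the start lasts at least five innings (15 outs)
    and the adjusted score is positive."""
    return outs >= min_outs and adjusted_score(outs, runs) > 0


examples = [
    ("2 IP, 0 R", 6, 0),    # positive score, but too short
    ("6 IP, 3 R", 18, 3),   # a traditional quality start
    ("6.1 IP, 3 R", 19, 3),
    ("8 IP, 4 R", 24, 4),   # not a traditional quality start
]
for label, outs, runs in examples:
    print(f"{label}: {is_quality_start(outs, runs)}")
```

Note that six innings and three runs, the bare minimum for a traditional quality start, falls just short here, while eight innings and four runs qualifies.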
EDIT: A little late, but I thought I'd insert this little grid that I made, which shows the RA9 Game Score for each IP/RA combination. Anything in red is a "poor" start, and anything in green is a "quality" start. Keep in mind that these are RA9 Game Score values, and not values based on the adjusted formula above.
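A text version of a grid like this is easy to generate. The sketch below is in Python and makes a few assumptions of its own: whole innings from 1 to 9, runs from 0 to 6, and a `+` or `-` marker standing in for green and red, with "greater than 50" as the cutoff. The function names are hypothetical.

```python
def ra9_game_score(innings: float, runs: int) -> float:
    """RA9 Game Score = 6.4 * IP - 10 * R + 40."""
    return 6.4 * innings - 10 * runs + 40


def score_grid(max_innings: int = 9, max_runs: int = 6) -> dict:
    """Map each (innings, runs) pair to its RA9 Game Score."""
    return {
        (ip, r): ra9_game_score(ip, r)
        for ip in range(1, max_innings + 1)
        for r in range(max_runs + 1)
    }


def render(grid: dict, max_innings: int = 9, max_runs: int = 6,
           threshold: float = 50.0) -> str:
    """Render the grid as text; '+' marks scores above the threshold."""
    header = "IP/R " + " ".join(f"{r:>7}" for r in range(max_runs + 1))
    rows = [header]
    for ip in range(1, max_innings + 1):
        cells = []
        for r in range(max_runs + 1):
            score = grid[(ip, r)]
            mark = "+" if score > threshold else "-"
            cells.append(f"{score:6.1f}{mark}")
        rows.append(f"{ip:>4} " + " ".join(cells))
    return "\n".join(rows)


print(render(score_grid()))
```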
See, I told you sabermetrics can be simple! Sure, some might not call this sabermetrics. All I really did, after all, was just make some uber-simple graphs based on games started. But Bill James said that sabermetrics is "the search for objective knowledge about baseball." I searched, and I acquired knowledge in the process. Sabermetrics!
In the future, I hope to do some more actual analysis of games started, using real pitchers and real teams, in order to determine how things like variance and run support affect starts and the probability of winning. But that's a job for another day, preferably not a Friday.
All start statistics are taken from Retrosheet. RA9 Game Score formula is from Tom Tango at FanGraphs.