Hey everyone, I'm a regular poster at the Mets SB Nation site, Amazin' Avenue. I put this little case study together and thought you guys might be interested, since there's some pretty interesting info in here; basically, I developed a stat that I feel is quite useful. So enjoy.
In honor of Oliver Perez week, I figured I would try to wrap up this case study on consistency. For anybody who missed it, I took a shot at quantifying start-to-start consistency for pitchers, or what I'm calling Consistency Factor (CONF), a few weeks back in this post. It was enlightening, but the results just didn't seem quite right. So I've made some tweaks, basically using Bill James' Game Score instead of WPA to evaluate each game started.
A brief refresher for those who missed the original post or forgot how it worked: basically, I evaluated each of a pitcher's starts individually (using Game Score) and then took the Standard Deviation of those 30-something starts. The higher that number, the wider the range a pitcher regularly pitches within, and thus the lower his consistency. So lower = better. However, this number does NOT relate to how effective a pitcher is. A pitcher can have the worst CONF and still be very good; all it means is that he doesn't pitch consistently from start to start.
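For anyone who wants to play along at home, here's a rough sketch of that calculation in Python. The Game Score formula is Bill James' original version; the function and parameter names are my own, and using the sample standard deviation (`statistics.stdev`) rather than the population version is an assumption, so treat this as an illustration rather than the exact spreadsheet behind these numbers.

```python
from statistics import stdev


def game_score(outs, hits, earned_runs, unearned_runs, walks, strikeouts):
    """Bill James' original Game Score for a single start."""
    innings_completed = outs // 3
    # 2 bonus points for each inning completed after the 4th
    late_inning_bonus = 2 * max(0, innings_completed - 4)
    return (
        50                      # every start begins at 50
        + outs                  # +1 per out recorded
        + late_inning_bonus
        + strikeouts            # +1 per strikeout
        - 2 * hits              # -2 per hit allowed
        - 4 * earned_runs       # -4 per earned run
        - 2 * unearned_runs     # -2 per unearned run
        - walks                 # -1 per walk
    )


def consistency_factor(game_scores):
    """CONF: standard deviation of a pitcher's Game Scores (lower = steadier).

    Assumption: sample standard deviation; the population version
    (statistics.pstdev) would give slightly smaller values.
    """
    return stdev(game_scores)


# Hypothetical Game Scores for illustration only, not real 2008 data.
steady_pitcher = [48, 52, 50, 55, 45]
erratic_pitcher = [20, 80, 35, 75, 40]
print(consistency_factor(steady_pitcher) < consistency_factor(erratic_pitcher))
```

Both made-up pitchers average a Game Score of 50, which is the whole point: CONF separates them even though their typical start looks the same.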
I compiled the CONF #'s from 2008 as well as the last 3 seasons totaled for every qualifying NL starter. However, before we dive into all of those #'s, here's a nice visual breakdown of this idea using, who else, Oliver Perez:
So as you can see, each point represents a start measured in Game Score (high = good, low = bad), and the gray area represents 1 Standard Deviation away (in both directions) from Oliver's Mean (which happened to be a Game Score of 51.65). In English: the blue points are each start, the gray represents the average range where he usually pitches, and the red baseline is the level of his average start in 2008.
First of all, Lincecum's Mean, or average start, is obviously higher (Game Score: 62.06); the guy did win the Cy Young. (For reference, every start begins at a Game Score of 50, and a score of about 50 is roughly an average start.) More importantly for us, Lincecum's starts are packed much more densely within his average range than Oliver's. Only a handful of his starts fall outside the gray area, whereas Perez has many more that do. As a result, Lincecum's Average Range is smaller, which represents less variation and higher consistency.
The most important figure we can draw from these graphs is that Standard Deviation I mentioned, the one that makes up the gray Average Range. This is the figure that represents Consistency Factor. For Lincecum, it's 13.18. For Perez, it's 17.08. The league average is around 16.
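To put numbers on the gray Average Range, here's a small sketch that turns a Mean and a CONF into the band's lower and upper edges, plus a helper that counts how many starts land inside it. The Means and CONFs are the ones quoted above; the function names and the five-start sample list are made up for illustration, and the sample standard deviation is again an assumption.

```python
from statistics import mean, stdev


def average_range(avg_game_score, conf):
    """Gray band on the graphs: one standard deviation either side of the mean."""
    return (round(avg_game_score - conf, 2), round(avg_game_score + conf, 2))


def share_inside_range(game_scores):
    """Fraction of starts that land inside the pitcher's own Average Range."""
    avg = mean(game_scores)
    conf = stdev(game_scores)  # assumption: sample standard deviation
    low, high = avg - conf, avg + conf
    inside = sum(1 for score in game_scores if low <= score <= high)
    return inside / len(game_scores)


print(average_range(51.65, 17.08))  # Perez 2008: (34.57, 68.73)
print(average_range(62.06, 13.18))  # Lincecum 2008: (48.88, 75.24)

# Hypothetical five-start sample, not real data: 3 of 5 starts fall inside.
print(share_inside_range([30, 50, 50, 50, 70]))  # 0.6
```

So Perez's usual range spans about 34 points of Game Score while Lincecum's spans about 26, and the bottom of Lincecum's band sits only a few points below Perez's average start.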
So now that we can kind of visualize where these numbers come from, let's look at the results. I grouped all NL starters (who pitched at least half a season in '08) by team, and I've got all the teams here, so let's look at them by division, East first of course:
Well there are a few interesting things we can see here:
On to the NL Central:
And finally the NL West:
And last but not least, here are the rankings of the best and worst performers of 2008:
So I hope this has been enlightening. At the very least, I figured it would break up the interminable dross that we've witnessed in the FanPosts recently. Just as last time, I apologize for the high-level mathematics and such, but that's the breaks.
I set out to disprove the myth of Oliver Perez as "Mr. Inconsistency" and I think I've done that. That title definitely goes to Carlos Zambrano. Perez is definitely on the higher end, but even if I included all of the numbers, he's not as bad as the media portrays. I think a lot of that mindset derives from how inconsistent he can be in-game, or inning to inning, rather than game to game, and that part definitely is no myth. That would make another interesting case study, but that's another story for another day. As far as what this means for his overall performance: not too much, because he's obviously been terrible this year anyway. But at least he's been consistently terrible...