For a while now, I've had an idea in the back of my mind about evaluating pitchers. You all know about the classic divide between FIP-based WAR and RA9-based WAR; the former assumes that all non-HR balls in play are out of the pitcher's control, while the latter basically assumes that they are all entirely under the pitcher's control.
Both systems know that neither one is completely accurate - pitchers have some amount of control over balls in play, but so do fielders. As Tom Tango eloquently puts it:
The truth, like all things, is somewhere in the middle. And the problem is that not only do we not know where that middle is, but the middle changes based on whether you are looking at things at the PA-level, the game-level, the season-level, and the career-level.
That last thought is key here: the middle changes depending on what you're looking at. On an inning or plate-appearance basis, we're going to assign a different level of credit for balls in play than we would at the career level.
The reason for this doesn't lie in the nature of batted balls themselves, but in statistics, and our ability to, in Nate Silver's words, find a signal in all the noise. If we just look at the numbers, how are we supposed to know how much credit or blame to give a pitcher for balls in play on an individual game basis? Sure, over a career, the randomness will generally be overshadowed by talent, but on a smaller sample, there's just no way to know...if we just look at the numbers.
But, as certain critics of sabermetrics would say, "the game is played on the field, not a spreadsheet." When we say that a pitcher isn't responsible for balls in play, we're basing that on the numbers, but those numbers come from real events that actually happened. And in each of those events, there is an answer as to the "responsibility" of each player involved.
We probably can't ever truly determine that answer. But if we watch how each play unfolds, can't we get a little closer? Wouldn't we have a better idea of how to assign responsibility to the pitcher and the fielders if we actually watched the play?
This is where you come in. This is not a task for one person. But if we use the shared judgments of the crowd, we may be able to somewhat accurately measure a pitcher's responsibility for the outcomes of balls in play.
Of course, no one could possibly remember every play in a single game, let alone a season's worth. If we're going to assign responsibility on a play-by-play basis, we have to come up with some way to quantify and keep track of our results. While we want to include as many variables as possible, we also want to keep the task simple, given the sheer volume of plays that occur every single day.
Here's my thought, and I'd like feedback from you in the comments: to make it simple, let's take both the batter and luck out of the equation. Let's assume that the credit or blame for each outcome must be entirely divided between the pitcher and the fielder(s), say on a scale from 0-100%.
If that is the case, then all we need to do is assign this percent responsibility to the pitcher for every play, and then the fielder will get the remaining responsibility. Theoretically, we can then collect all the results and determine that middle ground mentioned previously between FIP and RA9. But I'll leave that for another day.
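To make the bookkeeping concrete, here is a minimal sketch in Python of how the split might be recorded for a single play. Everything in it is an assumption for illustration: the function name `split_responsibility`, the play labels, and the ratings themselves are made up, and averaging the crowd's ratings is just one possible way to combine them (a median or a trimmed mean could work too).

```python
from statistics import mean


def split_responsibility(pitcher_ratings):
    """Combine crowd ratings (0-100) for one play into a pitcher/fielder split.

    Each rating is one person's answer for the pitcher's share of the
    outcome. The fielder(s) get whatever is left over, so the two shares
    always add up to 100.
    """
    if not pitcher_ratings:
        raise ValueError("need at least one rating for the play")
    for rating in pitcher_ratings:
        if not 0 <= rating <= 100:
            raise ValueError(f"rating out of range: {rating}")

    # Assumption: a simple average stands in for the crowd's judgment.
    pitcher_share = mean(pitcher_ratings)
    fielder_share = 100 - pitcher_share
    return pitcher_share, fielder_share


# Hypothetical ratings from three raters for two made-up plays.
crowd_ratings = {
    "play_a": [20, 30, 25],
    "play_b": [70, 80, 90],
}

for play, ratings in crowd_ratings.items():
    pitcher, fielder = split_responsibility(ratings)
    print(f"{play}: pitcher {pitcher:.0f}%, fielder(s) {fielder:.0f}%")
```

Because the fielder's share is always computed as the remainder, the only number anyone has to enter per play is the pitcher's percentage, which keeps the rating task as light as possible.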
For now, I want to look at some plays from the Phillies-Reds game last night and think about how we would assign this responsibility. First of all, I want to make clear the question that we are answering:
For the play in question, assuming all responsibility must be divided between the pitcher and the defense, how much credit or blame should the pitcher receive for the end result of the play?
The following is entirely an experiment to see how difficult it would actually be to answer this question, and how we would go about doing so. This is what I really want your help on. What variables should I be considering? Should I have phrased the question another way? Am I assigning credit and blame incorrectly?
Let's jump in.
This isn't the best play that Ben Revere made last night, but it's still quite a good play. On the one hand, Cliff Lee gave up a long fly ball to the warning track, one that I'm sure he was not happy about, and Revere made a fantastic play on it. On the other hand, it looks like Revere had some time to catch the ball, and though he certainly made a great play, I would think it could have been made by other center fielders as well.
Still, this likely would have been a hit with an average center fielder playing, and Lee certainly didn't do a good job here. And remember, we're assigning responsibility for the outcome, so since the outcome was an out, the number below represents how much credit Lee should get for the out itself.
I included this play to juxtapose with the one above. Here, Lee gives up a similar batted ball to deep right field, but this time the play was easily made. The ball seemed to be hit a bit higher and not quite as hard, as well as closer to the fielder. While Lee likely wasn't happy about the batted ball, almost every right fielder would have caught this ball, so we'll give "credit" to Lee (though maybe it should be blame since a run scored).
These next two plays also go together. In the play above, Brandon Phillips hits a hard ground ball through the hole. It's not a line drive, but it's still well hit. On the other hand, if it had been hit a few feet to the left, it would have been an easy double play. This kind of play is tough, and is precisely why we need to assign responsibility on a scale, rather than on an all-or-nothing basis.
Alright, last one. Here's another ground ball, but this time it turned into a double play. Of course, this ball was hit much softer than the one above, but it still could have turned into a run had it been hit in a different place. That being said, Cliff Lee needed exactly a double play ground ball, and he got it. The difference between how hard the ball in Play Three was hit and how hard this one was hit is significant, and it earns the pitcher much more credit here.
Like I said before, this was more of a brainstorming exercise than anything else. The idea I've proposed - that we can evaluate a pitcher's responsibility for balls in play based on crowdsourcing - isn't one that could be implemented easily. But in lieu of HITf/x and FIELDf/x data, this may be the best way to do so.
Here's what I want you to answer, if you are so inclined:
1. Is my idea convincing to you? Do you think crowdsourcing responsibility for BIP would increase the accuracy of our pitcher evaluations?
2. How would you rate the plays above using the question I proposed? Are your answers significantly different from mine? If so, why? What did I miss?