So back in May, I started a series of articles on adjusting a pitcher's runs allowed for the quality of the offenses he faces. Later that week, Bryan Grosnick and Blake Murphy discussed the article in the Shameless Plugs segment of our Beyond the Box Score podcast. The full discussion occurs around the 20-minute mark of the podcast, but I want to focus on a comment that Blake made toward the end of the segment.
"Hopefully the next step for him is to do this on a component basis instead of just saying this team is really good at scoring runs or this team really is bad at scoring runs."
So that's where I've decided to go with this. In a sense, we'll look at adjusting the events that a pitcher can control: walks, hit batters, home runs, and strikeouts. It happens that these statistics make up the FanGraphs FIP calculation, so in addition to the adjusted components, we can get an Opponent Adjusted FIP and WAR.
Why Do We Need To Adjust For Opponent?
We all know that sports and life (or, more accurately, scheduling) aren't fair. In college football, discussions of quality of opponent can help determine which teams make the National Championship Game. In the NFL, fans were furious when a 7-9 Seattle Seahawks team made the playoffs by winning a weak division. And in baseball, we'll hear complaints about how the unbalanced schedule affects the wild card race.
It seems intuitively clear that we need to take into account who pitchers have faced during the season. A pitcher who faces a team of Miguel Cabrera or Joey Votto clones would fare much worse than if they faced a team of Alcides Escobar or Adeiny Hechavarria clones. It's tempting to say that all this averages itself out over time, and over an extended period of time, it might. But over a relatively brief season in which a pitcher may face 1,000 batters, differences still exist.
Now, raw FIP doesn't take this into account, and xFIP doesn't really account for individual batters either. And while the pitching WAR that is calculated from FIP does take into account league run averages and park factors, it doesn't account for the individual lineups faced. So, in a sense, pitching WAR is the wins above a replacement player, given that the replacement player faced the same schedule as the pitcher in question. But if we want to compare pitchers by WAR, we first need to adjust for their competition.
That's what we'll attempt to do below. We'll assign a certain amount of blame to the pitcher based on the event and the opposition faced. Essentially, we'll get down to seeing how many walks, strikeouts, home runs, etc. the pitcher really allowed when the blame is taken into account. Then from this we'll calculate oaFIP and oaWAR (Opponent Adjusted FIP and WAR) in a manner very similar to the current method.
How Do We Assign Blame?
This would be the "gory math" section. Feel free to skip to the results if you are so inclined.
So, in order to assign the "blame" to a pitcher or hitter, we first need to lay some groundwork. For the sake of argument, we'll look at strikeouts only, but this exact hierarchy is applicable to walks, hit batters, and home runs. Now, for each pitcher-batter combination, the number of strikeouts in n at bats can be viewed as following a binomial distribution. However, we are trying to determine whether the pitcher or batter should receive credit for the strikeout, so the pitcher and the batter each have a binomial distribution governed by their own strikeout rate. So, in a sense, it is a mixture distribution of the form
P(K) = π Bin(n, θPitcher) + (1-π) Bin(n, θBatter)
where π is the probability that the blame goes to the pitcher, and θ is the strikeout rate for the pitcher or batter.
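As a rough illustration of the mixture above, here is a minimal Python sketch that evaluates P(K = k) for a single pitcher-batter matchup. The function names and the example rates (a 25% strikeout pitcher, a 15% strikeout batter, 10 at bats) are made up for illustration and are not taken from the data.

```python
from math import comb


def binom_pmf(k, n, theta):
    """Probability of exactly k strikeouts in n at bats at strikeout rate theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def mixture_pmf(k, n, theta_pitcher, theta_batter, pi=0.5):
    """Two-component mixture: weight pi on the pitcher's binomial,
    weight 1 - pi on the batter's binomial."""
    return (pi * binom_pmf(k, n, theta_pitcher)
            + (1 - pi) * binom_pmf(k, n, theta_batter))


# Hypothetical matchup: pitcher K rate 0.25, batter K rate 0.15, 10 at bats.
probs = [mixture_pmf(k, 10, 0.25, 0.15) for k in range(11)]
```

Because each component is a proper binomial distribution and the weights sum to 1, the entries of `probs` sum to 1 as well.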
So, we need to estimate π, the probability of blame going to the pitcher, for every pitcher-batter combination that a pitcher faces. In order to do this, I decided to go the Bayesian route and create a full hierarchical model and run the MCMC associated with this. I won't explain the full thought process behind this, although I can recommend a few texts for interested people (Here for a simpler first introduction, and here for a more math-heavy look).
The full hierarchical model is given below, with the explanation of parameters given at the end.
Ki | γi ∼ γi Bin(ni, θPitcher) + (1-γi) Bin(ni, θBatter,i)
γi | π ∼ Bin(1, π)
θPitcher | αP, βP ∼ Β(αP, βP)
θBatter,i | αB,i, βB,i ∼ Β(αB,i, βB,i)
Generally, we set π = 0.5, meaning that a priori (before seeing the data), we don't have any information about whether the pitcher or batter receives credit for the strikeout. We also need to set the α and β parameters for the pitcher and each batter. For this, we set α = K% × 20 and β = (1 - K%) × 20, where K% is the strikeout rate of the pitcher or batter (depending on which α and β we are calculating). This ensures that the expected value of θ equals the K% a priori, while simultaneously limiting the variability of θ.
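To see what this prior choice does, the sketch below computes α and β from a strikeout rate and then the mean and standard deviation of the resulting Beta distribution, using the standard Beta moment formulas. The function names and the 20% example rate are illustrative assumptions.

```python
def beta_prior(k_rate, prior_weight=20):
    """Return (alpha, beta) with alpha = K% * 20 and beta = (1 - K%) * 20."""
    return k_rate * prior_weight, (1 - k_rate) * prior_weight


def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total**2 * (total + 1))
    return mean, variance**0.5


# Hypothetical player with a 20% strikeout rate.
alpha, beta = beta_prior(0.20)
mean, sd = beta_mean_sd(alpha, beta)
```

For this example the prior is Beta(4, 16), with mean 0.20 and a standard deviation of about 0.087. Since α + β is always 20, the prior carries roughly the weight of 20 at bats of information.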
In order to estimate all this, we run a Gibbs sampler on the hierarchical model, and we estimate πi* as the proportion of draws where γi = 1. From this, we adjust the strikeouts allowed by taking Ki/(πi*/0.5), which gives our KOA values.
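The article does not spell out the sampler itself, so the following is only one plausible sketch of a Gibbs sampler for the model above, for a single pitcher facing several batters. It alternates between drawing each γi, the pitcher's θ, and each batter's θ from their conditional distributions. The function names, the number of draws, the burn-in length, and the example data are all assumptions for illustration.

```python
import random


def gibbs_blame(ks, ns, pitcher_prior, batter_priors, pi=0.5,
                draws=2000, burn_in=500, seed=0):
    """Estimate pi_i* as the share of retained draws in which gamma_i = 1.

    ks, ns: strikeouts and at bats against each batter i.
    pitcher_prior: (alpha, beta) for the pitcher.
    batter_priors: one (alpha, beta) pair per batter.
    """
    rng = random.Random(seed)
    m = len(ks)
    # Start every rate at its prior mean.
    theta_p = pitcher_prior[0] / (pitcher_prior[0] + pitcher_prior[1])
    theta_b = [a / (a + b) for a, b in batter_priors]
    gamma_counts = [0] * m

    for step in range(burn_in + draws):
        # 1. Draw each gamma_i given the current rates. The binomial
        #    coefficient cancels, so only the kernel terms are needed.
        #    (Very large n would call for log-space weights instead.)
        gammas = []
        for i in range(m):
            k, n = ks[i], ns[i]
            w_p = pi * theta_p**k * (1 - theta_p) ** (n - k)
            w_b = (1 - pi) * theta_b[i] ** k * (1 - theta_b[i]) ** (n - k)
            gammas.append(1 if rng.random() < w_p / (w_p + w_b) else 0)

        # 2. Draw the pitcher's rate, updated only by matchups with gamma_i = 1.
        a = pitcher_prior[0] + sum(g * k for g, k in zip(gammas, ks))
        b = pitcher_prior[1] + sum(g * (n - k) for g, k, n in zip(gammas, ks, ns))
        theta_p = rng.betavariate(a, b)

        # 3. Draw each batter's rate, updated only when gamma_i = 0.
        for i, (a0, b0) in enumerate(batter_priors):
            own = 1 - gammas[i]
            theta_b[i] = rng.betavariate(a0 + own * ks[i],
                                         b0 + own * (ns[i] - ks[i]))

        # Keep the draw only after the burn-in period.
        if step >= burn_in:
            for i, g in enumerate(gammas):
                gamma_counts[i] += g

    return [count / draws for count in gamma_counts]


def adjust(k, pi_star, pi=0.5):
    """Adjusted count K_i / (pi_i* / 0.5); undefined when pi_star is 0."""
    return k / (pi_star / pi)


# Hypothetical pitcher (25% K rate) against three hypothetical batters.
pi_stars = gibbs_blame(
    ks=[3, 1, 5], ns=[10, 12, 11],
    pitcher_prior=(5.0, 15.0),
    batter_priors=[(4.0, 16.0), (3.0, 17.0), (6.0, 14.0)],
)
```

As a quick arithmetic check of the adjustment, a matchup with 10 strikeouts and πi* = 0.6 gives 10 / (0.6 / 0.5) ≈ 8.33, while πi* = 0.5 leaves the count unchanged.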
Finally, to convert these components to FIP and WAR, all we need is the FIP constant. This constant differs from the traditional FIP constant, which was 3.048 for 2013. However, it is calculated in exactly the same way: as the difference between league ERA and our calculated league value. Here, our constant comes out to 3.972, and oaFIP and oaWAR are then calculated exactly as their traditional counterparts are.
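For reference, the conversion can be sketched with the standard FanGraphs FIP formula, (13 × HR + 3 × (BB + HBP) − 2 × K) / IP plus a constant, applied to the adjusted components. The function names below are hypothetical, and innings pitched are assumed to be true decimals (for example, 200.333 rather than the box-score style 200.1).

```python
def fip_core(hr, bb, hbp, k, ip):
    """Component part of FIP, before the constant is added."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """League ERA minus the league-wide component value."""
    return lg_era - fip_core(lg_hr, lg_bb, lg_hbp, lg_k, lg_ip)


def oa_fip(hr_oa, bb_oa, hbp_oa, k_oa, ip, constant):
    """Opponent-adjusted FIP from opponent-adjusted component counts."""
    return fip_core(hr_oa, bb_oa, hbp_oa, k_oa, ip) + constant
```

When the league totals passed to `fip_constant` are themselves the adjusted components, the league-wide oaFIP equals league ERA by construction, which is why the adjusted constant (3.972 here) differs from the traditional one.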
Okay, we're out of the "gory math."
So, now we want to look at the leaders in oaFIP for 2013, along with their oaWAR. Without further ado, here is the 2013 leaderboard (for pitchers with 150 IP or more).
|Jorge de la Rosa||3.90||2.1|
So we see some differences between the FanGraphs leaderboard and the one above. But the gain or loss from adjusting for opponent follows a nice distribution, with roughly equal numbers of pitchers on either side of 0.
The average difference between the two is 0.010, with the largest difference being 0.47. The pitcher who gained the most from the adjustment was Dan Haren, while the greatest loss fell to C.J. Wilson. And as with traditional FIP, Matt Harvey's season leads the way as the best in the majors.
So, this is one way of adjusting for the ability of a pitcher's opponents. While it might not perfectly illuminate or eliminate the effect of opponent, this technique can help push toward a statistic that allows for complete comparison across league and opponent.
. . .