Bruce Chen -- "Infield Fly Master" - Jeff Curry-US PRESSWIRE
While a lot has been made over the eventual inclusion of IFFB in ERA estimators, it is important to first learn how we can adjust this volatile stat by park factors -- giving us a better look at the dimensions in play for the pitcher.
With the recent murmurs that IFFB should be included in ERA estimators because its run value is similar to that of strikeouts, many, including myself, have found deficiencies in the notion of comparing K% to IFFB%.
For one, IFFB% is heavily affected by foul area dimensions and catcher mobility behind the plate. Meanwhile, K% is mainly in the hands of the pitcher and can be viewed as a repeatable year-to-year skill that is not as dependent on park factors.
Not to mention, the year-to-year correlation for IFFB% is weak, with an R^2 of .19 and an r of .40. Given that statistical noise, we can assume that when a pitcher moves from his home park to an away park, his IFFB% would be affected by the move far more than his ability to strike batters out. If we are to use IFFB% in ERA estimators, it would be wise to consider park factors and league averages so that we get better insight into the pitcher's skill to "pop batters up".
Consider this: when we look at FIP, we often use it in both a descriptive and a predictive way.
In a descriptive frame of mind, including IFFB% makes sense, as we are broadening the coverage of skills that the pitcher exhibited in a given season. In a predictive frame of mind, however, the year-to-year noise means that including IFFB% will at least harm FIP's predictability, and perhaps mar its accuracy.
One misconception about IFFB% is that it is the percentage of all batted balls that are infield flies; it is actually the percentage of fly balls that are infield flies. Simply multiplying IFFB% by FB% instantly creates a better metric, more than doubling the year-to-year R^2 from 19% to 44%, a substantial improvement that was pointed out in this post by Steve Staude. IFFB%*FB% yields the number of infield fly balls hit per ball in play.
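To make the multiplication concrete, here is a minimal Python sketch. The function name and the input rates are hypothetical, chosen only for illustration, and are not taken from any real pitcher's line.

```python
def infield_flies_per_ball_in_play(iffb_pct: float, fb_pct: float) -> float:
    """Combine the two rates, both given as fractions between 0 and 1.

    iffb_pct: infield flies per fly ball (what IFFB% actually measures)
    fb_pct:   fly balls per ball in play (FB%)
    """
    return iffb_pct * fb_pct


# Hypothetical pitcher: 12% of his fly balls are pop-ups,
# and 40% of his balls in play are fly balls.
rate = infield_flies_per_ball_in_play(0.12, 0.40)
print(f"{rate:.1%}")  # prints 4.8%
```

In other words, this made-up pitcher gets a pop-up on a little under 5% of his balls in play, even though his IFFB% reads 12%.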
Given this information, let's move on to creating a comprehensive IFFB+ and what we will call IPO+ (Infield Pop-Outs), which is the statistic we will create from FB%*IFFB%.
Creation of IFFB+ and IPO+
To begin, we will take all pitcher seasons from 2002 to 2012 with at least 150 IP and at least 2 seasons with the pitcher's current team, so as to localize variations from park to park while creating the statistic:
(IFFB% / 10.4%)*100 = IFFB+
We calculate the IFFB+ stat just like OBP+: the pitcher's rate divided by the league-average rate, times 100. To account for park factors we do something similar, dividing IFFB+ by the park factor:
((IFFB+) / IFFBpf)*100
(Note: IFFBpf is the park factor)
Now we take the unadjusted IFFB+ and the park factor proportion and average the two. Instantly, we have a park-factored IFFB+:
((IFFB+) + (IFFB+pf)) / 2
(Note: IFFB+pf is the park factor proportion from the previous step)
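The IFFB+ steps can be sketched in a few lines of Python. This is only an illustration of the arithmetic described here: the function names are hypothetical, and the 13.0% pitcher is made up. The 10.4% baseline and the 92 park factor are figures quoted in this article.

```python
LEAGUE_IFFB_PCT = 10.4  # baseline used in the IFFB+ formula, in percent


def iffb_plus(iffb_pct: float) -> float:
    """Unadjusted IFFB+: the pitcher's IFFB% relative to the baseline, times 100."""
    return iffb_pct / LEAGUE_IFFB_PCT * 100


def park_factored(plus_stat: float, park_factor: float) -> float:
    """Average the unadjusted stat with its park factor proportion."""
    proportion = plus_stat / park_factor * 100
    return (plus_stat + proportion) / 2


# Hypothetical pitcher with a 13.0% IFFB% in a park with a factor of 92.
raw = iffb_plus(13.0)
adjusted = park_factored(raw, 92)
print(round(raw, 2), round(adjusted, 2))
```

Because the park factor is below 100, the adjusted figure comes out higher than the raw one; a factor above 100 would pull it down, and a factor of exactly 100 leaves it unchanged.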
Meanwhile we will create IPO+ in the same fashion -- however, we will use IFFB%*FB% instead of IFFB%.
((IFFB%*FB%) / 3.7%)*100 = IPO+
The result is unadjusted IPO+, "infield pop up plus". Next we adjust for park factors:
((IPO+) / (IFFBpf))*100
Lastly, we average the unadjusted IPO+ and the park factor proportion:
((IPO+) + (IPO+pf)) / 2
(Note: IPO+pf is the park factor proportion from the previous step)
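Here is a rough end-to-end sketch of the IPO+ calculation in Python. The pitchers, their rates, and all names in the code are hypothetical. Only the 3.7% baseline and the idea of averaging the raw figure with its park factor proportion come from the steps above.

```python
LEAGUE_IPO_PCT = 3.7  # baseline used in the IPO+ formula, in percent


def ipo_plus(iffb_pct: float, fb_pct: float) -> float:
    """Unadjusted IPO+ from IFFB% and FB%, both on a 0-100 scale."""
    ipo_pct = iffb_pct * fb_pct / 100  # pop-ups per 100 balls in play
    return ipo_pct / LEAGUE_IPO_PCT * 100


def park_adjusted_ipo_plus(iffb_pct: float, fb_pct: float, park_factor: float) -> float:
    """Average the unadjusted IPO+ with its park factor proportion."""
    raw = ipo_plus(iffb_pct, fb_pct)
    proportion = raw / park_factor * 100
    return (raw + proportion) / 2


# Hypothetical pitchers: (name, IFFB%, FB%, IFFB park factor)
pitchers = [
    ("Pitcher A", 12.0, 40.0, 108),  # pop-up friendly park
    ("Pitcher B", 12.0, 40.0, 92),   # pop-up suppressing park
]

for name, iffb, fb, pf in pitchers:
    print(name, round(ipo_plus(iffb, fb), 1), round(park_adjusted_ipo_plus(iffb, fb, pf), 1))
```

Both made-up pitchers have the same raw IPO+, but the one in the pop-up suppressing park ends up with the higher adjusted score.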
Now we have the method to calculate IPO+ and IFFB+ with park factors.
Here are the 2012 leaders for IPO+ with the corresponding IFFB+:
Here we see that Bruce Chen had the best IPO+ last year despite Kauffman Stadium being one of the worst IFFB parks, with a factor of 92. IPO+ recognizes that Chen did so in one of the hardest environments in the league and gives him a slight park adjustment boost.
Interestingly enough, Porcello was No. 2 in IFFB+ but 22nd in IPO+ given the 103 park factor at Comerica.
Also, check out R.A. Dickey: given Citi Field's 108 park factor, Dickey's score was adjusted to account for the high propensity of pop flies in New York. The Blue Jays had a park factor of 97 last year for IFFB, so it will be interesting to see how his totals react north of the border.
Now let us look at the top IPO+ seasons since 2002 in our data set:
|1||2004||Tim Wakefield||Red Sox||211.07||213.21||212.14||67.54||237.35||236.17|
|11||2003||Tim Wakefield||Red Sox||174.45||176.21||175.33||85.20||193.21||192.24|
|22||2005||Bronson Arroyo||Red Sox||134.93||137.69||136.31||82.10||168.34||166.66|
|23||2005||Tim Wakefield||Red Sox||141.68||144.57||143.12||114.27||167.84||166.16|
|25||2011||John Lackey||Red Sox||159.03||155.91||157.47||126.61||163.51||165.15|
No surprise here that Wakefield is at the top. But Bruce Chen? Chen had a terrific IPO season last year that most overlooked.
IPO+ showed a high year-to-year correlation, with an R^2 of 43%, while IFFB% and IFFB+ had an R^2 of 19%. Meanwhile, IPO+ was a better predictor of FIP in Year 2, with a p-value of 0.0328.
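For readers who want to run this kind of year-to-year check themselves, here is a minimal Python sketch. It assumes the comparison is a plain Pearson correlation between a pitcher's Year 1 and Year 2 values, squared to get R^2. The paired values below are invented for illustration and are not from the data set used here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented paired seasons: each position is one pitcher's stat in Year 1 and Year 2.
year1 = [130.0, 95.0, 110.0, 80.0, 150.0, 102.0]
year2 = [118.0, 101.0, 97.0, 90.0, 135.0, 111.0]

r = pearson_r(year1, year2)
print(f"r = {r:.2f}, R^2 = {r * r:.2f}")
```

The higher the R^2, the more of next year's variation in the stat is explained by this year's value, which is the sense in which one metric is called more repeatable than another.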
So, in conclusion, it is better to use IPO+, or IFFB%*FB% in general, if we want to consider the predictive power of IFFBs in ERA estimators.
We use peripherals not only as descriptive tools of how well a pitcher pitched independent of factors he could not control, but also to look to the future and predict success. In any case, it would seem foolish to mar the predictive power of ERA estimators without first adjusting IFFBs for the park factors that affect their variability.
So what do the BtBS readers think about including IFFB in ERA estimators?
Should we include an IFFB metric that is adjusted for park factors and league averages, like IPO+?
Or should we include the raw data so as to map how the pitcher actually performed?
You can contact Max Weinstein @MaxWeinstein21 on Twitter