PZR: A replacement for FIP
Ever since Voros McCracken published his original article on DIPS in 1999, sabermetricians have been attempting to isolate pitching from fielding. Today the most widely used metrics are Tom Tango's FIP and Graham MacAree's tRA; FIP focuses solely on strikeouts, walks, and home runs allowed, while tRA also accounts for batted ball type. Yet possibly the best defense-independent value metric is the little-known PZR, or Pitcher Zone Rating.
Skill vs. Value
Before I go any further it is necessary to distinguish between skill and value. Colin Wyers provides possibly the best overview of this issue:
People tend to lean upon defense-independent estimates of pitching performance because they better predict future performance. (And, strictly speaking, they do.) This leads to a lot of fantastic confusion about the issue, with the argument being that if we want to look at past performance, we should ignore defense-independent measures and look at actual results.
This is wrong for the same reason that we look at a pitcher's ERA instead of his win-loss record. A team does not consistently score the same amount of runs every game; thus it is possible for different pitchers, even different pitchers on the same team, to have vastly different amounts of run support. This is not a function of pitching, and the credit or blame for this should not rightly be assigned to the pitcher.
It is the same with defensive support. Two pitchers, even two pitchers on the same team, cannot be presumed to have the same quality of support from their defense. Defense-independent pitching statistics seek to give us a way to compare pitchers with different defensive support fairly.
In other words: Given a league-average defense, what would the pitcher's statistics look like (and by extension, how many runs would he allow, and how many wins would he be worth)? Rather than trying to correct for all the "luck" in a pitcher's line, we are simply trying to isolate pitching from fielding.
Enter PZR
The idea behind PZR (Pitcher Zone Rating) is simple: for each batted ball, the pitcher is credited or debited with a fraction of a run, depending on the ball's location, trajectory, and speed. Essentially, PZR is UZR, but from a pitcher's perspective. As a result, PZR is a poor estimator of a pitcher's true talent level. It doesn't address the fact that pitchers have very little control over the location of a batted ball (apart from pulled, up the middle, or opposite field), nor does it regress line drive rate or HR/FB rate. But PZR is perfect for retrospective valuation. Like linear weights for batters, it evaluates players solely on what actually happened. Unlike FIP or tRA, it attempts to measure a pitcher's value at inducing "fieldable" balls in play. And it is truly "defense independent," as it replaces a pitcher's defense with a hypothetical league-average defense.
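To make the bookkeeping concrete, here is a minimal Python sketch of the per-ball crediting described above. It is only an illustration under stated assumptions, not MGL's or David Pinto's actual method: the bucket names, the league-average out rates, and the run values in LEAGUE_TABLE are all made up, and the function names are hypothetical. The point it shows is that the pitcher is charged the league-average expected result for each bucket, so what his own fielders actually did with the ball never enters the calculation.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical league-average table, keyed by a location/trajectory/speed bucket.
# Each entry is (out_rate, run_value_of_out, run_value_of_non_out).
# All numbers are invented for illustration; they are NOT real league values.
LEAGUE_TABLE: Dict[str, Tuple[float, float, float]] = {
    "weak_grounder_ss": (0.90, -0.28, 0.47),
    "line_drive_gap": (0.20, -0.28, 0.78),
    "routine_fly_cf": (0.95, -0.28, 0.80),
}


def expected_runs(bucket: str, table: Dict[str, Tuple[float, float, float]]) -> float:
    """Expected run value of a ball in this bucket behind a league-average defense."""
    out_rate, out_value, non_out_value = table[bucket]
    return out_rate * out_value + (1.0 - out_rate) * non_out_value


def pzr_runs_saved(buckets: Iterable[str],
                   table: Dict[str, Tuple[float, float, float]] = LEAGUE_TABLE) -> float:
    """Sum the pitcher's credits and debits over his balls in play.

    Positive means runs saved relative to average. The actual outcome of
    each ball (hit or out) is deliberately not an input.
    """
    return -sum(expected_runs(b, table) for b in buckets)


if __name__ == "__main__":
    balls = ["weak_grounder_ss", "line_drive_gap", "routine_fly_cf"]
    print(round(pzr_runs_saved(balls), 3))
```

With these invented numbers, the two easy outs are nearly cancelled by the one line drive into the gap, which is the sense in which PZR rewards inducing "fieldable" contact rather than rewarding the plays that happened to be made.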
PZR is not a new idea. David Pinto came up with it in 2003, and it was discussed extensively at the Book Blog here and here. Also, MGL released PZR for pitchers from 2001-2006 here. However, PZR is not currently available in a daily-updated fashion the way FIP, xFIP and tRA are. The closest metric to PZR is Rally's WAR for pitchers. He uses the same methodology as PZR; however, his fielding statistic, TotalZone, is based on Retrosheet data rather than the BIS data that UZR uses. Furthermore, Rally only updates his WAR at the end of the year.
And so I come to the purpose of this fanpost. If possible, I would appreciate it if someone could post daily-updated PZR to a website. This would allow for the calculation of PZR-based Win Values, similar to the FIP-based Win Values on FanGraphs. Another possibility is to use the HIT f/x data: if it's updated daily, I could calculate PZR based on HIT f/x parameters and Gameday hit locations. This process would be similar to Peter Jensen's skill dependent batting runs, although I would incorporate hit location data and (maybe) an approximation of hang time.
Comments
Interesting
I never knew this actually existed, but it’s a neat concept. Unless I’m missing something though, isn’t it basically the same thing as tRA except on a different scale?
F%#& Billy Beane. Actually... I kinda like Holliday
That's what it sounded like to me.
Rather than batted ball type/strikeout/walk, it’s got the UZR buckets and assigns league average run values for them. Sounds pretty cool, as it actually provides context to the type of batted ball. Guess that would make Kevin Millwood fans happy.
Does it have a component for K’s/BB’s/HR’s? Would that be like tRA’s linear weights values?
Yes, it would also have a component for K/BB/HR
I should have clarified that—the PZR component is only for estimating the fielding-independent outcomes of balls in play.
by Alex Krolewski on Jul 28, 2009 7:25 PM EDT
Sure, if you call tRA "FIP with different buckets", too.
PZR + UZR = linear weights.
You don't want to just use runs, by the way
Fractions of runs and fractions of outs are the best way to go because then all of the R/9 calcs drop out and you don’t have to mess around with adjusting IP separately.
How are you calculating run values? I’m working on something similar with Hit/FX data.
You shouldn't.
What you should do is instead subtract the PZR run values from the actual runs allowed and divide by IP.
I don't see a difference between what you just said and what I suggested
What am I missing?
by Graham MacAree on Jul 28, 2009 9:00 PM EDT
You're looking at it like tRA.
Whereas in implementation it’s exactly like UZR or TotalZone or SZR or whatever you like. Alex mentioned using K, BB and HR values, which you don’t need to do at all in order to make this work. tRA is a product of component ERA logic, which you don’t need when you have actual runs and runs attributable to defense.
As for how you estimate the run values – start with observed RA, adjust and recurse.
Actually I haven't calculated anything
I was just promoting PZR because I don’t think it gets mentioned nearly as often as it should, and because it isn’t available anywhere and it should be.
If you construct a PZR-like stat using HIT f/x data, will you make it available on a daily basis at StatCorner?
by Alex Krolewski on Jul 29, 2009 1:17 AM EDT
Hit F/X data likely won't be available on a regular basis, at least not at first.
There are hit locations in the current Gameday feed that should suffice. The hard part is parsing out responsible fielders (like so).
Hit locations in gameday feed don't work
They are position of the ball as fielded which doesn’t really tell you much
by Graham MacAree on Jul 29, 2009 10:56 AM EDT
Depends on your definition of "work."
And Hit F/X doesn’t give you ANY ball position. It gives you initial speed and direction. What it doesn’t tell you is spin, and without spin, you’re just sort of guessing at where the ball lands. For the purposes of determining landing point, an intelligent use of the Gameday hit location data should be nearly as good as what can be derived from Hit F/X. (Now, there are other benefits to the Hit F/X data, but since none of us have more than a month’s worth the issue is rather moot for the time being.)
Gameday hit location is massively squirrelly on ground balls
I think combining hit fx and gameday is the way to go but you still get a bit lost on caroms off the walls etc
by Graham MacAree on Jul 29, 2009 12:24 PM EDT
This is somewhat off topic, but...
Hit f/x gives us the initial horizontal angle of the ball, and gameday gives us the final horizontal angle…so can’t we calculate the ball’s spin based on the difference between these two angles?
by Alex Krolewski on Jul 29, 2009 1:16 PM EDT
Assuming Gameday is right, maybe.
But the effect of spin only accounts for the direction of the ball until it hits the ground. It’s inordinately complicated. And that only gives you the horizontal spin, which you don’t really need if you have accurate hit location, at least for ground balls.
For PZR, the "responsible fielder" doesn't matter, right?
All we care about is the defense as a whole (was the ball converted into an out or not) rather than whether or not one particular fielder fielded the ball.
by Alex Krolewski on Jul 29, 2009 1:13 PM EDT
Colin, any idea when HIT f/x data will be available on a daily basis?
PZR should be calculated by combining HIT f/x batted ball velocity data with gameday hit location data (or using the BIS data, but it’s not free).
by Alex Krolewski on Jul 29, 2009 2:18 PM EDT
It won't be.
Hit F/X data isn’t generated real-time like Pitch F/X data. Sportsvision is currently working on a business model to sell the most updated data to teams, and then at a later point publicly release the data.
You’re right about the BIS data not being free. This definitely sounds like something for a place like Fangraphs that already has a significant amount of money invested in the BIS data. (This is in fact why I stopped working on my Gameday-based pitching metric – Fangraphs rolled out UZR and it no longer seemed useful.)
Has MGL given Fangraphs his formulas for UZR?
Without these formulas, calculating PZR would mean creating a new fielding stat from the BIS data, but with them, it should be trivial.
by Alex Krolewski on Jul 30, 2009 2:11 PM EDT
In any case, HIT f/x based PZR would be more accurate than BIS-based PZR
(Assuming that the Gameday hit locations aren’t too terribly inaccurate)
by Alex Krolewski on Jul 30, 2009 2:16 PM EDT
Which I don't think is a valid assumption.
BIS and STATS data should both be better than the Gameday data. (The one question nobody seems to be able to answer is whether BIS or STATS is better. This matters if, oh, you want to know how good a fielder Ichiro is.)
That's too bad
That makes using Hit f/x much more difficult in calculating PZR, as you would have to link together the BIS and Hit f/x data.
by Alex Krolewski on Jul 30, 2009 7:26 PM EDT
