Has anyone seen the BP breakdown of the Ibanez deal?
I'm thinking that Sheehan realized that the Davenport Translations aren't very accurate in measuring fielders, because he just went ahead and cited Dewan's +/- system. So, for fun, I checked out BP's own fielding numbers for Ibanez... and to my shock, he graded out as an above average fielder in 2008.
Now, I have a great deal of affection for BP - they got me interested in the power of statistical analysis when I bought their annual publication in 2005. At this point though, I have to agree with some of the recent discussion and beg them to update their metrics.
Comments
Their Davenport translations work best when you use them to compare historical players, not current ones. Still, they know the limitations of their system and that’s why the writers almost never use their defensive metrics in their articles, usually resorting to Dewan’s +/-.
Webmaster of Driveline Mechanics
http://www.drivelinemechanics.com - An Unconventional Look at Scouting
how do they work better for historical players?
If they don’t work for current players, they don’t work period. If we accept that this system is so bad that it is not useful at all for rating modern players, and a system that rates Ibanez as being “above average” would seem to be THAT bad, then it has no analytical value. Using this system to try to rate historical players will only give you similarly incorrect answers, with the caveat that there is no way to demonstrate that the answers are incorrect.
they can't play baseball, they don't wear sweaters, they're not good dancers, they don't play drums
I meant that they are useful only in a historical context – i.e. using them to compare older players to themselves.
Webmaster of Driveline Mechanics
http://www.drivelinemechanics.com - An Unconventional Look at Scouting
Is your point that DTs are better than nothing and therefore have usefulness when they're the only option?
If so, maybe. But they’re just as accurate in the past as they are today.
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on Dec 14, 2008 12:16 PM EST
And it only gives false certainty if you let it.
No defensive metric really gives me any total certainty.
Of course.
But ever since UZR went up on Fangraphs we’ve been seeing people argue, amongst other things, that left field is harder to play in Safeco than centre because Balentien rated better in CF than LF.
I’m not talking about people who understand how volatile defensive metrics are, I’m talking about the misuse of statistics by folks who don’t understand their limitations.
It's a question of sampling.
The larger the sample, the more reliability you get out of it. There are problems with FRAA, but it’s not some Pancake Flop sort of thing where you’re just chucking in numbers that you think sound good. There are better ways to skin the cat, even using official fielding statistics, but it’s not absolutely useless. No reason to use it when you have TotalZone/OPA/ZR/UZR/PMR etc available, but that’s not the point under discussion.
No, a single-season FRAA score doesn’t tell you much. A single season of UZR tells you more. You ideally want to use multiple seasons of all of them.
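To put rough numbers on the sampling point, here is a minimal sketch. It assumes every fielding chance is an independent coin flip with a fixed out probability, which is a simplification and not how FRAA or UZR are actually built. The 0.70 conversion rate and the 300 chances per season are made-up illustration values.

```python
import math

def out_rate_standard_error(p, chances):
    """Standard error of an observed out rate under a simple binomial model."""
    return math.sqrt(p * (1.0 - p) / chances)

P_OUT = 0.70              # hypothetical league-average conversion rate
CHANCES_PER_SEASON = 300  # hypothetical chances for a full-time outfielder

for seasons in (1, 2, 3, 4):
    n = seasons * CHANCES_PER_SEASON
    se = out_rate_standard_error(P_OUT, n)
    # Express the uncertainty as plays per season so the rows are comparable.
    print(f"{seasons} season(s), {n} chances: "
          f"+/- {se * CHANCES_PER_SEASON:.1f} plays per season (1 SE)")
```

Under this toy model the noise shrinks with the square root of the sample, so it takes four seasons to cut the uncertainty of one season in half.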
"but it’s not some Pancake Flop sort of thing where you’re just chucking in numbers that you think sound good."
I actually disagree with this, just from a mathematical standpoint. As far as I’m aware, FRAA is not a PbP metric (correct me if I’m wrong, of course). Non-PbP defensive metrics can’t give you good results because you end up looking at a binary output (plays made/not made) without any sort of context. Trying to extract information out of that system isn’t going to give you understanding, it’s going to give you noise.
I’d suggest reading about systems like DRA. (I’m not sure that DRA is better than FRAA, although I suspect it is, but I do know the published explanations are much more detailed.)
These fielding evaluation systems are not as good as ones based on PBP data – and never will be. But they are, in my estimation, better than nothing.
Thanks for the link, skimmed through it so may be missing a little...
It’s an interesting idea, but regressing the traditional stats to runs across the whole league doesn’t seem fine-grained enough to look any deeper than roughly the team level, and obviously all of the problems with traditional stats then apply.
Philosophically, I just don’t like the misuse of statistics in support of bad arguments (see: the Mariners roster construction over the past few years). I guess I shouldn’t take that out on the stats themselves, but I do.
An Issue
Essentially every defensive metric has its flaws, and very few show a consistent year to year correlation for a player. That being said, it is possible that Ibanez was ‘above average’ this season. The other metrics may disagree, which suggests that the BP system may be flawed; however, it isn’t evidence that the other metrics are 100% accurate.
Good pick up though.
very few show a consistent year to year correlation for a player.
Isn’t it possible that players aren’t consistent year to year?
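One way to frame that question: a low year to year correlation can come from real changes in skill or from noise in the measurement, and the correlation by itself can't tell the two apart. A rough sketch, assuming true skill stays constant and each season adds independent noise (the function name and the run values are invented for illustration):

```python
def expected_year_to_year_r(skill_sd, noise_sd):
    """Expected correlation between two seasons of a metric, assuming
    constant true skill plus independent noise in each season."""
    skill_var = skill_sd ** 2
    noise_var = noise_sd ** 2
    return skill_var / (skill_var + noise_var)

SKILL_SD = 5.0  # hypothetical spread of true fielding skill, in runs

for noise_sd in (0.0, 2.5, 5.0, 10.0):
    r = expected_year_to_year_r(SKILL_SD, noise_sd)
    print(f"noise SD {noise_sd:>4} runs -> expected year-to-year r = {r:.2f}")
```

So even if no player changed at all, a noisy enough metric would still show a weak year to year correlation. Players who really are inconsistent would push it lower still.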
This play alone
probably shoulda scored a -20 on any system.

i've seen that
it’s pathetic…
how much worse is that play than a third baseman rushing a throw and sailing the ball 5 or 6 rows deep in the stands? or a right fielder missing his cutoff man by 30 feet?
i’m not defending ibanez here, he is terrible. and consistently terrible. what i’m saying is that defensive measures aren’t 100% accurate.
well
I’m a complete amateur who sucked in Little League and in college, and I’ve never made a single throw THIS bad.
The difference between an outfield throw and an infield one is that an outfielder is usually just aiming at a general area instead of a precise target. They usually don’t throw after making a diving stop or something like that.
by RollingWave on Dec 13, 2008 10:32 PM EST
That play has nothing to do with range...
No, that’s not a serious comment.
I hear a lot that left field in Safeco is a difficult position to play? Does that just mean that there’s a ton of space, so a faster, better OF would track down more balls? Or is there possibly any reason that Ibanez might be grading out worse than he actually is (like with the Green Monster)?
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on Dec 14, 2008 12:18 PM EST
It means that there's a tonne of space, more or less, and there tend to be more opportunities
There’s nothing particularly fancy about playing there, and we definitely have had competent outfielders in left in the past. Randy Winn comes to mind.
The problem with Raul is how stiff he seems. There’s nothing fluid about the way he plays. Awkward first step. Snaking, jerky, slow route play, a habit of diving/stabbing at balls, taking bad risks, etc etc etc. Even his highlight catches look amazingly awkward.
Like Graham said, it is plainly obvious that Ibanez does not take anywhere close to the correct route to fly balls/line drives and occasionally grounders (as documented on Lookout Landing).
Webmaster of Driveline Mechanics
http://www.drivelinemechanics.com - An Unconventional Look at Scouting
It certainly is possible.
As MGL likes to say, it’s impossible to be 100% certain that Barry Bonds is a better hitter than Ben Sheets using only sample data. There is a possibility that Ibanez was an average fielder in 2008. You’d have to think it’s unlikely, though.
Ibanez is simply anecdotal evidence of the flaws with FRAA/FRAR. I’m sure if I wanted to spend the time I could find similar anecdotal evidence of flaws with UZR/Fielding Bible/etc. The difference is that we know that FRAA/FRAR is flawed in several ways that UZR/Fielding Bible/etc isn’t; once we have UZR/Fielding Bible/etc we don’t need FRAR.
There is no way anyone watching could come close to thinking Raul was an average fielder last year
Any metric that has him as such is completely broken.
You cannot, cannot, cannot, cannot, cannot...
…cannot, cannot, cannot, cannot, cannot, cannot, CAN NOT
CAN NOT
prove the validity of a metric (or the lack thereof) based upon observation of one player in a single season!
Again: CAN NOT.
If you could, we wouldn’t need metrics to begin with!
This is the same stupid argument that leads Boswell to dismiss VORP because there’s no way Ryan Howard can be that bad, or people to say that UZR is obviously wrong because Ichiro is t3h uber, or that tRA is flawed because there’s “no effing way” that Javier Vazquez outpitched Jon Lester. Wrong, wrong, wrong, wrong, wrong with a side of wrong, covered in wrong and served with a nice tall glass of iced wrong.
Because you absolutely cannot evaluate the validity of a metric based upon observation of one player in a single season.
by cwyers on Dec 13, 2008 4:43 PM EST
It's observation of the last few years, not just this year.
If you sat in Safeco field for one game this year, there was no doubt in your mind that Raul was below average. If you sat out there for 65-70 like I did, and you had to watch him start every game in LF, you would’ve wanted to shoot yourself. I know I did. Any metric that called Raul Ibanez above average in LF last year is wrong, plain and simple.
Yesterday's Pants
A blog-thingy about the Mariners and stuff.
by BrettJMiller on Dec 13, 2008 7:35 PM EST
Though he was like +2 in 2006, I think...no one was really complaining about him then, either.
We got to watch his decline from averagish to god-awful during his time in Seattle.
Yesterday's Pants
A blog-thingy about the Mariners and stuff.
by BrettJMiller on Dec 13, 2008 7:35 PM EST
As T.S. Eliot once wrote...
…“The last temptation is the greatest treason
to do the right thing, for the wrong reason.”
I’m not trying to get into an argument over the specifics of either Raul Ibanez’s defense or BP’s fielding stats – I think they’re both bad. That’s completely irrelevant to my point.
The point is that a metric needs to be judged on its merits – as in, its inputs, its methodology, and systematic testing of its results. Many Yankees fans will swear up and down that Jeter deserved every one of his Gold Gloves – does that make defensive metrics in general wrong?
If we do not leave some room for our metrics to disagree with us – if we presume that they MUST confirm our preconceptions – then we don’t need metrics. It’s as simple as that. Now, we can use our preconceptions as a sanity check, to see if there are areas to explore or test.
But the claim “X metric is obviously wrong because it disagrees with me about a player on my favorite team” is both wrong and pervasive. Sure, in this case it’s a petty crime – the process may be wrong but the results are correct, so what’s the harm, right?
But it’s a bad argument used to reach a lot of bad and harmful conclusions that are a detriment to the advancement and popularization of a better understanding of baseball.
by cwyers on Dec 13, 2008 9:58 PM EST
so we shouldn't use backward thinking?
Adopted Giant: Aaron King
Wearing the crown by 2011. Or at least the LOOGY hat
Going back to an earlier point:
As MGL likes to say, it’s impossible to be 100% certain that Barry Bonds is a better hitter than Ben Sheets using only sample data.
Which is why anyone who only uses pure data in player evaluation is being totally irresponsible.
BP’s defensive metrics aren’t wrong because Raul was somehow average last year – they’re wrong because they’ve never made any sense to begin with. And there is no chance at all that Ibanez was an average defender last year.
Dr. Johnson kicked a rock to refute Bishop Berkeley, and Raul Ibanez booted a dribbler to refute FRAA.
by oldjacket on Dec 14, 2008 12:07 PM EST
tremendous
OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG
by Matt Klaassen on Dec 14, 2008 11:21 PM EST
UZR
has questionable inputs, questionable methodology and is not systematically tested as far as I know.
Your phraseology – that it must confirm preconceptions – could also be stated as: it is clearly wrong based on observation. Saying that UZR clearly fails and not knowing why or how often isn’t a crime, petty or otherwise.
I think an honest discussion of the flaws of a metric like UZR would be useful to a better understanding of baseball but it would be a detriment to those who want to toss it into a stew of estimations and approximations to reach absolute conclusions like “player X is worth X wins.”
No one is claiming these conclusions are absolute and most people will readily acknowledge the shortcomings of advanced defensive metrics.
by Aaron Campeau on Dec 20, 2008 12:47 AM EST
Questionable methodology?
Care to elaborate?
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on Dec 20, 2008 11:52 AM EST
Measuring performance as if they are isolated individual actions.
Assigning arbitrary zones to players regardless of the position of the player.
Using raw data of unknown quality that is part of a sales product. Remarkably the data from the two main sources often disagrees though it is supposed to be collected in the same neutral process.
You're conflating UZR and ZR on at least one point.
Players are not assigned zones in UZR – instead, their performance in each zone is compared to the performance of their peers at that position in each zone.
To clarify...
ZR = big zones, one per position. all plays in zone count as a whole play made or not made.
UZR = lots of little zones. all plays everywhere count as fractions of wins or losses, based on the probability MLB fielders at the player’s position turn that type of play (location, trajectory, handedness, park, etc) into an out or not.
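A toy sketch of the difference between the two bookkeeping styles. This is not the actual ZR or UZR implementation: the `Play` record, the function names, and the out probabilities are all made up for illustration, and real UZR also converts plays to runs and applies the adjustments mentioned above.

```python
from collections import namedtuple

# One batted ball: whether it landed in the fielder's big ZR zone, the
# (hypothetical) league out probability for that type of ball, and whether
# this fielder turned it into an out.
Play = namedtuple("Play", ["in_zr_zone", "league_out_prob", "made"])

def zone_rating(plays):
    """ZR-style: share of in-zone balls turned into outs, each a whole play."""
    in_zone = [p for p in plays if p.in_zr_zone]
    if not in_zone:
        return 0.0
    return sum(1 for p in in_zone if p.made) / len(in_zone)

def plays_above_average(plays):
    """UZR-style: fractional credit or debit on every ball, measured
    against the league out probability for that type of ball."""
    return sum((1.0 if p.made else 0.0) - p.league_out_prob for p in plays)

plays = [
    Play(True, 0.95, True),    # routine ball in the zone, caught
    Play(True, 0.60, False),   # makeable ball in the zone, missed
    Play(True, 0.30, True),    # tough ball in the zone, caught
    Play(False, 0.10, True),   # out-of-zone ball, caught anyway
    Play(False, 0.05, False),  # out-of-zone ball, not caught
]

print(f"ZR-style rating: {zone_rating(plays):.3f}")
print(f"UZR-style plays above average: {plays_above_average(plays):+.2f}")
```

The ZR-style number treats the routine catch and the tough catch the same and ignores the out-of-zone catch entirely, while the UZR-style number gives a lot of credit for the hard plays and very little for the easy one.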
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
nope
but it doesn’t make it any less hilarious
(sorry M’s fans)
Adopted Giant: Aaron King
Wearing the crown by 2011. Or at least the LOOGY hat
Was it? I thought something like that would only be funny in hindsight or from an outsider’s POV.
Adopted Giant: Aaron King
Wearing the crown by 2011. Or at least the LOOGY hat
I think you underestimate the love for shenanigans that the fans have here.
Webmaster of Driveline Mechanics
http://www.drivelinemechanics.com - An Unconventional Look at Scouting
Felix's reaction has to be the best part of that video.
Every day I hear about Seattle sports' failures. Every night I fall asleep to the sound of my own tears.
