
Quantifying the Impact of Defensive Uncertainty

Recently there has been a lot of discussion in the sabermetric community about fielding stats and their inclusion in WAR (see, for example, this thread, or this one at The Book blog), given the uncertainty in the underlying data (batted ball type, hit location, etc.). With that in mind, I thought it would be an interesting exercise to see how applying uncertainty to the defensive runs above average (DRAA) numbers affects the 2009 fWAR leaderboard. My method is pretty simple: I ran a Monte Carlo simulation in which each player's simulated DRAA is drawn from a normal distribution with a mean equal to the DRAA reported by Fangraphs and a standard deviation of 5 runs. The following table shows how often each of the top 10 players in fWAR fell into each of the top 10 slots over 10,000 runs of the simulation.

Player / rank       1    2    3    4    5    6    7    8    9   10
Albert Pujols     62%  23%   9%   4%   1%   0%   0%   0%   0%   0%
Ben Zobrist       22%  36%  22%  11%   5%   2%   1%   1%   0%   0%
Joe Mauer         12%  24%  29%  17%   9%   5%   2%   1%   0%   0%
Chase Utley        3%   8%  16%  23%  19%  13%   9%   5%   3%   1%
Derek Jeter        1%   4%  10%  16%  18%  17%  13%   9%   5%   3%
Hanley Ramirez     0%   2%   7%  12%  16%  18%  16%  11%   8%   5%
Evan Longoria      0%   2%   4%  10%  14%  17%  17%  14%   9%   6%
Prince Fielder     0%   0%   2%   5%   8%  12%  15%  16%  15%  10%
Ryan Zimmerman     0%   0%   1%   2%   4%   7%  11%  15%  16%  15%
Adrian Gonzalez    0%   0%   0%   1%   3%   6%   8%  13%  16%  15%
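
For anyone who wants to try this at home, the simulation described above can be sketched roughly as follows. This is a minimal sketch, not the exact code behind the table: the function name `rank_probabilities`, the `base_war` input (WAR with the fielding runs taken out), and the 10 runs-per-win conversion are all assumptions for illustration. It also ranks only the players you pass in, so to reproduce a leaderboard table you would feed it the whole league and read off the top rows.

```python
import random

def rank_probabilities(base_war, draa, sd_runs=5.0, n_sims=10_000,
                       runs_per_win=10.0, seed=0):
    """Estimate how often each player lands in each rank when DRAA is noisy.

    base_war: WAR excluding fielding, one value per player (assumed input).
    draa:     reported defensive runs above average, one value per player.
    Returns a matrix where probs[p][r] is the share of simulations in
    which player p finished at rank r (0 = first).
    """
    rng = random.Random(seed)
    n = len(draa)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_sims):
        # Draw each player's DRAA from Normal(reported DRAA, sd_runs),
        # then convert runs to wins and add to the non-fielding WAR.
        sim_war = [
            base + rng.gauss(d, sd_runs) / runs_per_win
            for base, d in zip(base_war, draa)
        ]
        # Player indices sorted from highest to lowest simulated WAR.
        order = sorted(range(n), key=lambda p: sim_war[p], reverse=True)
        for rank, player in enumerate(order):
            counts[player][rank] += 1
    return [[c / n_sims for c in row] for row in counts]

# Made-up inputs for three hypothetical players, NOT actual 2009 values.
probs = rank_probabilities(base_war=[7.0, 6.5, 6.0], draa=[5.0, 10.0, -2.0])
```

Changing `sd_runs` is all it takes to rerun the exercise under a different uncertainty assumption.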

So if you buy my 5-run SD assumption, the table above shows the impact on ordinal ranking. Clearly, the impact on overall WAR (and thus $/WAR) isn't captured in this analysis.

This is just a quick look at the subject, but I think there may be more to uncover, such as looking at different fielding metrics in place of UZR. Either way, it answered one of my questions: "What orders of magnitude are we talking about?"

Update: Here's the same table with an SD of 10 runs:

Player / rank       1    2    3    4    5    6    7    8    9   10
Albert Pujols     38%  21%  14%   9%   6%   4%   3%   2%   1%   1%
Ben Zobrist       22%  20%  16%  11%   9%   6%   4%   3%   2%   2%
Joe Mauer         15%  17%  15%  12%  10%   8%   6%   4%   3%   2%
Chase Utley        8%  11%  12%  11%  10%   9%   8%   6%   5%   4%
Derek Jeter        5%   8%  10%  10%  10%   9%   8%   7%   6%   5%
Hanley Ramirez     4%   7%   8%   9%  10%   9%   8%   7%   6%   5%
Evan Longoria      3%   6%   7%   9%   9%   9%   8%   7%   7%   5%
Prince Fielder     2%   4%   5%   7%   7%   8%   8%   7%   7%   6%
Ryan Zimmerman     1%   2%   4%   5%   6%   7%   7%   7%   7%   6%
Adrian Gonzalez    1%   2%   3%   5%   5%   6%   6%   7%   6%   6%


Comments


Neat!

So if we think we’re within 5 runs of reality with UZR, then fielding uncertainty will bump a guy a spot or so up or down at the top of the leaderboard. I hope we’re within 5 runs!

by JinAZ on Jul 22, 2010 8:07 PM EDT

Uncertainty Varies

I think the uncertainty about Pujols’s defense would be much less than, say, Zobrist’s. Pujols only plays one position, gets lots of chances, and it’s relatively easy to determine whether a chance was in zone or not.

by fjm235 on Jul 22, 2010 8:44 PM EDT

don't disagree

I toyed with altering SDs based on various factors, but couldn’t settle in my mind on exactly what I wanted to do.

by stevesommer05 on Jul 22, 2010 8:55 PM EDT

Cool idea

A few thoughts:

 - Colin already pointed out on Twitter that a 5 run SD is probably too small.
 - Is the distribution normal? I wouldn’t be shocked if it were, but then I wouldn’t be shocked if it weren’t either. Either way, it’s not something I’ve ever looked at.

by Dan Turkenkopf on Jul 22, 2010 9:01 PM EDT

Concur on point one. I was WAY guessing. Laziness at its finest. Hopefully I’ll have a table with an SD of 10 here in a few minutes.

On the second, I’m not sure we know how the systemic biases would make the “answer” be off, but I’d guess normal until I had some evidence to the contrary.

by stevesommer05 on Jul 22, 2010 9:05 PM EDT

You should mention that these are 2009 numbers BTW

And really, if you’re assuming UZR error is normally distributed, that won’t really change anything, because the distribution is still centered on each player’s mean UZR grade. Why not try playing around with different kinds of distributions (they would have to be skewed towards league average) or use each player’s regressed UZR as the mean?

by vivaelpujols on Jul 23, 2010 4:33 AM EDT

I have that it’s the 2009 fWAR leaderboard in the first paragraph, but yeah in general it’s probably not called out especially well.

I guess it didn’t matter to me that it wasn’t going to change anything (I assume you mean drastically change the order?). In fact I kinda like that using UZR as the center “maintains” the general order because then any shifting around is due to the uncertainty in the metric not shifting to a different metric. That said, when I referenced other metrics instead of UZR, I was thinking of using my projections (so a version of regressed) as the substitute.

by stevesommer05 on Jul 23, 2010 8:08 AM EDT

Oh crap sorry, I guess I overlooked that sentence

I guess I just think people’s problem with UZR is not that it has a lot of error, but that the error is going to be more present in players with extreme UZR scores and it’s going to be biased to the ends of the curve. I mean let’s say you were to repeat this analysis with every fielder assumed to be zero runs above average. The types of deviations and percentages would be the same as they are now.

I think you’d have to do a distribution in a distribution to get the right effect here. Calculate the spread around each player’s UZR score by standard deviations, then use those in the context of the league average spread by standard deviations.

by vivaelpujols on Jul 24, 2010 4:26 AM EDT
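
One possible reading of the "regressed UZR" and "distribution in a distribution" suggestions in the comments is ordinary normal-normal shrinkage: treat true fielding talent as normally distributed around league average, treat the measured number as talent plus normal error, and center each player's simulated value on the resulting regressed estimate instead of the raw one. The sketch below assumes exactly that model. The function name `regressed_draa` and the talent spread `sd_talent` are placeholders for illustration, not measured values or anything from the original analysis.

```python
import random

def regressed_draa(observed, sd_error, sd_talent, league_mean=0.0):
    """Return (mean, sd) of true DRAA given one observed value.

    Assumes true talent ~ Normal(league_mean, sd_talent) and
    observed = true talent + Normal(0, sd_error). Both SDs are
    illustrative assumptions.
    """
    # Share of the observed deviation from average credited to real talent.
    weight = sd_talent**2 / (sd_talent**2 + sd_error**2)
    mean = league_mean + weight * (observed - league_mean)
    # Remaining uncertainty about true talent after seeing the observation.
    sd = (weight * sd_error**2) ** 0.5
    return mean, sd

# Example: a +20 fielder with a 10-run error SD and a 10-run talent spread.
rng = random.Random(0)
mean, sd = regressed_draa(observed=20.0, sd_error=10.0, sd_talent=10.0)
simulated_draa = rng.gauss(mean, sd)  # one draw for one simulation run
```

Under these assumptions a +20 fielder is pulled 10 runs toward average while a +4 fielder is pulled only 2, so the extreme scores move the most in absolute terms.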
