
"It's not fair to Coco" - a look at career UZR/150

 

Question: How many games does it take to accurately determine a player's UZR/150 defensive rating (Link to explanation of UZR and UZR/150)?

 

Why I asked the question: In The Hardball Times 2009 Baseball Annual, Coco Crisp was given a single-season defensive grade of F- on an A to F scale. The A to F grades are based on the PZR defensive metric. It seems that he was not the worst possible center fielder and that his ranking was possibly unfair to his true talent. I needed to find out how closely a single season's value corresponds to a player's lifetime value.

Analysis: For the study, I needed to find the amount of playing time at which seasonal UZR/150 becomes reflective of lifetime skill. I used UZR, which correlates closely to PZR, because it was easier for me to obtain and it keeps a consistent defensive metric across my work. I chose to use the data for the 43 center fielders who had a combined 1000 innings played in center field since 2004 (the year that UZR/150 goes back to). This data was obtained from www.FanGraphs.com.

Note on Data: I wished to increase the sample size, but I was only able to copy and paste each player's stats, so I limited the scope of this initial look. Also, the UZR data only goes back to 2004 and is only divided into single seasons. I understand that there are definite issues with the quantity and quality of the data, but I feel I used the best that is available.

For these 43 players, I wanted to compare each player's available lifetime UZR/150 to the UZR/150 the player had in each year. I used standard deviation to measure the difference between the two values. I also split the data by the number of games played in a season. From the research I got the following results:

 

Minimum Games Played   Minimum Innings Played   Standard Deviation
0                      0                        19.98
25                     225                      6.56
50                     450                      5.95
75                     675                      5.81
100                    900                      5.24
125                    1125                     5.36
150                    1350                     6.45
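
One way to read the method behind this table, as a rough sketch: for every season that clears a playing-time cutoff, take the seasonal UZR/150 minus the player's lifetime UZR/150, then measure the spread of those differences. The exact formula used for the table is not spelled out, so the root-mean-square below is an assumption, and the function name is only illustrative. The sample rows are just Coco Crisp's center-field seasons from the table later in this post, so the output will not match the 43-player figures.

```python
import math

# (innings, seasonal UZR/150) for Coco Crisp's CF seasons, copied from the
# table later in this post; the lifetime value is his career UZR/150 through 2008.
CRISP_SEASONS = [
    (269, -1.0), (462, -6.43), (807, 11.04), (79, -28.98),
    (900, 0.0), (1216, 24.42), (886, -13.56),
]
CRISP_LIFETIME = 4.56

def spread_vs_lifetime(seasons, lifetime, min_innings):
    """Root-mean-square gap between seasonal and lifetime UZR/150.

    Assumption: spread is measured around the lifetime value itself,
    which may differ from the formula actually used for the table.
    """
    diffs = [rate - lifetime for innings, rate in seasons if innings >= min_innings]
    if not diffs:
        raise ValueError("no seasons meet the innings cutoff")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

for cutoff in (0, 225, 450, 900):
    print(cutoff, round(spread_vs_lifetime(CRISP_SEASONS, CRISP_LIFETIME, cutoff), 2))
```

Even with one player, raising the cutoff from 0 to 225 innings drops the 79-inning season and shrinks the spread, which is the same pattern the table shows between its first two rows.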

 


So generally, about two-thirds of all player seasons (one standard deviation is the range that roughly 68% of values fall within) will have a UZR/150 within about 6 runs of the player's lifetime value, while 95% will fall within about 12 runs.
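
As a quick check on those percentages, here is a small sketch. It assumes the season-to-lifetime differences are roughly normally distributed, which the data above does not establish, and it uses 6 runs as a rounded standard deviation from the table. The function name is illustrative.

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

SD_RUNS = 6.0  # assumed: rounded seasonal standard deviation from the table above

for k in (1, 2, 3):
    print(f"within {k * SD_RUNS:.0f} runs: {share_within(k):.1%}")
```

Under that assumption the shares come out near 68%, 95%, and 99.7% for 6, 12, and 18 runs.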

I also looked at how a player's lifetime UZR/150 compares to his running UZR/150 at each point in his career. Again I used standard deviation to measure the variation:

 

Minimum Games Played   Minimum Innings Played   Standard Deviation
0                      0                        22.92
50                     450                      3.45
100                    900                      3.07
150                    1350                     2.65
200                    1800                     2.78
250                    2250                     2.74
300                    2700                     2.14
350                    3150                     2.09
400                    3600                     1.88
450                    4050                     1.67
500                    4500                     1.39
550                    4950                     1.18
600                    5400                     1.05
650                    5850                     0.80
700                    6300                     0.60
750                    6750                     0.70

 

As you can see, a player's running UZR/150 typically gets within 2 points of his final UZR/150 after about 400 games, or ~3 years. The reason these numbers show less variation than the yearly data is that, with so few years of data available and some players having only a couple of years in the majors, the final value may be close to the current value.

To make sure the previous data isn't too far off, I took the 24 players that had 5 or more years in center field and compared their UZR average from the first 3 years to their final value. They had a standard deviation of 2.62 runs over 1028 innings, which compares closely to the preceding table.

Next, I needed to find how UZR/150 changed as the players aged. I grouped the players by age and got the following information:

 

Age UZR/150
22 0.02
23 -0.06
24 0.48
25 0.97
26 0.92
27 0.01
28 1.08
29 0.64
30 0.54
31 -0.06
32 0.00
33 -0.62
34 -0.22
35+ -0.8

 

It seems that, defensively, center fielders peak between the ages of 25 and 28 (not sure what they are doing in their age-27 year) and decline to league average by age 31. This is not great information, but there does seem to be an ~2 point swing in UZR/150 over a player's career.

Finally, did Coco deserve better than his F- ranking? To begin with, here is a look at Coco's available defensive stats over the years he has played center field:

 

Season   Team      Position   Innings   Lifetime Games   UZR    UZR/150   Lifetime Total UZR   Lifetime UZR/150
2002     Indians   CF         269       30               -0.2   -1        -0.2                 -1
2003     Indians   CF         462       81               -2.2   -6.43     -2.4                 -4.43
2004     Indians   CF         807       171              6.6    11.04     4.2                  3.69
2005     Indians   CF         79        180              -1.7   -28.98    2.5                  2.09
2006     Red Sox   CF         900       280              0      0         2.5                  1.34
2007     Red Sox   CF         1216      415              22     24.42     24.5                 8.86
2008     Red Sox   CF         886       513              -8.9   -13.56    15.6                 4.56
Totals                        4620                       15.6
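
The rate columns in this table appear to follow from scaling UZR to 150 nine-inning games, i.e. 1,350 innings. That scaling is inferred from the numbers here (it reproduces most rows to within rounding) rather than taken from FanGraphs' definition, so treat the sketch below as an approximation. The function and variable names are illustrative.

```python
INNINGS_PER_150_GAMES = 150 * 9  # assumption: 150 full nine-inning games

def uzr_per_150(uzr, innings):
    """Scale a UZR run total to a 150-game (1,350-inning) rate."""
    return uzr / innings * INNINGS_PER_150_GAMES

# (season, innings, UZR) for Crisp in center field, from the table above
seasons = [
    (2002, 269, -0.2), (2003, 462, -2.2), (2004, 807, 6.6), (2005, 79, -1.7),
    (2006, 900, 0.0), (2007, 1216, 22.0), (2008, 886, -8.9),
]

total_uzr = 0.0
total_innings = 0
for year, innings, uzr in seasons:
    total_uzr += uzr
    total_innings += innings
    seasonal_rate = uzr_per_150(uzr, innings)
    lifetime_rate = uzr_per_150(total_uzr, total_innings)  # career to date
    print(year, round(seasonal_rate, 2), round(lifetime_rate, 2))
```

Run as written, the 2008 line comes out at about -13.56 seasonal and 4.56 lifetime, matching the table. The tiny 2005 sample comes out slightly different from the published -28.98, presumably because the UZR totals shown are rounded to one decimal.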

 

Since the data in the annual was based on just the 2008 season, his score of -13.56 over a partial season (886 innings) corresponds to their grade of F-. The problem is that one year's worth of data is not enough to determine how well a player plays defense over his career. Coco's defensive numbers have been all over the place over his career (much more than the average center fielder's), but his lifetime value of 4.56 for UZR/150 is probably pretty close to his ability (an average center fielder, grade C). In 2008, he was about 18 runs off his lifetime average, which is about 3 standard deviations (18/6) off. Fewer than 5% of player seasons vary that much (if the differences were normally distributed, the figure would be closer to 0.3%), so it is rare but not impossible. In my opinion, Coco's grade of F- is not indicative of his actual ability, and alongside seasonal defensive grades, a lifetime or 3-year grade should also be given for reference.
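
To make the "how unusual was 2008" arithmetic explicit, here is a small sketch. It assumes a seasonal standard deviation of 6 runs (rounded from the first table) and, for the tail probability, a normal distribution, which the post does not verify. The variable names are illustrative.

```python
import math

season_rate = -13.56   # Crisp's 2008 UZR/150
lifetime_rate = 4.56   # his lifetime UZR/150 through 2008
sd_runs = 6.0          # assumed: rounded seasonal SD from the first table

gap = season_rate - lifetime_rate
z = gap / sd_runs  # how many standard deviations 2008 sits from the lifetime rate

# Two-sided tail: chance of landing at least |z| SDs from the lifetime value,
# assuming normally distributed differences.
tail = math.erfc(abs(z) / math.sqrt(2))

print(f"gap = {gap:.2f} runs, z = {z:.2f}, tail = {tail:.2%}")
```

The gap works out to roughly 18 runs, or about 3 standard deviations. Note that the lifetime figure already includes 2008, so the comparison is not fully independent.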

 


Comments


I think the F- for this year is probably unfair

but although I’m not great on the math, having looked at Crisp’s defensive numbers in a number of different systems at the time of the trade, I think it’s fair to say that he’s not obviously Willie Mays out there, either, particularly in the Indians years, where he played about as much LF as he did CF. Age and injuries also can take their toll. The relative lack of data in some of his seasons might also require a greater amount of regression (both to the Fans Scouting Report, which loves him, by the way, and to the mean), which both “helps” him in his down years (2006, 2008) but also brings down his good years (2007).

I agree with those who think that ascertaining the “precise” number of runs Crisp (or any other defender) is above/below average in CF going forward is a bit silly at this point in the development of defensive metrics, but I do think we can have a general idea. I think Crisp is likely above average, but probably not in the Beltran/Gutierrez/Chavez category. Probably more like Ryan Langerhans…. who was on waivers earlier this year.

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 12, 2009 10:40 PM EST reply actions  

Also...

seems like MGL and Tango have written some stuff on fielding aging curves, but I can only find a brief Tango article from THT in 2008 that reads like the beginning of an unfinished series. Anyway, I thought the general take I’d read from MGL elsewhere was, as a rule of thumb, that between ages 24-34, fielders’ “true talent” declines by about 1 run a year.

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 12, 2009 11:30 PM EST reply actions  

Talent might decline at that rate, but with this small sample...

.. there seems to be a learning curve and then a slow decrease in ability, with a change of 2 runs. I would love to see MGL’s article. A ten-run drop-off seems pretty big.

I read Tango’s, but his % change doesn’t work well with UZR/150.

by Jeff Zimmerman on Feb 12, 2009 11:53 PM EST up reply actions  

hmmm

well, here are the charts from MGL, but not anything on methodology or data…

PDF FILE

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 12:39 AM EST up reply actions  

also see the relevant comments

here, starting around Tango’s #320

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 12:40 AM EST up reply actions  

btw, other kickass stuff in the same thread

from MGL and Rally talking about regressing to speed scores for TotalZone and UZR defensive projections…. if the Book Blog, FanGraphs, BtBS, and THT had been around 5-7 years ago, BP would never have been able to make any money, ever

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 12:49 AM EST up reply actions  

Not sure what to take from the aging curves.

It might really be different because the position is CF. It would be nice to see the values spread out for each position. I have it on my ever-growing list to go and look at each position.

Another thing I do differently than Tango and MGL is to look only at players that have a minimum amount of time at the position to get the comparison. They seem to group all players together no matter how little they played the position.

Tango has stated that 3B and 2B play the same, using all games at the positions, but I looked into it further when the Teahen-to-2nd talk started, and there is on average a ~1-2 run drop in UZR/150 when people have moved from 3B to 2B (Link). I used people with at least 100 innings at each position.

Link

by Jeff Zimmerman on Feb 13, 2009 10:55 AM EST up reply actions  

that's not what the grade was

I don’t understand how our letter grades relate to your subject. In the stats intro in the THT Annual, we clearly said that the fielding letter grades represent 2008 stats only. We calculated those grades to help people interpret the RZR stats.

by studes on Feb 13, 2009 7:56 AM EST reply actions  

Let's make it simple

F-. If anyone has ever gotten that grade in any context, you expect that person to be completely unable to perform the task at hand. Completely useless. With that assumption, Coco would be stumbling around in the field, and that is not the case. I just wanted to point out that an average defender (grade C), through normal noise, could vary quite a bit from year to year. A grade of F or A would not be uncommon for this player to get the next year. I just think there needs to be some context to the numbers, especially since most people have a hard time understanding OPS, let alone UZR/150. There needs to be some form of lifetime numbers to put the seasonal numbers in context.

Another major problem, especially for your publication, is that not everyone can look at the data and see where the problem is located (single season data). Here is a discussion where it was brought up, and based on just this value, some people consider “… THT is garbage.” based on just this one value. Here is a link to the original dicussion

by Jeff Zimmerman on Feb 13, 2009 10:31 AM EST up reply actions  

ridiculous

Seriously, can’t anyone read? The intro even says that the grades are based on one year, and distributed evenly from A through F. They’re an interpretation of one year’s RZR.

The intro says exactly what it is. The person who called the grade “garbage” in that thread didn’t even buy the book. I would think the people who bought the book and presented the results have an obligation to explain what they mean.

I’m open to feedback to make our stats more readable, and we’ll relook at ways to help people interpret our fielding stats for next year’s book. But you’ve just made things worse by using our grade as an intro to your post, when we clearly stated what the grade means. You’re more responsible for people thinking our stats are “garbage” than we are. That’s extremely disappointing.

by studes on Feb 13, 2009 11:15 AM EST reply actions  

Thanks for jumping in here, studes

in the parallel discussion at Royals Review, I was trying to explain that the stats were single season.

Love the Annual, overall. (Even if I do prefer UZR to RZR/OOZ, you guys were way ahead of the curve in even having those on your site, and the grades make sense to me as relative, single season evaluations)

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 12:22 PM EST up reply actions  

Studes, sorry you feel that way.

I really wanted to find the variation of defensive metrics from season to season, and the grades at THT were the reason I looked into it. I was hoping for the discussion to move more towards when defensive stats are reliable instead of an attack on THT. THT is not the only publication that uses defensive stats, but no one has looked into their variations and made the results available for the public to use. Sorry again if I have caused any grief or hard feelings.

My research may seem to be a little backwards in that I start with a question and then go for the answer, not knowing what I am going to find. I might have found that Coco’s low grade was justifiable, but in this case, I didn’t.

by Jeff Zimmerman on Feb 13, 2009 12:41 PM EST up reply actions  

the connotation

of F- and -15 are a tad different to be fair

by ZeppelinDZ on Feb 13, 2009 2:45 PM EST up reply actions  

good point

The F- is being used out of context here, and that is leading to certain assumptions about what it means.

by studes on Feb 13, 2009 2:47 PM EST up reply actions  

Not when the explanation of what the F- means is right there.

For a single season’s worth of data, they both mean “really crappy”.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Feb 13, 2009 2:55 PM EST up reply actions  

but the whole point of assigning a letter grade or some other symbol to a number

is to make it easier to understand. It didn’t do that obviously (regardless of whether its a reading comprehension issue) so it failed to communicate the idea. That’s not good writing, technical or otherwise, period.

by ZeppelinDZ on Feb 13, 2009 3:11 PM EST up reply actions  

exactly

It’s intended to help our readers interpret the single-year RZR stats. Pretty straightforward, really, and it’s described several places in the book.

by studes on Feb 13, 2009 3:13 PM EST up reply actions  

What do you mean "obviously"?

We’re talking a couple people here, right?

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Feb 13, 2009 3:25 PM EST up reply actions  

A discussion of when defensive stats, specifically UZR, are reliable assumes that they are reliable at any point. You also begin your article with the question of how much data is necessary to accurately determine UZR. That’s a different question than determining how able a fielder Coco Crisp is.

by ol Pete on Feb 13, 2009 2:38 PM EST up reply actions  

it was justifiable

You’re totally missing the point. The grade wasn’t subjective, and it wasn’t a long-term grade. It was an interpretation of his one-year RZR. It was completely “justifiable” because it was a straightforward interpretation of his RZR stats. It was designed to help readers interpret the stats.

If you had investigated the consistency between UZR and RZR, that would make sense. But you didn’t. You misrepresented our letter grades instead, and now some people think we have “garbage” stats.

by studes on Feb 13, 2009 1:07 PM EST reply actions  

here's the quote

This is the “garbage” quote TucsonRoyal referred to:

“Everything I’ve seen in this thread makes me thing (sic) THT is garbage.”

The guy wasn’t talking about our fielding stats. Just referring to our stats in general, because of the way the fielding stats were presented on that thread.

by studes on Feb 13, 2009 2:01 PM EST up reply actions  

sure

We’re not picky about who buys our books.

by studes on Feb 13, 2009 2:21 PM EST up reply actions  

well...

I think you know this, but I should say publicly that THT isn’t a money-making site. We do run ads and sell books, but we have costs, too, and there’s less money to be made on ads and books than you might think. Particularly these days.

More to the point, THT doesn’t have “owners” that we’re making money for. The little cash that is in the bank at the end of each year is distributed to our writers and editors for their hard work.

by studes on Feb 13, 2009 3:12 PM EST up reply actions  

I understand

Just wanted to make the comment in case anyone comes across this and misinterprets.

by studes on Feb 13, 2009 3:27 PM EST up reply actions  

Personally...

I feel like the reactions are getting a bit out of hand. Fielding data that lacks context isn’t of much use, and he was simply writing an article stating that a single year’s worth of fielding data does not suffice for quality analysis.

I understand that this makes it seem like the stats are “garbage”, as someone so eloquently put it, but let’s be serious about this for a moment. If someone is willing to toss away an entire website’s worth of data, much of which is useful, because they see one thing that they don’t like, then it wasn’t going to take much for that person to come to this decision. It’s not anyone’s “fault” that this person now thinks THT can’t help them. Chances are even better that this person already disliked THT’s stats and was just using the thread as a place to say so.

The validity of the stat stands, because as Studes said, they are translated from RZR, and that’s what Crisp’s RZR from 2008 merited. As I and others have said though, one year’s worth of data isn’t much help. Maybe it would be more effective to run the last three season’s worth of RZR grades in the next edition, in order to better convey what kind of defensive player you are looking at? That way, you don’t so easily turn off those folks who are quick to dismiss entire publications when they don’t agree with something.

by Marc Normandin on Feb 13, 2009 2:37 PM EST reply actions  

here's the quote...

I just want to keep my comments in context here. This is what TucsonRoyal said when I first brought up the subject:

Another major problem, especially for your publication, is that not everyone can look at the data and see where the problem is located (single season data). Here is a discussion where it was brought up, and based on just this value, some people consider "… THT is garbage." based on just this one value. Here is a link to the original dicussion (sic).

TR is the person who brought this up, not me, and he pointed to this post as evidence that our stats aren’t well-constructed. My reply is that TR misinterpreted our stats and, based on his second comment, I think he’s still misinterpreting them.

Plus, obviously the context of the original question doesn’t fit the article, since our grades are single-year grades and TR’s article is about multiple-year stats. I feel an obligation to point that out, and I don’t see why that’s out of hand.

by studes on Feb 13, 2009 2:45 PM EST up reply actions  

Don't mean to single you out.

I’m more concerned with giving you some of the feedback you asked for earlier in regards to improving the readability of the stats. Having a few seasons’ worth of data would be a plus, because it tells the story of their play better than a single grade.

I’m honestly not concerned about the argument in this thread that much. I just think everyone on both sides seems a little defensive, so I’m staying out of that part. Just wanted to give my two cents on the issue of improving things for later.

by Marc Normandin on Feb 13, 2009 3:16 PM EST up reply actions  

defensive?????

Thanks for the feedback. The issue is that our book is a single-year book, and all of our stats are single-year stats. I think it would be off to take one stat and make it based on more than one year.

Our Season Preview gives a projection of how good each fielder is going to be, and that’s obviously based on past performance, aging curves, etc. We give Crisp a “D” in the Preview.

by studes on Feb 13, 2009 3:30 PM EST up reply actions  

Some interesting research on the stability of fielding metrics, no?

Maybe we should focus on that? Whether you rate Coco’s fielding as -25 runs, an F, or +2 runs in 2008, we know one partial season’s worth of fielding data isn’t that precise. Both in assessing a player’s actual performance that season AND in assessing his true talent level.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Feb 13, 2009 2:38 PM EST reply actions  

There IS an interesting point in here that deserves some more discussion, too.

For something like batting average, we know EXACTLY what a player’s AVG is in any one season. When we try to judge a player’s true talent level, we know that AVG can fluctuate for statistical reasons and one season’s AVG is just a sample. Tango likes to write var(tot) = var(skill) + var(luck)

With defensive runs, however, there’s an extra piece. UZR obviously fluctuates year to year [var(luck)], but it’s also not a perfect measure of defensive performance. There is some variability between a player’s seasonal UZR and his actual on-field performance that season: var(tot) = var(skill) + var(luck) + var(inaccuracy of statistic).

For a single season, we might not care about var(luck), because we want to know what actually happened, but var(inaccuracy of statistic) is still an issue and may require regression.

I’m not sure how you measure var(inaccuracy of statistic) separately from var(luck), either. And that’s an important question.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Feb 13, 2009 3:49 PM EST up reply actions  

Data I accumulated at the time of the trade

Following up on Sky’s last comment on stability: here are Crisp’s CF ratings from a few different PBP systems, 2006-2008. To make things more straightforward, I won’t prorate these per 150 games or whatever, so we just have sort of “raw data.”

Dewan’s
2006 minus 7 (minus 5.6 runs)
2007 + 26 (+ 21 runs)
2008 minus 2 (~ minus 2 runs)

PMR
2006 + 14 (~ +11 runs)
2007 + 11 (~ +9 runs)
2008 minus 1 (~minus one run)

sUZR
2006 minus 9 runs
2007 +13 runs
2008 minus 10 runs (note that this is taken from a comment from MGL in a Book Blog thread where he noted Crisp was in the double digit negatives sUZR in 2008…. it could be worse, I just went conservative)

bUZR
2006 +0.0 runs
2007 +22.0 runs
2008 minus 8.9 runs

Make of this what you will. I realize that some systems include arm ratings and others only measure range.

While the “amounts” vary from system to system, they all agree (except for the outlier of PMR’s 2006) on a general contour of bad year, great year, bad year….

Just from these three years of data (without looking at the Cleveland data, of course), I’d say that while Crisp is probably around average, his reputation as an awesome defender is greatly exaggerated.

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 3:13 PM EST reply actions  

maybe more accurately

low average, great, below average

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 3:16 PM EST up reply actions  

And I don’t see how someone can look at multiple years (and therefore a good sample size) of multiple fielding metrics and come to the conclusion that Crisp projects to being a “D” defender in 2009.

The immoderate moderator

by Scott McKinney on Feb 13, 2009 4:15 PM EST up reply actions  

ask studes

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 4:30 PM EST up reply actions  

yeah, it's interesting

First of all, I shouldn’t be commenting at all, because David Gassko ran the numbers. But the RZR system doesn’t rank Crisp’s 2007 as highly as the other systems do. In fact, it rates him below average. And without that year, Crisp definitely rates as a below-average outfielder (remember, there’s no grade inflation here. A “D” means below average, not almost failing).

by studes on Feb 13, 2009 4:55 PM EST up reply actions  

Not defending it either way...

But a D as a center fielder is different from a D at another position, too. More context. And why looking at just RZR or just ANYTHING can miss sometimes (not that it necessarily misses on Coco.)

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Feb 13, 2009 5:48 PM EST up reply actions  

Sky

What do you mean by “More context”? Just a confusing phrase there…

Bringing you more-or-less replacement level analysis and commentary since sometime in 2008.

by Matt Klaassen on Feb 13, 2009 11:29 PM EST up reply actions  

excellent post, Tuscon.

I don’t think anyone really understands any stat until they know its standard deviation.

It’s hard to even get a grip on what it means to have a single-year UZR until you know that number. It’s also a first step to understanding Sky’s “var(inaccuracy of statistic)”.

So this is a valuable addition to THT’s analysis, not a criticism of it.

Keep up the good work.

by Sean O Se on Feb 24, 2009 3:34 PM EST reply actions  

Kudos to studes

I wish BP’s writers were this concerned with what fans think of their stuff :D

by Omar Little on Feb 27, 2009 4:22 AM EST reply actions  
