Top 5 WAR Active Leaders
Graph of the Day
1. All data through 2009, via Baseball-Reference.com
2. Each line is an auto-generated 4th-order polynomial (^4) line of best fit, except for Pujols (2nd-order, ^2), because the ^4 fit for him created a weird upward-sloping line.
3. Sorry about the color change for Chipper.
Comments
Yum!
I think the data is telling us that Albert is going to get better as he ages.
It’s irrefutable because it’s SCIENCE.
Our feeble "math" cannot adequately explain or predict The Pujols.
But seriously, these are cool.
Pujols has such a ridiculously small number of seasons compared to these other players that it is still really hard to compare him. A-Rod, Jones, and Jeter all have a nice “second bump” on their curves, while Griffey does not. I guess the question is whether Pujols will be able to sustain high levels of production and delay a sharp decline, or if he will just kinda get worse year after year (like Griffey). Even with the “feeble” quadratic curve (Albert Pujols laughs at polynomials of such low order) his career would be really impressive.
Albert Pujols does not have "down" years. He has "~6 WAR" years.
by mattybobo on Jun 1, 2010 11:49 AM EDT
This made me laugh, mattybobo.
Something I learned when pulling this data:
Most players, or at least these sure-fire HoFers, have a season or two of -0.4 to -0.6 WAR at the beginning and ending of their ML careers.
But not Pujols. His career begins with BLAMMO, I GOT WAR.
See Data Differently. Beyond The Boxscore. | Follow me @justinbopp
Two Out Rally, the new BASEBALL MMORPG! | Facebook | Twitter
It really is impressive
And also rare that he got to start off with a full season’s worth of playing time and has yet to have a really shortened season (I think 2006 was 147 games or something like that, and that is his lowest number).
As a Cardinals fan I just try not to think about “Albert Pujols” and “decline” at the same time because my whole world will start crumbling around me.
No, it's more recent
Just the last few years. It corresponds to him winning that batting title and taking a run at hitting .400.
By the by
To be fair, the fit curves being chosen are affecting the way we see it a bit. There aren’t enough data points for a curve of much higher exponent to be sensible (it just gets real jagged, essentially fitting the points perfectly, which isn’t what we want – we WANT smoothing)… But at the same time, the lower exponent curve dramatically limits how much the curve can change for us to reflect what’s happening.
In fact, a 4th-power polynomial can change direction at most 3 times. (The slope is the derivative, the derivative is 3rd order, and a 3rd-order polynomial has at most 3 points where it equals zero.)
I’m not saying it should be done differently – the closer you get to having the same power of function as number of data points, the uglier and less meaningful the smoothing curve is – just noting that fact, which gives a slightly false sense of meaning in the curves.
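To make the direction-change limit concrete, here is a minimal sketch (not anything used for the graphs). It takes the coefficients of a polynomial, differentiates it, and counts how often the slope changes sign over a range. The function names and the two example quartics are made up for illustration.

```python
def derivative(coeffs):
    """Differentiate a polynomial given highest power first,
    e.g. [a, b, c, d, e] for a*x^4 + b*x^3 + c*x^2 + d*x + e."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def count_turning_points(coeffs, lo, hi, steps=10_000):
    """Count sign changes of the slope on an evenly spaced grid over [lo, hi]."""
    slope = derivative(coeffs)
    count = 0
    prev_sign = 0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        s = evaluate(slope, x)
        sign = (s > 0) - (s < 0)
        if sign == 0:
            continue  # slope is exactly zero at this grid point; skip it
        if prev_sign != 0 and sign != prev_sign:
            count += 1
        prev_sign = sign
    return count


# Illustrative quartics, not fitted to any player's data.
wavy = [1, 0, -5, 0, 4]   # x^4 - 5x^2 + 4
bowl = [1, 0, 0, 0, 0]    # x^4
wavy_turns = count_turning_points(wavy, -3, 3)
bowl_turns = count_turning_points(bowl, -3, 3)
```

The first quartic uses all three available direction changes; the second only changes direction once, so three is a ceiling, not a guarantee.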
Good points, Patrick.
And something I gave some amount of thought. I settled on something that would provide as much meaning as possible while still looking smooth.
In particular, the Pujols one was tough to present because he has half the data that the others do (15, 16, 16, 22 seasons).
In the end, I chose the current polynomials for smoothness but provided actual data points for context.
I like it :)
It’s frustrating that the smoothing lines are so vague, but I don’t see how they COULD be better.
I like the graphs. :)
And it tells you a bit about how amazing Pujols has been. He’s already solidly in the HOF range by WAR and he’s got a long way to go yet.
Something worth considering
If the issue is that you have too few data points to smooth, you could break down to a month-by-month and do a weighted average for visualization purposes (kind of like the curve used for a kernel density function, with the kernel doing the smoothing for you). Hell, you can use game-by-game data for something like that, and let the data tell you what the optimal parameters are for the kernel smoothing. That allows you to get something like optimal smoothing endogenously.
You could also smooth by doing a weighted average with neighboring seasons.
Maybe even do it with WAR/700 instead of raw WAR to make it more of an estimator of talent level.
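A rough sketch of the neighboring-seasons idea, assuming a Gaussian kernel over season index and a leave-one-out check to pick the bandwidth from the data. The WAR values, the candidate bandwidths, and all function names here are hypothetical; this works at the season level only, not the month-by-month or game-by-game version suggested above.

```python
import math


def gaussian_weights(center, n, bandwidth, skip_center=False):
    """Gaussian kernel weight for every season index relative to `center`."""
    weights = []
    for j in range(n):
        if skip_center and j == center:
            weights.append(0.0)  # leave the season itself out
        else:
            weights.append(math.exp(-0.5 * ((center - j) / bandwidth) ** 2))
    return weights


def kernel_smooth(values, bandwidth, skip_center=False):
    """Replace each season with a kernel-weighted average of all seasons."""
    n = len(values)
    smoothed = []
    for i in range(n):
        w = gaussian_weights(i, n, bandwidth, skip_center)
        smoothed.append(sum(wj * v for wj, v in zip(w, values)) / sum(w))
    return smoothed


def leave_one_out_error(values, bandwidth):
    """Mean squared error when each season is predicted from the others."""
    predicted = kernel_smooth(values, bandwidth, skip_center=True)
    return sum((v - p) ** 2 for v, p in zip(values, predicted)) / len(values)


# Made-up season WAR values, for illustration only (not a real player).
season_war = [0.5, 3.1, 5.8, 4.2, 7.9, 6.5, 8.8, 5.1, 6.0, 3.4, 2.2, 0.9]

# Hypothetical candidate bandwidths, measured in seasons.
candidates = [0.5, 1.0, 1.5, 2.0, 3.0]
best_bandwidth = min(candidates, key=lambda h: leave_one_out_error(season_war, h))
smoothed_war = kernel_smooth(season_war, best_bandwidth)
```

A small bandwidth hugs the raw seasons; a large one flattens the career into something close to its average. The leave-one-out error is one simple way to let the data choose between them.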
Another thing -
my original idea for the creation of these was to establish something like Sky established with the HOF Zone on his WAR Graphs. I’d like to establish a “HOF Arc” or something. I think it could look absolutely fantastic.
Sky, others – helpage with the data?
by Justin Bopp on Jun 1, 2010 12:33 PM EDT
That's an interesting thought...
Probably MOST Hall of Famers, at least position players, will mirror an exaggerated version of the standard aging curve. Exaggerated in that it will be so much higher, in terms of WAR.
Pitchers… HOF curves for pitchers would likely be MUCH more interesting, considering MGL’s pitcher aging data basically says “The average pitcher stays flat then gets worse”.
Consider this my current project, then.
I'm also looking at something
about using the data to look at successful teams.
I did a thing with the Big Red Machine some time ago which showed how their different WAR curves lined up and peaked right around their historic run of success.
Let me see if I can find it. You’ll definitely see what a ‘jittery’ line looks like. Maybe this method could help smooth it out a bit.
I'VE GOT THE JITTERS...
A little more seriously, I think in this case this is probably appropriate.
Maybe an overall smoothed curve to illustrate, but in this case we’re interested in the season-by-season data just as much.
I’d be curious to SEE the smoothed version, but I think this works well in this case… And some players have very, VERY uneven careers and a smoothed curve might not be the best thing if we’re interested in how they did as a group in one specific season.
And holy shit Joe Morgan was good at baseball. My own personal benchmark rates 10 WAR as “OMGWTF” and 12 WAR as just “Holy shit :O”.
I’m surprised Jeter is Top 5, but the fact he’s been pretty healthy and consistent over his career says a lot.
What I find more interesting is what you see when you go beyond the Top 5 active leaders:
Pudge at #6, Thome at #7, Jim Edmonds #8, Manny #9, and Rolen #10, Andruw Jones #11…
I think it’s too bad Ichiro started playing in MLB so late, when you look at stats like this. Going into 2010, he was averaging 5.63 WAR per season, higher than Jeter and Chipper (and Griffey, but he should’ve retired a while ago).
Both A-ROD's and Jeter's WAR benefit
from playing at a valuable position.
Chipper's too
I’m guessing. Speaking of which, Chipper’s WAR (A-Rod’s too) probably took a hit when he agreed to move to the OF for his age 31 and 32 seasons to make room for Vinny Castilla.
If Albert played in the AFL, they’d have to rename it the AZ/NM Fall League, based on where his homers landed.
A-Rod...
Never moved to the outfield for Vinny Castilla… … :)
But yes, as someone pointed out a while ago (Craig Calcaterra, maybe?) the only reason “Second best shortstop of all time” is still a debate (Jeter and Ripken, and Vaughan if you want) is because A-Rod left the position.
If he hadn’t, it’d be Honus, then A-Rod, then eeeeeverybody else.
So you're saying Tony Pena Jr. won't be considered? :(
Nah, maybe he could try for...
2nd WORST shortstop ever. I’m pretty sure Yuni’s got #1 locked up.
I like the color schemes on these graphs a lot.
Did you use a rolling average for the curve? I’m not sure I followed the above discussion for that.
Just had a few quick thoughts.
Somehow this always happens right after I post.
There are a few places where the curve doesn’t really stand out given the bars — any chance of changing its color (or perhaps giving it a thicker border)?
Also, how hard would it be to represent the projection parts of each curve with a dashed line instead of a solid line?
Curves...
No, jwiscarson: he fit a polynomial line of best fit to the data.
Not sure if you’re familiar with the concept, or exactly what method he used, but I’m guessing it’s just what’s built into Excel for a polynomial fit, which I believe is least squares:
http://en.wikipedia.org/wiki/Least_squares
So he fit it to an equation in which the highest order term was x^4 (except Pujols, who had a 2nd order equation).
I.e., an equation of the form a*x^4 + b*x^3 + c*x^2 + d*x + e, with the coefficients chosen to minimize the error relative to each data point. (You could think of it as a method designed to minimize the total squared vertical distance of the graph line from the data points. And, hey, that’s what it is, so you’d be thinking right.)
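For anyone who wants to see the fit described above spelled out, here is a minimal sketch of a least-squares polynomial fit using the normal equations. It is an assumption that this matches what the graphs used; the function name and the WAR values are made up for illustration.

```python
def polyfit_least_squares(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients lowest power first: [e, d, c, b, a] for
    a*x^4 + b*x^3 + c*x^2 + d*x + e when degree is 4.
    """
    m = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][k] = xs[i] ** k.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(m)] for r in range(m)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(m)]
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]

    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(aug[r][k] * coeffs[k] for k in range(r + 1, m))
        coeffs[r] = (aug[r][m] - known) / aug[r][r]
    return coeffs


# Made-up season WAR values indexed by season number, for illustration only.
seasons = list(range(12))
war = [0.5, 3.1, 5.8, 4.2, 7.9, 6.5, 8.8, 5.1, 6.0, 3.4, 2.2, 0.9]
quartic = polyfit_least_squares(seasons, war, 4)    # five coefficients
quadratic = polyfit_least_squares(seasons, war, 2)  # three coefficients
```

Solving the normal equations directly is fine for a dozen or so seasons and a low order like this. For higher orders it gets numerically touchy, which is one more reason to keep the exponent small.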