Daily Box Score 9/23: Self-Similarity
The notion of consistency, that bugaboo of thinking baseball fans everywhere, has eluded our frontal attacks for decades.
What conclusions can we draw from the fact that a player's performance has been stable from year to year? Is it a mere trivial fluke or parlor trick? Or does it betray some deeper ability? Is it possible consistency bodes poorly for a player's projection?
Table of Contents
Fractal Geometry
The Fractional Dimension of Nick Blackburn
Conclusions?
Discussion Question of the Day
When I was in high school, I attended a lecture given by the mathematician Benoît Mandelbrot. I asked him a question about his seminal paper (which I definitely did not fully understand). His answer, which was short and to the point, betrayed just how poorly he thought I had understood what he had written.
Consider this your fair warning that, even in the intervening years, I have not been fully able to grok the concept of fractals. However, I sincerely hope that no one is reading this column in the hopes that I have all the answers. At best, I can offer questions. Caveat emptor.
The concept of self-similarity, as it is used to describe certain mathematical functions, is relatively straightforward. Per Wikipedia:
[A] self-similar object is exactly or approximately similar to a part of itself (i.e. the whole has the same shape as one or more of the parts). Many objects in the real world, such as coastlines, are statistically self-similar: parts of them show the same statistical properties at many scales.
In fact, it was Mandelbrot who, in that paper, showed that the coastline of Britain can be described as statistically self-similar. That is, no matter the "zoom" level, the coastline displays many of the same statistical properties. In some cases (as is common with fractals), the actual appearance itself may be similar at many different resolutions.
As an aside, I'll tell you what I asked Mandelbrot. His paper suggests that curves that display statistical self-similarity have fractional dimensions between 1 and 2. I asked if it would be possible for a real-world curve, like the coastline of Britain, to be a fractal. (I surmise that people have been asking him this question, which is absurd really, for decades. He must be sick of explaining it.) In any event, the answer is no.
Real-world curves aren't fractals because they display self-similarity only over certain intervals. It is not the case that any coastline fully replicates its shape at any part of itself. But the partial self-similarity, nevertheless, is what led to much of Mandelbrot's breakthrough work with fractals.
The Fractional Dimension of Nick Blackburn
One of my favorite pastimes is looking for players who have season statistics that are uncanny in some way. For example, which players come closest to a .300/.400/.500 line? (Bobby Abreu worked better for this before this year, but he's still close at .299/.404/.492.)
As I talked to a friend the other day, he pointed out an interesting feature of Nick Blackburn's current season compared to his last. Let me demonstrate:
2008: 193.1 IP, 4.05 ERA, 4.40 FIP, 96 K, 39 BB, 23 HR, 224 H
2009: 191.2 IP, 4.18 ERA, 4.44 FIP, 89 K, 40 BB, 24 HR, 230 H
Having been thoroughly rebuked by Mandelbrot, I am definitely not suggesting that Blackburn is a fractal. That would be ridiculous.
But isn't that a tad uncanny? Especially when you consider how few bats he misses, which I would have thought makes him more susceptible to variance. Of course, the fact that he had similar statistics ex post is not evidence that he was not subject to more variance ex ante. Still, weird, right?
So I went looking for other players who were their own doppelgangers. And I found a few. Let me share.
Much has been made of the fact that Adam Dunn, who currently has 38 HR, is two shy of collecting exactly 40 HR in five consecutive seasons. But that isn't the only statistic that has displayed stability:
2008: 651 PA, 40 HR, 100 RBI, 122 BB, 164 K, .236/.386/.513
2009: 623 PA, 38 HR, 103 RBI, 108 BB, 165 K, .279/.408/.556
Other than the difference in the number of singles (77 in '09 versus 59 in '08), the two lines are dead ringers for one another.
Here's another: Shane Victorino.
2008: 627 PA, 14 HR, 58 RBI, 45 BB, 69 K, .293/.352/.447
2009: 646 PA, 10 HR, 58 RBI, 57 BB, 67 K, .297/.364/.451
Here again there is really only one substantive difference. This time, it's the number of walks.
Think I'm just cherry-picking hitters, and healthy ones at that? Nah, here's a guy who missed the same part of the season two years in a row, and his numbers STILL came out the same: John Lackey.
2008: 163.1 IP, 3.75 ERA, 4.53 FIP, 130 K, 40 BB, 26 HR, 161 H
2009: 169.1 IP, 3.56 ERA, 3.54 FIP, 135 K, 46 BB, 14 HR, 163 H
He allowed 32 doubles this year, versus 26 in 2008. If you figure that many of the extra home runs from last year have been doubles this year, he really starts to look like the same pitcher. In fact, the reason his FIP was so much higher last year is entirely because of his HR/FB rate (15.3% in '08 versus 8.0% in '09). His fly ball tendencies (34.7% in '08, 34.5% in '09) have been almost identical.
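The home-run claim can be checked with a little arithmetic. The sketch below assumes the commonly cited FIP formula, (13*HR + 3*(BB + HBP) - 2*K) / IP plus a league constant. Hit batsmen are not in the lines above and the constant changes from year to year, so this isolates only the home-run term; the helper names are illustrative, not from any stats library.

```python
def innings(ip: str) -> float:
    """Convert box-score notation such as '163.1' (163 innings plus 1 out) to innings."""
    whole, _, outs = ip.partition(".")
    return int(whole) + int(outs or 0) / 3


def hr_term(hr: int, ip: str) -> float:
    """Home-run component of FIP under the assumed formula: 13 * HR / IP."""
    return 13 * hr / innings(ip)


# Lackey's lines as quoted above
hr_2008 = hr_term(26, "163.1")
hr_2009 = hr_term(14, "169.1")

print(f"HR term, 2008: {hr_2008:.2f}")
print(f"HR term, 2009: {hr_2009:.2f}")
print(f"Gap from home runs alone: {hr_2008 - hr_2009:.2f}")
print(f"Gap in listed FIP:        {4.53 - 3.54:.2f}")
```

Under these assumptions, the home-run term by itself accounts for roughly one run of FIP, which is about the size of the whole gap between the two listed FIP figures.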
Still not close enough for you? How about the guy whose entry into the game is often likened to the sounding of taps, Mariano Rivera?
2008: 70.2 IP, 1.40 ERA, 2.03 FIP, 77 K, 6 BB, 4 HR, 41 H
2009: 61.1 IP, 1.91 ERA, 2.97 FIP, 67 K, 12 BB, 7 HR, 44 H
The difference in ERA and FIP is entirely explained by the difference in HR allowed. In fact, his fly ball percentage this year has been even lower than last.
I find the consistency of all of these players to be strange and remarkable.
I must confess, I am unsure what conclusions we can draw from these fractionally dimensioned, self-similar ballplayers. What they have in common is that, as you refine the statistical measures and look beyond simple batting average and ERA, they look even more self-similar.
I suspect that the conclusion we can draw depends on a player's age. If the two consecutive years of similar performance come in the middle of a player's career, can we surmise that we have reached a performance maximum? If it is early in a player's career, can we count it as a bit of evidence suggesting that the previously projected development curve will be stunted? For an older player like Rivera, does it tell us something about his agelessness?
Or ought we simply assume it all to be statistical noise--ephemera of the fact that there are many ballplayers, and many of them will play at or near their true talent level two years in a row?
Discussion Question of the Day
This should be fun. See if you can find examples of players who are self-similar. Share their stats. I bet you can find better ones than I did. My methods, admittedly, were not scientific.
Also, I'm curious to hear what you think we can learn from this type of anomaly.
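For anyone who wants to make the search a little less ad hoc, here is a minimal sketch of one possible screen. It is not the method used in this post: the metric (largest relative gap across shared counting stats) and the choice of stats are assumptions made purely for illustration.

```python
def max_relative_gap(season_a: dict, season_b: dict) -> float:
    """Largest relative difference across the stats that both seasons share."""
    gaps = []
    for stat in season_a.keys() & season_b.keys():
        a, b = season_a[stat], season_b[stat]
        scale = max(abs(a), abs(b))
        # Two zeros are identical; avoid dividing by zero.
        gaps.append(0.0 if scale == 0 else abs(a - b) / scale)
    return max(gaps)


# Counting stats taken from the pitcher lines quoted in the post
blackburn_2008 = {"K": 96, "BB": 39, "HR": 23, "H": 224}
blackburn_2009 = {"K": 89, "BB": 40, "HR": 24, "H": 230}
lackey_2008 = {"K": 130, "BB": 40, "HR": 26, "H": 161}
lackey_2009 = {"K": 135, "BB": 46, "HR": 14, "H": 163}

print(f"Blackburn: {max_relative_gap(blackburn_2008, blackburn_2009):.1%}")
print(f"Lackey:    {max_relative_gap(lackey_2008, lackey_2009):.1%}")
```

By this measure Blackburn's worst mismatch is his strikeouts, a gap of about 7 percent, while Lackey's home runs differ by more than 40 percent. A real search would also need to account for playing time and small samples.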
Comments
ERA vs. FIP
The difference in ERA (3.56 vs. 4.09) is much more pronounced than the difference in FIP (3.82 vs. 4.06)
by Tommy Bennett on Sep 23, 2009 4:04 PM EDT
ADAM DUNN is the perfect model for consistency
YEAR  AGE  HR   BB    K    OBP    SLG  OPS+
2004   24  46  108  195  0.388  0.569   146
2005   25  40  114  168  0.387  0.540   141
2006   26  40  112  194  0.365  0.490   114
2007   27  40  101  165  0.386  0.554   136
2008   28  40  122  164  0.386  0.513   129
2009   29  38  108  165  0.408  0.556   150
"What we do in life, echoes in eternity!"
Huh -- Dunn's having a career year offensively, huh?
Who wants to write the “Nats make great signing” post?
If you ignored batting average, perhaps this is true
I would have offered up Pat Burrell before this season.
by Tommy Bennett on Sep 23, 2009 10:44 PM EDT
Yeah, the batting average would probably vary more than a normal batter
because it's a smaller sample.
by lookatthosetwins on Sep 24, 2009 12:20 AM EDT
Pujols
I’ve always thought that one of the greatest aspects of Pujols’ stats is how consistent and predictable he is. Other than IBB, he’s basically the same hitter season-to-season, month-to-month.
Consistently ridiculous.
by Justin Bopp on Sep 24, 2009 11:26 AM EDT
Tommy - how can you write an article on consistency without putting up one of the posters from despair.com?

Jeff Zimmerman - Protecting the world from RBI's and Wins from my mom's guest house.
by Jeff Zimmerman on Sep 24, 2009 11:56 AM EDT
A fair criticism
Reminds me of http://en.wikipedia.org/wiki/Accuracy_and_precision
by Tommy Bennett on Sep 24, 2009 12:24 PM EDT
I've always loved that distinction.
by Justin Bopp on Sep 24, 2009 12:51 PM EDT
On that note, let's add the entire Royals team for the entire century.
CONSISTENTLY BAD.
:(
by Justin Bopp on Sep 24, 2009 12:52 PM EDT
