One great tragedy, or triumph depending on who you ask, of American culture over the past three decades has been the growing tendency to launch sequels to any modestly successful release, regardless of artistic merit.
For every Empire Strikes Back, The Dark Knight, and Terminator 2, there are numerous insulting resets like Look Who's Talking Too, Ghostbusters 2, Home Alone 3 (the second one really wasn't that bad, was it?), Bush 43, and the Star Wars prequels. Before you go to sleep tonight, thank the film gods that Bryan Singer didn't attempt Superman Returns Again.
All of this is a long-winded way of saying that sequels usually suck. More to the point, I hope to avoid making DiamondView 2.0 a project that Steve Guttenberg would avoid. Instead of sucking out the soul (Macaulay Culkin) and replacing it with easily dismissed commercial tripe (who was that boy in the third one, anyway?), I will do what I can to make an improvement with asked-for adjustments.
Let's see what you think.
The Problem
The main concern about the previous iteration of the DiamondView evaluation tool seemed to be the impression that area meant value, whereas the point of the visual is to imply a describable shape for positional expectation. So, I didn't "fix" that. And though area is the appropriate way to measure value in most graphs of this sort, I believe it misses the point. Attempting to measure the area of these would be not unlike trying to retcon-in an explanation of Boba Fett as the genetic Adam for every Imperial Trooper ever seen, not that any director still worth his salt would ever attempt that.
Another concern is that the data set for fielding was flawed for two reasons: it wasn't adjusted for position and the use of UZR prevented the effective measurement of catchers. I did fix that.
The last main concern was the question of the value of looking at a glimpse in time, at a single year, when some players were down and some injured, all of which meant that no lasting meaning could be gleaned by its audience, unlike timeless lessons of the importance of family, love, and the ability to throw bricks onto unsuspecting burglars during the holidays.
The Solution
These last two concerns were addressed by switching data sets. The original used OBP, ISO, eqBRR, and UZR/150. OBP and ISO are easy enough to find, though one might question where the cutoff should be when attempting to download spreadsheets through our various favorite stat sources. eqBRR, simply put, is/was not easy to deal with because it exists in a separate sheet from OBP and ISO, and UZR was near impossible to use because the data sets I found had a completely different set of players than the ones that included the previous three stats. So one would accurately assume that acquiring the data, combining the data, and making sense of the data is near impossible to do on a league-wide scale while maintaining value to you and me at the same time (you giving me attention and me giving you shiny things).
So, per the wisdom of the crowd, I switched entirely to CHONE. Obviously for our immediate purpose I will be using the 2010 CHONE Projection, courtesy of BaseballProjection.com (thanks Sean). This solves both concerns while making the product much easier and quicker to produce, something we can all benefit from.
The Data, The Method, And the Fun.
The original data set (.xls found HERE) includes 1368 players, many of whom may never get meaningful playing time and who unnecessarily blur the perceptible differences among our actual subjects (projected starters in the Majors). In the first run, the top 10% in the ON BASE metric included 255 players, more than the number of positional starters in the entire league! Needless to say, I needed to pare down the data to get more separation and thus, visual meaning. The cost, unfortunately, is the inability to know that the Royals' fifth shortstop is in the bottom tier of players in every category. Sorry.
I truncated the data by removing every non-catcher that CHONE projected to get fewer than 350 ABs (ABs, the horror!). For catchers, I nixed everyone under 300 ABs. The result is still 794 players, just enough to cover everyone's backup, aging veterans with limited PT, quite a few potential call-ups, and the majority of free agents worth one last sequel.
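For the spreadsheet-inclined, the cut described above could be sketched roughly like this in Python. The field names (`pos`, `ab`) and the sample rows are hypothetical, not the actual layout of the CHONE sheet, and the sketch assumes a player sitting exactly on a cutoff is kept.

```python
# Rough sketch of the playing-time cut. Field names and sample rows are
# assumptions for illustration, not the real CHONE column headers or data.
CATCHER_MIN_AB = 300   # catchers projected under this are dropped
DEFAULT_MIN_AB = 350   # everyone else projected under this is dropped

def keep_player(player):
    """Return True if the player clears the AB cutoff for his position."""
    cutoff = CATCHER_MIN_AB if player["pos"] == "C" else DEFAULT_MIN_AB
    return player["ab"] >= cutoff

def pare_down(players):
    """Keep only the players projected for enough at-bats."""
    return [p for p in players if keep_player(p)]

# Made-up rows, purely for illustration.
sample = [
    {"name": "Starting SS", "pos": "SS", "ab": 520},
    {"name": "Fifth SS", "pos": "SS", "ab": 120},
    {"name": "Part-time C", "pos": "C", "ab": 310},
    {"name": "Backup C", "pos": "C", "ab": 150},
]
kept = pare_down(sample)
print([p["name"] for p in kept])
```

Note that the part-time catcher survives at 310 ABs while a shortstop with the same projection would not, which is the whole point of the lower catcher threshold.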
The method is to use the following stats:
ON BASE: OBP projected by CHONE.
POWER: ISO (SLG-AVG) projected by CHONE.
BASE-RUNNING: SPEED projected by CHONE, described as: "similar to the Bill James method; it excludes the range factor part and is based on stolen bases, triples, runs, and double play avoidance."
FIELDING: DEF projected by CHONE. This stat is based on Sean Smith's (creator of CHONE) TotalZone defensive projection, which includes park factors. With Sean's help and Jeff Zimmerman's advice, I applied a positional adjustment to satisfy both our intellectual curiosity and our highly informed audience. Better yet, it includes catcher ratings, filling the biggest hole in DiamondView evaluation, if not my inner need to gloss over the Kendall signing.
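To give a feel for how a raw stat might turn into the 0-to-100 number on each axis, here is a minimal Python sketch. ISO really is just SLG minus AVG, as noted above. The scoring function, however, is an assumption: it treats each score as the share of the pool a player meets or beats, which may not match how the DiamondView numbers are actually scaled. The OBP values are made up.

```python
def iso(slg, avg):
    """Isolated power: slugging percentage minus batting average."""
    return slg - avg

def percentile_scores(values):
    """Score each value 0-100 by the share of the pool it meets or beats.

    Assumption: a simple rank-based percentile, used here only for
    illustration; the article does not spell out its scaling method.
    """
    n = len(values)
    return [
        round(100 * sum(1 for other in values if other <= v) / n)
        for v in values
    ]

# Made-up OBP projections for a four-player pool.
pool_obp = [0.300, 0.320, 0.340, 0.360]
scores = percentile_scores(pool_obp)
print(scores)  # [25, 50, 75, 100]
```

Under this kind of scoring, adding a pile of low-OBP bench players to the pool pushes every starter's score upward, which is the effect the larger player pool has on the charts.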
Notes
Warning. The result is something that will look significantly different if compared directly to the 2009 DiamondView series started this year (former BtB/current BP contributor Tommy Bennett should get credit for his quality authorship of the first two). The number of additional players has given starters a 25% bump in projected value in OBP alone, which is ok if we can reset our expectations. The problem with this, however, is that I will only ever be previewing 8 or 9 players per team and the occasional prospect. At most, you guys will see fewer than a third of all of the players included.
The result is a somewhat diminished visual effect where most starters are 75 or above in either ON BASE or POWER or both. A greater range will be seen in fielding and base running, but still less difference among starters than I had hoped. One solution to this would be for me to manually sort the list and single out the projected starters for every team (254 players) and leave in prominent free agents. One might note that a side benefit of the additional players is that the positional averages included as a transparent overlay will be more reflective of replacement level, if not exact.
Until some overwhelming outcry of disappointment occurs, I'll probably leave the list at 794 and we'll re-learn to understand what each percentile really means. To help with that, I made some examples of a few starters around the league.
Please note that CHONE does not differentiate between LF and RF, and the result is a combined OF category. CF remains separate.
Here's a preview of the results:
Compare Carlos Pena (and Adam Dunn, below) to the positional average underneath and notice that they're good at what they do (hitting the ball) while playing worse-than-replacement-level defense at first.
CHONE projects ol' Donkey to be a better fielder than Pena? That makes me smile.
Perhaps using Michael Bourn was in poor taste in the same article in which I argue that sequels suck. Regardless, this kid can run.
Franklin Gutierrez's DiamondView fits our expectations. 62 | 66 = .327 OBP | .150 ISO, for a little meaning.
Bobby Abreu actually got on base while you were noticing that the projections of Dunn and Pena are nearly identical. For reference, 95 = .368 OBP.
J.J. Hardy is one of my favorite players because he has an unusual balance of skills.
Barring a torch-bearing crowd of criticism, I will be launching team reviews in the next week and will circle around the league throughout the spring. Before that, I want to hear what you guys think of this sequel. While you're at it, try to name one good sequel and one bad sequel with your comments.