Getting to WHY

Bryce Harper's age may be quite meaningful when projecting his future value... But why?

I'd like to preface this piece by saying that I really enjoy Rany Jazayerli's work. He is one of a handful of prolific baseball writers, and I greatly respect his thoughts and candor. This piece is NOT meant to read as a personal attack on Mr. Jazayerli, but rather to use my skepticism of his most recent findings as a platform to discuss the goals of research. You can click to read PART I and PART II, but a subscription is required.

Why? Arguably the most important question a researcher grapples with before concluding a study is why the results occurred as they did. Typically, the answer lies on a spectrum, bookended by statistical and anecdotal support. Sometimes, though, results can be deceiving, and researchers have to seriously consider discounting findings because they can't explain them.

Especially in baseball, the attributes that we test have properties. Strikeouts are never balls in play; ground balls are never home runs. Take, for instance, left-handed and right-handed pitchers. Scouts will prefer a southpaw over a righty with comparable stuff because lefties are rare. Surprisingly, just 10% of the world's population is left-handed. An attribute's properties can guide researchers towards explaining their findings.

On Thursday, Rany Jazayerli published a piece in which he set out to answer an interesting question. Jazayerli states, "what I wanted to find out is whether players who were younger than average on draft day tended to return more value than expected." [1] For this study he placed 17- and 18-year-old high school hitters drafted in the top 100 picks into five categories based on their exact age. [2] There will be more on this in a moment, but the findings are quite impressive and result in the following exclamation: "the conclusion is clear: at least as recently as 2003, the baseball industry as a whole massively underrated the importance of age in drafting high school hitters and massively undervalued high school hitters."

From a scouting perspective, the properties of age are well understood, at least anecdotally. Age-Relative-to-League (ARL) is often touted as an indicator of a player's prospect status. A quick illustration: if a league's average age is 23 and a player is 21, his performance is more meaningful than that of his 25-year-old counterpart. [3] As I wrote last week, what Mike Stanton accomplished in the National League at 21 would lose its luster if he weren't several standard deviations below the league average age. But this is not the context that was studied here.

I've found it difficult to discern the properties of age that can be derived from each category in Jazayerli's study. In the context of his piece, there are five categories of 17- and 18-year-olds, distributed by their birthdays, with each bucket spanning approximately eighty to one hundred days. In order to reconcile the findings, we should be able to find a property that distinguishes these categories from one another. [4] In other words, what is it about an 80-day time period that causes the difference in value?

After some quick brainstorming, here are some admittedly feeble attempts at an anecdotal explanation of his findings. [5] Please, don't be shy! Help out with some suggestions in the comments.

  1. Prior to being drafted, high school hitters with more experience develop bad habits. Thus, the younger a draftee is, the fewer bad habits he has developed, and the more moldable he is by a professional development system.
  2. Younger hitters are able to "play down" against weaker competition in high school and on the amateur circuit. Despite this being the reverse of ARL, dominating younger players gives them the confidence they need to succeed.
  3. Scouts don't take into account that younger players' tools are less developed than older players'. Though behind at his current age, a young player will see his tools rapidly catch up to an older player's once he enters professional baseball.

Despite my best efforts, I couldn't come up with a logical explanation that wasn't refutable. If the first point were true, then we would see this effect take place with 16-year-old International Free Agents (IFA). By far, IFAs have the largest share of their experience developed professionally in the minor leagues. [6] Additionally, being younger doesn't necessarily mean having less experience. A California or Florida draftee in the "very young" bucket may have far more playing experience than a "very old" draftee from the Northeast. Geography, due to climate, plays a large part in the level of competition in the area and the amount of time a player can play. A draftee's decision to partake in multiple sports can also significantly cut into his pre-draft experience. As for the "playing down effect," a player's confidence seems to change quite often. Not to mention, if a player is being drafted in the top 100 picks, he was, in all likelihood, extremely successful.

A variant of the third option is more than just plausible, but it gets quite tangled when considered along with the subsequent paragraph. However, the proposition is still closely tied to the unfair assumption that a younger player, one who is less than 270 days younger, has less physical development and baseball experience, and that a scout is unaware of it. Or, in Jazayerli's defense, the scout is aware of it but isn't accounting for it enough, hence the inefficiency. It also relies on the presumption that development occurs linearly, which it does not.

Again, I'm clearly having trouble explaining Jazayerli's findings. But even if one of the above effects were the driving factor behind the findings, or there is another property of age at play that I've missed, another question looms. When does the model break? In part one, Jazayerli shows there was a huge difference in value between the five youngest and oldest hitters from 1965-1996. The difference in average age in that study is around 270 days, or just about 9 months. In the second piece, he adds, "a six-month [age] difference is meaningful." Yet his data in that piece shows that the youngest group saw a return above expected return of 24.84% and the next group saw a return of 11.59%. The difference in age? 106 days, or three and a half months. At most.

Would two months be significant? What about one? A week? A day? Realistically, the sample size of draftees isn't large enough to determine that. But my proposal is at least conceptually intriguing. What advantage has a player gained by being born earlier than another? What traits have they garnered?

In all, the idea that there may be advantages to drafting younger prep hitters seems plausible. Jazayerli's study claims to reflect the discovery of a group of players that has been historically undervalued. However, unless the attributes of that group can be isolated and researchers can test for said attributes' relationship to young age, there hasn't been a breakthrough discovery. Truly, Jazayerli's study reflects the discovery of an unspecified segment of a group of players that has unspecified undervalued traits. In conclusion, the piece suggests that, in lieu of discerning what characteristics a subset of the group of young draftees has, we should just assume that these players are all undervalued.

"Starting Them Young" sets a framework for discussion, but the findings presented lack focus and would not be reproducible by a Major League Baseball organization in any meaningful way. Correlation without causation. It's important to remember that, at its heart, sabermetrics is the search for objective knowledge about baseball. And throughout that search we must constantly be asking ourselves, "Why?"

JD Sussman is a full-time law student and co-founder of Bullpen Banter. He can be reached at or via Twitter


[1] His method is pretty cool. He created a best-fit formula to determine the expected value of each pick, then subtracted their Discounted WARP to get their surplus Discounted WARP.

[2] "'Very Young' players were less than 17 years, 296 days old on draft day; 'Young' players were between 17 years, 296 days and 18 years, 38 days; 'Average' players were between 18 years, 38 days and 18 years, 120 days; 'Old' players were between 18 years, 120 days and 18 years, 200 days; 'Very Old' players were more than 18 years, 200 days old."
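
For anyone who wants to play with these cutoffs, here is a minimal Python sketch of how a draftee could be sorted into the five buckets quoted in [2]. Only the cutoffs come from the quote. The function names, the choice to place an age that lands exactly on a cutoff into the older bucket, and the handling of February 29 birthdays are all assumptions made for illustration, not details from Jazayerli's piece.

```python
from datetime import date

# Exclusive upper bounds of each bucket as (years, days), taken from the
# quote in [2]. Anything at or above the last bound is "Very Old".
# ASSUMPTION: an age exactly on a cutoff falls into the older bucket;
# the quote does not say which side the boundary belongs to.
BUCKETS = [
    ((17, 296), "Very Young"),
    ((18, 38), "Young"),
    ((18, 120), "Average"),
    ((18, 200), "Old"),
]


def birthday_in(birth, year):
    """Return the player's birthday in the given year.

    ASSUMPTION: a February 29 birthday is treated as March 1 in
    non-leap years.
    """
    try:
        return birth.replace(year=year)
    except ValueError:
        return date(year, 3, 1)


def age_on(birth, day):
    """Return age on `day` as (whole years, days since last birthday)."""
    years = day.year - birth.year
    if (day.month, day.day) < (birth.month, birth.day):
        years -= 1  # this year's birthday has not happened yet
    last_birthday = birthday_in(birth, birth.year + years)
    return years, (day - last_birthday).days


def classify(birth, draft_day):
    """Map a birth date and draft date to one of the five bucket labels."""
    age = age_on(birth, draft_day)
    for upper_bound, label in BUCKETS:
        if age < upper_bound:  # tuples compare years first, then days
            return label
    return "Very Old"


if __name__ == "__main__":
    # Hypothetical birth date and draft date, for illustration only.
    print(classify(date(1985, 9, 1), date(2003, 6, 3)))
```

Ages are kept as (years, days) pairs rather than converted to a single day count, which avoids having to decide how long a "year" is when a leap day falls in between.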

[3] I can talk about misconceptions of ARL all day, but now is not the time or place. An additional property of age is that being young allows for more professional development time before reaching peak. My point here, though, isn't to discuss the "out-of-context" importance of age, just to highlight that age is significant.

[4] If he were comparing high school hitters to college hitters, as Bill James did, this would be a non-issue. There is a clear difference between the two: the college development system vs. developing in the minor leagues.

[5] To be honest, I forced a lot of them. Logically, I don't think these work. But they were important to illustrate that I gave this thought.

[6] An interesting topic to study.... I'm not saying this doesn't happen.

