Sabermetrics and Diminishing Returns

Some have said the golden age of sabermetrics has passed. What does that mean for the future of baseball research and for how we understand the game?


While many would suggest baseball has dramatically changed since its inception, has it? There have been rule modifications like lowering the mound, the DH, and drug testing, but fans from the early 1900s would still be able to watch and enjoy a modern baseball game with no real problems. In terms of evaluating and studying the game, we have also made significant progress, yet measures like batting average and wins are still tracked and remain very relevant today.

Still, we have a much firmer grasp on the game we love than we did 10 or even 5 years ago, and each day we are learning more. But what's the next step; what's the future of sabermetrics?

Often, the best place to start with a question like that is to take a look back at the past. Jack Moore did just that earlier this week.

Here, in this question, lies the foundation of sabermetric thought. Baseball demands numbers. No fandom looks to its statistical history with more frequency or reverence than baseball's, and no sport has a statistical record as clean or as robust as baseball's. Data demands analysis, and thus the early statistics like batting average and earned run average were born. Lane's question is specific, but it alludes to deeper, more primal concerns: Do our measurements describe what happens on the field? Are we closer to understanding how baseball teams score runs, get outs, and win games?

Luckily, baseball has some great forefathers who have taken on the task of tearing down conventional wisdom and finding real answers to questions. Men like Bill James, Pete Palmer, Clay Davenport, and Tom Tango helped create and inspire many measures and statistics to better evaluate past performance and predict future performance. More recently, the wonders of technology have allowed a small army to take on some big baseball topics.

The internet revolution - and the nerd revolution it brought on - has created spaces for niche communities of people who would otherwise be too isolated and restricted for those communities to exist. The relatively small group of people who saw a James Baseball Abstract or Palmer's Hidden Game of Baseball as among the most influential literature of their lifetimes finally had a robust space to unite and discuss their ideas.

From that so-called internet revolution has come some fantastic work. Two prime examples are Mike Fast developing new ways to measure pitch framing and Dan Brooks' (and others') work with PITCHf/x, but there has certainly been a lot more great research and commentary that has entered the community.

Over time, the army of new sabermetricians has helped to refine and improve the metrics we use. OPS has become wOBA and TAv, fielding percentage has become UZR and DRS, and stolen bases have become speed score. Really, we need look no further than the creation (and subsequent revisions) of WAR for a sign of how far we have come in recent years.

Despite all that good work, eye-opening developments have all but stalled in recent times, as Moore points out.

For all the progress since "Why the System of Batting Averages Should Be Changed," Lane's assessment of the relative state of batting, fielding, and pitching analysis still rings true today. Batting statistics are the most accurate - least debated, certainly - of the set. Pitching and defense - and the question of how to separate the two - remain somewhere between murky and incomprehensible.

Since Baseball Prospectus and FanGraphs made Wins Above Replacement publicly available in the middle of the decade, though, there has been all of one radical, game-changing sabermetric discovery: the notion of catchers impacting the game with pitch-framing.

As Moore points out, this breakthrough has not been small. It has suggested that some of the top catchers do as much for their teams by framing as Ryan Braun and Miguel Cabrera do for their teams at the plate, an amazing conclusion when you really think about it.

Unfortunately, other than catcher framing, the sabermetric community has been without a breakthrough for some time. Many, including Moore, have suggested that much of that has to do with the proprietary nature of new data. PITCHf/x, for example, is freely available in the public domain, and therefore hundreds (possibly thousands) of bright baseball minds have been able to work on the data. But now teams have begun to pick off our best thinkers, and new data like HITf/x and FIELDf/x will not be available for public use. At this year's MIT Sloan Sports Analytics Conference, Nate Silver offered the opinion that we are approaching a period of diminishing returns in terms of our understanding of the game - have we reached that point? And must we wait for better data before we see any true breakthroughs?

There will be other Jameses and other Palmers - of this I have no doubt. The majority of our research, though, will fall behind the work of the teams; locked out from their information for the foreseeable future, it will be impossible to maintain pace. So the crisis, for outsiders, is not how to stay at the forefront of baseball research. The crisis for outsiders is keeping an audience even when we no longer possess the utmost authority on baseball research - whether or not this authority was deserved, we did have it, for a fleeting moment in baseball history.

Despite enjoying and agreeing with most of Moore's article, I disagree with this section on several levels. First of all, I have always believed that those who work for teams are often the first to develop new information in baseball research. Not only are they often some of the most intelligent people (which is why they work in baseball operations for big league teams in the first place), but they also have better information than the public does, from medical records to scouting reports to information on player makeup. On the flip side, however, if Moore, or anyone else for that matter, thought that the public domain was leading the charge, then I see no real reason that would not continue. By pure brute force, the internet community is significantly larger than the sum of those working for teams, and I think that will always be a distinct advantage.

Truthfully, I think the answer to the question of diminishing returns lies in what we consider a breakthrough. While we haven't recently seen anything as earth-shattering as, say, DIPS theory, every day, or at the very least every week, there is some fantastic research published on the internet. In the last few months alone, BtBS has published research pieces on how many top prospects turn into top big leaguers, predicting strikeouts, and pitcher similarity scores, and that's just on one site made up of 20 or so writers. So has sabermetric research reached a plateau? I certainly don't think so. As long as there are people out there asking good questions and never settling for partial answers, we will continue to learn.

. . .

Andrew Ball is a writer for Beyond the Box Score, Fake Teams, and Fantasy Ninjas.

You can follow him on Twitter @Andrew_Ball.