Last week I wrote a post about how Jeff Samardzija of the White Sox isn't a huge fan of the data explosion that's occurred in baseball over the past 15-20 years, and in the course of writing it I began wondering about the origins of some of baseball's most hallowed stats. I did some lightweight research on things like batting average but wasn't getting anywhere when my eyes came upon a book I hadn't cracked open in at least ten years, my copy of Total Baseball (volume 8) in all of its 2,688-page, 11-pound glory. Sure enough, the answer I was looking for was right there (Henry Chadwick is credited with inventing batting average on page 953), and since I'd never read the essays in the book, I took the time to start reading the ones on statistics and sabermetrics.
Fascinating reading. Then I chanced across a formula called the Favorite Toy (page 970). I'd seen oblique mentions of it in the past but had never seen it explained. It estimates the chance that a player will reach a given threshold in a career category. It's a three-step process, and this example shows how to figure the odds that current players will reach 3,000 career hits:
|Step||Formula|
|Established Hits (EH)||(.5 * 2014 Hits) + (.333 * 2013 Hits) + (.167 * 2012 Hits)|
|Age Factor (AF)||24 - (2014 age * .6)|
|Final Formula||((EH * AF) / (3000 - Career Hits)) - .5|
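The three steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the book: the function names, the sample player, and the clamping of the result to the 0-to-1 range are all assumptions made for the example.

```python
def established_level(recent, prior, two_back):
    """Step 1: weight the last three seasons, most recent heaviest."""
    return 0.5 * recent + 0.333 * prior + 0.167 * two_back


def remaining_seasons(age):
    """Step 2: the age factor, an estimate of seasons left to play."""
    return 24 - 0.6 * age


def milestone_chance(season_hits, age, career_hits, target=3000):
    """Step 3: projected remaining hits over hits still needed, minus 0.5.

    season_hits is (most recent, one year back, two years back).
    Clamping to [0, 1] is an assumption; the steps above give only
    the raw formula.
    """
    needed = target - career_hits
    if needed <= 0:
        return 1.0  # milestone already reached
    projected = established_level(*season_hits) * remaining_seasons(age)
    return min(1.0, max(0.0, projected / needed - 0.5))


# Hypothetical player, not taken from the chart: 180, 170 and 160 hits in
# the last three seasons, age 30, 1,500 career hits.
chance = milestone_chance((180, 170, 160), age=30, career_hits=1500)
print(f"{chance:.0%}")
```

For this made-up player the established level is about 173 hits a year over an estimated six remaining seasons, which works out to roughly a one-in-five chance.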
Applying these steps gives the chances that certain current players will reach 3,000 hits:
|Player||Team||2014 Age||Career Hits||Chance of 3,000|
|Jose Reyes||Blue Jays||31||1773||19%|
|Dustin Pedroia||Red Sox||30||1371||12%|
Beltre clearly has had a career renaissance since joining the Rangers in 2011, which has to cause the Red Sox consternation as they continue to search for their next third baseman. Pablo Sandoval will be the answer going forward, but if they had kept Beltre after 2010 they might not have had to make that expensive acquisition.
The formula isn't perfect but does one very important thing by placing greater emphasis on more recent achievements (this formula uses slightly different inputs). When I posted a version of this chart on Facebook earlier this week, one person commented he was almost positive Albert Pujols wouldn't make it to 3,000. Given the issues he's faced since joining the Angels, it's a valid concern, but if Pujols can match his production of 2014 and stay healthy (and he'll have better luck with the former than the latter), around three more seasons could generate the hits needed to reach 3,000.
The formula also seems to discount the early starts for both Mike Trout and Starlin Castro, mainly by estimating that both will play only around ten more years. An argument can be made that Mike Trout has "only" 572 career hits, but he's still young. Since most baseball age stats use a player's age as of June 30th of a given year (Trout's birthday is August 7th, 1991, so make a note to send him a card then), he was viewed as a 22-year-old player in 2014, and this shows how he ranks for all players in baseball history through age 22--15th-most hits all-time, a pretty good start to a career.
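To see where the ten-year estimate comes from, the age factor can be evaluated on its own. A short sketch, using only ages quoted in this post (Trout at 22, Pedroia at 30, Reyes at 31); the function name is an assumption made for illustration.

```python
def remaining_seasons(age):
    """Age factor from the formula: 24 minus 0.6 times the player's age."""
    return 24 - 0.6 * age


for name, age in [("Trout", 22), ("Pedroia", 30), ("Reyes", 31)]:
    print(f"{name} (age {age}): {remaining_seasons(age):.1f} seasons")
```

At 22, Trout gets 10.8 estimated seasons, while a 30-year-old gets 6.0, so every extra year of age costs a player six-tenths of a projected season.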
Three magic numbers are pervasive in baseball, widely considered marks of distinction--3,000 hits, 500 home runs, and 300 pitching wins. Nobody disputes these are significant achievements, but they make sense only if they're relevant to current performance. This chart shows how many players have achieved these thresholds in baseball history:
|Eligible Players||Threshold||Players||Threshold||Players||Threshold||Players||Active Leaders|
|8564||500 HR||26||450||37||400||51||Rodriguez (654), Pujols (520), Ortiz (466), Beltre (395)|
|8564||3000 H||28||2500||98||2000||278||Rodriguez (2939), Suzuki (2844), Beltre (2604), Pujols (2519)|
|8625||300 W||24||250||47||200||115||Hudson (214), Sabathia (208), Colon (204), Buehrle (199)|
26 players have hit 500+ home runs, out of a total of 8,500 players (approximately--I did the best I could to eliminate pitchers). A total of 37 have hit 450+, and 51 players in baseball history have hit at least 400 homers, around .6 percent of all non-pitchers. The active leaders are in the last column.
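The rarity is easy to check from the chart's own counts. A small sketch using the home run figures above; the variable and function names are assumptions, and the 8,564 total is the approximate non-pitcher count.

```python
NON_PITCHERS = 8564  # approximate count from the chart above

# Home run thresholds and the number of players who reached each one
hr_counts = {500: 26, 450: 37, 400: 51}


def share_percent(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


for threshold, count in hr_counts.items():
    share = share_percent(count, NON_PITCHERS)
    print(f"{threshold}+ HR: {count} players, {share:.2f}%")
```

Even the 400-homer group rounds to only six-tenths of one percent, and the 500-homer club is about half that size.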
What matters isn't so much that a player hits 500 home runs or amasses 3,000 hits as recognizing how rare these achievements are. It's too easy to turn these tremendous achievements into barriers to recognizing greatness, a pitfall that should be avoided for any number of reasons. In my next post, I'll take this one step further and show the changes in season totals for hits, home runs, and games played, as well as discuss a fascinating reason why this is occurring. Stay tuned.
Scott Lindholm lives in Davenport, IA. Follow him on Twitter @ScottLindholm.