BtB's 50 Best of the Next 5 Years - Intro and Methodology

Over the next two weeks we will unveil our (and when I say our I really mean the data's) list of the 50 Best Players of the Next Five Years.  Your first question is probably, "How is this going to be any different than FanGraphs' Trade Value Series?"  The simple answer is that this list will ignore contractual status.  This list approaches the problem from an "all the contracts in MLB have been ripped up and we're picking teams playground style" angle.  With that premise in mind, the goal was to come up with a data-driven list rather than a list based on an author's opinion.  The particular data in question is 5-year projected WAR, using inputs I will outline for everyone now.

For position players I used pre-season CHONE projections to derive wOBA, which I then aged over the 5 years using results from MGL's aging study.  For the defensive component I used my own defensive projections when available and CHONE for players that I hadn't projected.  I did not account for position switches over the course of the five years that were projected.  Playing time was the most difficult component to project.  Since the list is meant to be a "playground style list," it made sense to me to give all position players "starter caliber" playing time.  To that end I found the average of the top ten in PA by position over the last two years.  I then averaged each player's last two years of PA and regressed that toward the positional average.  The modeled PA was the maximum of the positional average and the regressed average.  Applying that playing time across all of the WAR components leads to the overall WAR value.
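To make the playing time step concrete, here is a minimal sketch of it in Python.  The function names and the 50/50 regression weight are my own assumptions for illustration (the write-up above does not say how heavily the average was regressed), and the PA totals in the usage lines are made up.

```python
def positional_average(top_pa_totals):
    """Average PA of the top players at a position (e.g. the top ten)."""
    return sum(top_pa_totals) / len(top_pa_totals)


def modeled_pa(pa_last_two_years, positional_avg, regression_weight=0.5):
    """Starter-caliber PA estimate for one position player.

    regression_weight is a hypothetical parameter: the share of the
    estimate that comes from the positional average.
    """
    # Average the player's last two seasons of PA.
    player_avg = sum(pa_last_two_years) / len(pa_last_two_years)
    # Regress that average toward the positional average.
    regressed = (regression_weight * positional_avg
                 + (1 - regression_weight) * player_avg)
    # Take the larger of the two, so nobody falls below starter playing time.
    return max(positional_avg, regressed)


# Made-up numbers: a durable regular and a part-timer at a 650 PA position.
print(modeled_pa([700, 720], 650.0))
print(modeled_pa([300, 400], 650.0))
```

Because of the maximum, the positional average acts as a floor: the part-timer gets the full positional average, while the durable regular keeps some credit for playing more than a typical starter.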

For pitchers I used CHONE's context-neutral ERA and an innings pitched projection that mirrors the PA projection above.  The only difference is that I looked at the top 80 starters and relievers to get the positional averages.  For aging I leveraged a blog post from MGL.  The CliffsNotes version is that the curve is flat from ages 21-26 and then goes up 0.2 runs allowed per season.
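As a rough illustration of that pitcher aging curve, the sketch below ages a starting ERA over five seasons.  It assumes the 0.2 is added to ERA for every season in which the pitcher is older than 26, and that the starting ERA already describes the first projected season; the function name and the sample ERA are hypothetical.

```python
def aged_era(base_era, age, years=5, flat_through=26, step=0.2):
    """Project ERA for `years` seasons, starting with the season at `age`.

    Assumption: ERA holds steady through age `flat_through`, then rises
    by `step` runs for each later season.
    """
    projections = []
    era = base_era
    for offset in range(years):
        season_age = age + offset
        # The first season uses the base projection as-is; later seasons
        # add the aging penalty once the pitcher is past the flat range.
        if offset > 0 and season_age > flat_through:
            era += step
        projections.append(round(era, 2))
    return projections


# A 25-year-old with a 3.50 ERA projection: flat at 25 and 26, then rising.
print(aged_era(3.50, 25))  # [3.5, 3.5, 3.7, 3.9, 4.1]
```

A 30-year-old with the same starting ERA would rise every season under these assumptions, which is one reason younger arms gain ground on a five-year list.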

A couple of caveats worth mentioning:

  1. The projections are the mean projections and do not address the uncertainty levels.  This is especially important since we are projecting a lot of young players where the uncertainty level is going to be very high.
  2. The same aging curves were applied to all players.
  3. The playing time estimations were not aged (i.e. same number of PAs/IPs over the 5 years).
Even with those caveats, I still think the method will generate a pretty solid list that will be well worth discussing, so please stop by and discuss what the computer has spit out.  I doubt anyone (even me) will agree with the list in its entirety, so come back with good points and counterpoints about who should be included and excluded.  The schedule for posts will look something like this:

Monday May 31 - Intro and Methodology
Tuesday June 1 - Players 50-41
Wednesday June 2 - Players 40-31
Thursday June 3 - Players 30-21
Monday June 7 - Players 20-11
Tuesday June 8 - Players 10-1
Wednesday June 9 - Wrap Up (Interesting tidbits and a full data dump)