It's been a long time since professional baseball was played – 113 days, to be precise – which means that, if you're like me, you're hankering for a fix, something to remind you that the upcoming season isn't that far off. You have basically two choices, the first being photos of meaningless spring training workouts, which can be alright...
...but I find them to be pretty unfulfilling once you get past the cream of the crop.
The second option, and my preferred choice, is poring over newly unveiled projections of your favorite players. It can be a good way to set your expectations for the upcoming season, for individuals or for teams, and it can be a good way to overreact and leap to conclusions. What better way than that to prepare for baseball?
Often, you've got several choices; see, for example, the above-pictured Bartolo Colon. If you're an optimistic person, you might feel an affinity for PECOTA, which puts him at a 3.71 ERA, but if you're not, you might prefer ZiPS, which has him at a much worse 4.11 ERA. One way to resolve this is to just pick the one you like better, but maybe you're looking to make an informed decision instead.
The aim of this post is to help make that decision possible, by laying out the details of the main projection systems: where they can be found, who runs them, what their basic methodology is, and what differentiates them from the other options. Not all this information is public – they make money off these systems, after all – but what is public can hopefully give a sense for why you might prefer one over the others in a given case.
Marcel is the logical system to start with, as it's the simplest and provides a sort of baseline for other systems to build on. Tom Tango, doyen of the sabermetrician community, came up with the concept in 2004, though the methodology is totally open to the public and hasn't changed since then, so to say he runs the projections is probably overstating it. Marcel just kind of... exists.
Marcel projections are calculated as follows*. As an example, we'll project Bryce Harper's 2016 home run total (which also requires projecting his 2016 plate appearances). Marcel projections use the last three years of performance, with the most recent weighted more heavily. We first multiply Harper's 2015 HRs by 5, his 2014 HRs by 4, and his 2013 HRs by 3, and add them together. To project plate appearances, we multiply his 2015 PAs by .5 and his 2014 PAs by .1, and add both of them to 200.
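The weighting and playing-time steps above can be sketched in a few lines of Python. This is only a rough illustration: the function names are mine, and the inputs are made-up numbers for a hypothetical hitter, not Harper's actual stat line.

```python
# Marcel weights for the three most recent seasons, newest first (from the text).
STAT_WEIGHTS = (5, 4, 3)


def weighted_total(values_newest_first):
    """Sum of season totals multiplied by the 5/4/3 weights."""
    return sum(w * v for w, v in zip(STAT_WEIGHTS, values_newest_first))


def projected_pa(pa_last_year, pa_two_years_ago):
    """Playing-time projection: 0.5 * last year + 0.1 * two years ago + 200."""
    return 0.5 * pa_last_year + 0.1 * pa_two_years_ago + 200


# Hypothetical hitter (illustrative values only), seasons listed newest first.
hr = [30, 20, 10]
pa = [600, 500, 400]

weighted_hr = weighted_total(hr)            # 5*30 + 4*20 + 3*10 = 260
expected_pa = projected_pa(pa[0], pa[1])    # 300 + 50 + 200 = 550
```

Note that the oldest of the three seasons counts toward the weighted home run total but plays no part in the plate appearance projection.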
Then, we calculate the leaguewide rate of home runs per plate appearance for each of the previous three years (excluding pitchers), and multiply those by Harper's plate appearances from each year and the annual weights, to get the weighted mean of home runs for someone with Harper's playing time over the last three years.
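Now we regress Harper's past performance toward this mean. We rescale the league-average figure to a rate per 1,200 PAs, add it to Harper's weighted home run total, and divide by his weighted plate appearances plus 1,200. The result is a per-plate appearance rate that averages Harper's performance with the league's, weighted by plate appearances.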
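Here is a minimal sketch of the regression step just described, assuming a hitter with at least one plate appearance in the three-year window. The function name, the flat .030 league home run rate, and the player's numbers are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
WEIGHTS = (5, 4, 3)      # newest season first, as in the text
REGRESSION_PA = 1200     # league-average playing time blended in


def regressed_rate(stat, pa, league_rate):
    """Per-PA rate after regressing three seasons toward the league mean.

    stat, pa, league_rate: three-item sequences, newest season first.
    Assumes the player has at least one PA (otherwise this divides by zero).
    """
    weighted_stat = sum(w * s for w, s in zip(WEIGHTS, stat))
    weighted_pa = sum(w * p for w, p in zip(WEIGHTS, pa))
    # What a league-average hitter would have produced in the same weighted PAs.
    league_stat = sum(w * p * r for w, p, r in zip(WEIGHTS, pa, league_rate))
    # Rescale that league-average production to 1,200 PAs.
    league_per_1200 = league_stat / weighted_pa * REGRESSION_PA
    return (weighted_stat + league_per_1200) / (weighted_pa + REGRESSION_PA)


# Hypothetical inputs (illustrative values only).
hr = [30, 20, 10]
pa = [600, 500, 400]
league_hr_rate = [0.03, 0.03, 0.03]

rate = regressed_rate(hr, pa, league_hr_rate)
# Raw weighted rate is 260 / 6200 (about .042); the regressed rate is .040,
# pulled part of the way toward the .030 league rate.
```

The more plate appearances a player has, the less the fixed 1,200 PAs of league-average performance moves his rate.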
Finally, we multiply that rate by Harper's projected number of plate appearances, and adjust the projection for age. Marcel assumes (based on evidence from previous studies) that players will improve on their past performance until roughly age 29, at which point they start to decline when compared to their previous three years. For a player under 29, like Harper, we subtract his age in 2016 (24) from 29, multiply that by .006, and increase the projection by that proportion; for a player over 29, we subtract 29 from their age, multiply that by .003, and decrease the projection by that proportion. This is almost always a minor shift.
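The age adjustment is simple enough to write out directly. The sketch below reads "increase the projection by that much" as a proportional bump (so .03 means 3%), which is my interpretation of the rule; the function names and the example rate and playing time are hypothetical.

```python
PEAK_AGE = 29


def age_multiplier(age):
    """Multiplier applied to the projection, following the rule in the text."""
    if age < PEAK_AGE:
        return 1 + (PEAK_AGE - age) * 0.006
    if age > PEAK_AGE:
        return 1 - (age - PEAK_AGE) * 0.003
    return 1.0


def final_projection(rate, projected_pa, age):
    """Regressed per-PA rate times projected PAs, adjusted for age."""
    return rate * projected_pa * age_multiplier(age)


# A 24-year-old gets (29 - 24) * .006 = .03, i.e. a 3% boost.
boost = age_multiplier(24)

# Hypothetical hitter: a .040 HR/PA rate over 550 projected PAs at age 24.
projected_hr = final_projection(0.04, 550, 24)   # 22 HR before age, about 22.7 after
```

A 35-year-old, by comparison, gets a multiplier of about .982, which shows how gently Marcel treats aging in either direction.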
This process can be repeated with essentially any statistic, for pitchers or hitters, using the same basic weighting and regression process. Marcel is very, very simple, meaning it's relatively easy to calculate. It doesn't consider peripherals or minor league performance, but it has proven very difficult to improve upon, even with more data.
Something notable about Marcel is that it projects anyone with zero PAs over the last three seasons (most often rookies) at league average in every category. You can see this by going through the above example, but changing Harper's plate appearances to zero in each year. It might seem that, for that reason, using other projection systems might be a better choice when it comes to rookies, but even on those players, Marcel seems to perform very well.
Because of their open-source nature, the Marcels are usually available in a few places, but you can find them for any year from 1901 to 2015 at baseballheatmaps.com, run by Jeff Zimmerman. Alternatively, you can always calculate them yourself! They provide a great baseline, illustrating the basic principles of regression, consideration of past performance, and greater emphasis on recent performance that underlie virtually every other system.
PECOTA is Baseball Prospectus's proprietary projection system, the 2016 iteration of which came out last week. It was created by Nate Silver and debuted in the 2003 BP Annual (available on Amazon for a mere $0.54 plus shipping and handling). The name technically stands for "Player Empirical Comparison and Optimization Test Algorithm," though it's also a reference to Bill Pecota, a late-'80s/early-'90s journeyman infielder. It's gone through a number of tweaks since its creation, with Nate Silver leaving BP in 2009 and the PECOTA responsibilities since then falling to Clay Davenport (through 2010), Colin Wyers (through 2014), and now Jonathan Judge, Rob McQuown, and Harry Pavlidis.
PECOTA, like Marcel, begins by calculating a baseline for each player using their past performances, with more recent years weighted more heavily. Where PECOTA differs from most other systems is that it then uses that baseline, along with the player's body type, position, and age, to identify various comparison players. The career trajectories of those comparison players are what lead to the forecasts. The closer the comparison player is to the projected player, the more weight that comparison player's career carries.
This means that PECOTA can offer predictions as answers to a wide variety of questions by looking at how the comparison players performed in the past. For example, each player's projection comes with a Breakout Rate, or the estimated likelihood that the player will beat his baseline weighted average from the previous three years by at least 20%, calculated by looking at how often the comparison players broke out in that fashion.
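PECOTA's internals aren't public, so the following is only a toy illustration of the idea, not BP's actual calculation. It assumes each comparison player comes with a similarity weight, a baseline, and a following-season result, and it counts the similarity-weighted share of comps who beat their baseline by 20%. The data structure, function name, and numbers are all hypothetical.

```python
def breakout_rate(comps, threshold=0.20):
    """Similarity-weighted share of comps who beat their baseline by `threshold`.

    comps: list of (similarity_weight, baseline, next_season) tuples.
    """
    total_weight = sum(weight for weight, _, _ in comps)
    breakout_weight = sum(
        weight
        for weight, baseline, next_season in comps
        if next_season >= baseline * (1 + threshold)
    )
    return breakout_weight / total_weight


# Hypothetical comps (illustrative values only).
comps = [
    (0.9, 3.0, 4.0),   # beat baseline by 33%: a breakout
    (0.6, 2.5, 2.6),   # beat baseline by 4%: not a breakout
    (0.5, 4.0, 5.0),   # beat baseline by 25%: a breakout
]

rate = breakout_rate(comps)   # (0.9 + 0.5) / 2.0 = 0.7
```

Because closer comps carry more weight, a breakout by the most similar player moves the estimate more than one by a distant comp.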
Another neat feature of PECOTA, enabled by its unique setup, is its percentile forecasts. Because every player has numerous comparable players to choose from, PECOTA can calculate not just the median performance level of those comps (the performance level where 50% of comps were worse), but also the performance level where 90% of comps were worse, or 10%, or 40%. PECOTA therefore offers a range of possibilities, and its best guess at the downside and upside cases for a given player.
For example, PECOTA is very bearish on Bryce Harper's 2016, projecting him for only 5.1 WARP, less than half his 2015 WARP of 11.2. His 90th percentile projection is just 7.4 WARP, indicating PECOTA thinks that even the best-case scenario for Harper is a substantial decline from his 2015. You can also imagine two players, both with a mean projection of 4.0 WARP, but one with a 90th percentile of 6.0 and a 10th percentile of 2.0, and one with a 90th percentile of 5.0 and a 10th percentile of 3.0. In that way, PECOTA's percentile forecasts can provide some insight into a player's floor, ceiling, and reliability – something most other systems can't do.
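To make the percentile idea concrete, here's a small sketch that reads percentiles off a pool of comp outcomes. It is not PECOTA's actual method, which isn't public; the weighted-percentile definition (the lowest outcome with at least that share of comp weight at or below it), the equal weights, and both sets of WARP outcomes are assumptions made up for illustration.

```python
def weighted_percentile(outcomes, weights, pct):
    """Lowest outcome with at least `pct` percent of comp weight at or below it."""
    pairs = sorted(zip(outcomes, weights))
    total = sum(weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running / total >= pct / 100:
            return value
    return pairs[-1][0]


# Two hypothetical players whose comps both average 4.0 WARP.
steady = [3.0, 3.4, 3.7, 3.9, 4.0, 4.0, 4.1, 4.3, 4.6, 5.0]
volatile = [2.0, 2.6, 3.2, 3.7, 4.0, 4.0, 4.3, 4.8, 5.4, 6.0]
weights = [1] * 10   # equal similarity weights, for simplicity

steady_range = (weighted_percentile(steady, weights, 10),
                weighted_percentile(steady, weights, 90))      # (3.0, 4.6)
volatile_range = (weighted_percentile(volatile, weights, 10),
                  weighted_percentile(volatile, weights, 90))  # (2.0, 5.4)
```

Both pools have the same average, but the second player's 10th-to-90th percentile range is more than twice as wide, which is exactly the floor/ceiling information a single-number projection hides.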
Steamer's history is quite different from that of the other systems. Instead of coming from an established sabermetrician, Steamer is the result of a high school project by Jared Cross, Dash Davidson, and Peter Rosenbloom, a teacher and students, respectively, at Saint Ann's School in Brooklyn in 2008. The system takes its name from Saint Ann's baseball team, the Steamers. It first produced forecasts for the 2009 season, and since then it's been hosted at FanGraphs and Razzball, though they also maintain a Steamer-specific website and Twitter.
Broadly, Steamer's methodology resembles Marcel's more than PECOTA's, though it is substantially more complex. Like Marcel, it uses a weighted average of past performance regressed toward league average, though how much each year is weighted and to what degree they are regressed varies between statistics and is set using regression analysis of past players rather than Marcel's semi-arbitrary and uniform method of 5/4/3 weights and 1,200 PAs of regression.
Steamer is fairly stripped-down and doesn't offer answers to the breadth of questions that PECOTA does. That said, in my analysis of the projection systems' accuracy in predicting pitcher and hitter performance in 2015, Steamer did better than its competitors on the whole and on most subgroups and statistics, showing that simplicity is perhaps an advantage. Those analyses were only based on a single year, however, so there's no guarantee they show actual accuracy and not just randomness, or that Steamer should be expected to repeat in 2016. Either way, all of Steamer's forecasts for a given year are freely available, and it is a consistent performer, despite not coming with many bells and whistles.
The final system I'm looking at is ZiPS, created by Dan Szymborski and available at FanGraphs. ZiPS's methodology overlaps with each of the other systems in certain respects, though it has some notable unique features. It was the first projection system (and I believe the only one of the systems in this article) to rely heavily on Voros McCracken's breakthrough on Defense Independent Pitching Statistics (DIPS), the concept that pitchers have little to no control over the rate at which balls in play fall in for hits (measured by "BABIP," batting average on balls in play). The name ZiPS stands for "sZymborski Projection System," but it's also a reference to McCracken's work.
Much like Marcel and Steamer, ZiPS starts by calculating a weighted average of past performance, using four years for most hitters (weighted at 8/5/4/3) and three years for very young or very old hitters and for pitchers. Similar to PECOTA, it then identifies comparable players, though the required similarity is much looser and the pools are much larger as a result. The impact of any individual comp's career is much smaller, so while the most comparable players are sometimes given alongside a projection (like post-1978 Robin Yount for post-2015 Xander Bogaerts), it's more for entertainment than analysis. Those groups of comparable players are then used to create estimates of future growth and decline.
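For a sense of how the 8/5/4/3 weights compare with Marcel's 5/4/3, the sketch below converts each set into the share of the baseline that comes from each season, then applies the four-year weights to a made-up stat line. It ignores playing time and everything ZiPS does after the baseline, and the function names and the on-base percentages are hypothetical.

```python
def weight_shares(weights):
    """Fraction of the baseline contributed by each season, newest first."""
    total = sum(weights)
    return [w / total for w in weights]


def weighted_average(values, weights):
    """Weighted average of per-season rates, newest season first."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


ZIPS_HITTER_WEIGHTS = (8, 5, 4, 3)   # four-year weights from the text
MARCEL_WEIGHTS = (5, 4, 3)

zips_shares = weight_shares(ZIPS_HITTER_WEIGHTS)   # 40%, 25%, 20%, 15%
marcel_shares = weight_shares(MARCEL_WEIGHTS)      # about 41.7%, 33.3%, 25%

# Hypothetical on-base percentages (illustrative values only), newest first.
obp = [0.360, 0.340, 0.350, 0.330]
baseline_obp = weighted_average(obp, ZIPS_HITTER_WEIGHTS)   # about .3485
```

The most recent season ends up with a similar share under both schemes; the main difference is that the four-year version spreads the remaining weight over one more season.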
For pitchers, ZiPS differs from other systems in its aforementioned reliance on DIPS theory. Pitchers have their BABIP forecast using their individual tendencies (e.g., ground ball rate, the defense behind them, whether they throw a knuckleball or not) and regressed heavily to the average for players with similar tendencies. Those results inform the pitcher's forecasted ERA, so ZiPS's pitcher projections will sometimes look very different from past performance, especially if a player is moving to a team with a better defense or a friendlier park.
It's very possible this piece created more confusion than it resolved. The projection systems have differences, but what those differences mean for their accuracy isn't always clear. Still, hopefully you can now make an informed guess about which to use in a specific case rather than an uninformed guess, and that's an improvement.
* The section on Marcel originally incorrectly stated the way the system projects plate appearances. After a helpful correction from Tom Tango, it was edited, and now has the correct method.
. . .
Henry Druschel is a Contributing Editor at Beyond the Box Score. You can follow him on Twitter at @henrydruschel.