## I'm just projecting

(I originally posted this at Twinkietown, but I thought the subject matter would be interesting to all y’all here.  Btw, I should give some props to my good friend Erik, who helped me whenever the statistics were getting over my head.  He also convinced me it was a good idea to be a big nerd about baseball stats, although I’m not sure I should thank him for that.)

So I did some pitcher projections (spreadsheet link here).  A little project that turned into a little obsession.

I projected every pitcher with at least 50 IP in the last 3 years.  Maybe I’ll get around to doing the rest, but so far I haven’t had the initiative to dig up minor league stats, and frankly, projecting those guys is a crap shoot anyway.

Actually, all pitcher projections are sort of a crap shoot.  They’re  probably worse than you think, as I try to show below.  If you’re not interested in that, skip ahead to Part II where I get into the nuts and bolts of my projection system: YAPS!

PART I—REFLECTIONS ON PROJECTIONS (WITH DIGRESSIONS ON REGRESSIONS)

I started on this little adventure wondering how hard it was to approximate the results of the major systems’ projections using a pretty low-IQ approach.  With hitting it’s shockingly easy.  For hitters, and especially for hitters with a modicum of MLB experience, the projection systems are fun and all, but they basically provide no added value over a really simple, intuitive projection system—the sort of thing you’d do in your head in about 10 seconds looking at the back of a baseball card.  Tom Tango’s MARCEL is designed to be exactly this sort of "dumb" system, and after reading this post by Tango, it’s hard to avoid the conclusion that none of the "smart" systems really provide meaningful separation from MARCEL when projecting hitters.  Here are Tango’s results for the average error the systems produced on wOBA, a measure of overall offensive performance:

0.0272 Chone
0.0273 Oliver
0.0277 Zips
0.0278 Marcel
0.0280 Pecota

You’ll notice that there isn’t that much action in those numbers at all until you get to the fourth decimal place.  Nobody in normal usage, as far as I know, rounds wOBA or any other baseball stat to the fourth decimal place.  Comparables, BABIP analysis, park factors, and all that other fancy jazz they throw into the stew—it gives you varied and maybe "funner" projections, but when it comes to actually predicting the future, the value added is microscopic to nonexistent.

Is it any different for pitchers?  Let’s examine a group that’s relatively easy to project:  established MLB starters, defined as having at least 300 IP as a starter over the previous 3 years.  For my convenience, we’re just going to look at predicting 2009 and 2010 results.  I’m also going to limit the data to pitchers who ended up pitching at least 65 innings in the year we’re trying to predict, since we’re not really interested in explaining the oddities of pitcher ERAs in small sample sizes.  (This also adds selection bias and I’m fine with that: the goal here is to predict the ERAs of pitchers that actually end up pitching in MLB, not how well pitchers would have pitched if they had been in MLB.)  That gives us a group of 164 pitcher seasons.

Here is a comparison of various methods of predicting pitcher ERA.  They are measured by average error—basically how far off the mark the ERA projections are on average.  You might be surprised at how wrong they tend to be.

Previous year’s ERA: 0.91

3-year weighted ERA with 4,3,2 weighting: 0.90

PECOTA: 0.89

MARCEL: 0.89

CHONE: 0.88

ZIPS: 0.87

(Some systems have several versions that all come out before the start of the year and in those cases I tried to get the latest version I could find.  Some error on my part in assessing the systems is also entirely possible.)

In terms of "R squared"—basically how much variance in player ERA the system explained with its projections—the systems range from about 0.10 for previous-year ERA to 0.17 for ZIPS.  So a pitcher’s previous year’s ERA explains about 10% of the variance in their ERA for the following year while MARCEL explains about 15%, and ZIPS 17%.
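If you want to run the same sort of comparison yourself, here is a minimal sketch of the two yardsticks. It assumes "average error" means the mean absolute miss and "R squared" means the squared correlation between projected and actual ERA; the function names and the four pitchers' numbers are made up for illustration and are not the data behind the figures above.

```python
def mean_abs_error(projected, actual):
    """Average absolute miss between projected and actual ERA."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)


def r_squared(projected, actual):
    """Squared Pearson correlation between projections and results."""
    n = len(actual)
    mean_p = sum(projected) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


# Hypothetical ERAs for four imaginary pitchers (illustration only).
projected = [3.50, 4.00, 4.50, 5.00]
actual = [3.00, 4.75, 4.25, 4.50]

print(round(mean_abs_error(projected, actual), 2))
print(round(r_squared(projected, actual), 2))
```

Feed it one list of projections per system and the same list of actual ERAs, and you get one row of the comparison per system.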

This is not a very high degree of precision, folks.  And no one is really blowing MARCEL and the basic ERA predictions out of the water.  If you guessed that every one of these pitchers would simply match their previous year’s ERA you’d probably be labeled a complete baseball-stat knuckle dragger, but on average, you’d only be 0.04 ERA wrong-er than the best of these projection systems, ZIPS.  You’d probably have a worse projection than ZIPS for 55% of pitchers or so, but hell, for the rest you’d actually be closer!

It’s something to think about whenever you get your lunch money taken by a stat-head bully.  As far as I can see, the sabermetric revolution just hasn’t revolutionized our ability to predict the baseball future.  MARCEL is pretty close to the leaders and MARCEL is supposed to be stupid.  It’s basically the 4,3,2 weighted-ERA method but with some added regression to the mean and an age adjustment.  It’s not designed to be a good system.  But the systems that are designed to be good are only ever-so-slightly better...and occasionally worse!
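To make the "dumb" recipe concrete, here is a rough sketch of a 4,3,2 weighted ERA with regression to the mean bolted on. This is my illustration of the description above, not Tango's actual MARCEL code: the function names, the league ERA, and the `ballast_ip` amount are all hypothetical, and the age adjustment is left out entirely.

```python
def weighted_era(eras, weights=(4, 3, 2)):
    """Weighted average of season ERAs, most recent season first."""
    used = weights[:len(eras)]  # handle pitchers with fewer than 3 seasons
    return sum(w * e for w, e in zip(used, eras)) / sum(used)


def regress_to_mean(era, innings, league_era, ballast_ip):
    """Shrink an ERA toward the league ERA.

    ballast_ip is a hypothetical number of league-average innings
    mixed in; fewer real innings means more shrinkage.
    """
    return (era * innings + league_era * ballast_ip) / (innings + ballast_ip)


# Made-up pitcher: ERAs of 3.00, 4.00, 5.00 (most recent first), 600 IP total.
raw = weighted_era([3.00, 4.00, 5.00])
projection = regress_to_mean(raw, innings=600, league_era=4.20, ballast_ip=200)
print(round(raw, 2), round(projection, 2))
```

The regression step pulls the raw number part of the way back toward league average, which is most of what separates this from the plain 4,3,2 method.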

Another thing that strikes me:  You can throw out all sorts of very different projections and still come up with similar results on average.  Just look at the variety in individual projections for ZIPS, CHONE, MARCEL, PECOTA, or previous-year’s ERA for that matter.  CHONE hated Jon Garland in 2010.  ZIPS liked him.  ZIPS wasn’t really sold on Jon Lester in 2010.  CHONE was.  CHONE thought Kevin Millwood was garbage in 2009 and ZIPS thought he was better than average.  And on and on.  They miss or score on individuals, but on average, they all end up about equally right (and wrong).  Heck, you might as well shop around for the system that paints your favorite players in a positive light.  It’s probably almost exactly as good as the systems that don’t give them proper respect.

My system does not alter these overall insights, but I humbly suspect it might be a little better than the rest.  I was able to get to 0.85 error and an R squared of 0.21 for the same starters over the same time span.  I tried to make it a "fair fight."  I developed my model using only pre-2009 data so as to be on equal footing with the other systems and then compared my 2009 and 2010 "predictions" to the others.  I didn’t incorporate anyone else’s projections or projection averages into my model.  It’s all from basic stats available on Fangraphs and Baseball Prospectus.  Unless I made an error somewhere—which I should emphasize is a not-entirely unlikely possibility given all the data manipulation I did—I think I beat them more or less fair and square.

My results for the established-starter group were exciting enough (to me at least) that I went ahead and did all pitchers with at least 50 IP over the last three years.  I was generally able to get similar separation from ZIPS in other cohorts.

But this sort of talk is cheap.  Anyone can claim they shuffled numbers around until they beat projections of years past.  Part of posting this is that I wanted my numbers to be out there so I could take on the other systems on a truly even playing field.  I might get my hat handed to me by the pros and semi-pros out there, but frankly, I don’t expect to.  If I’m successful this year, maybe I’ll expand this to a full projection next offseason.

PART II—THE "YAPS" PROJECTIONS

Every system needs a catchy acronym, so mine’s "YAPS" for Yet Another Projection System.  And even more than an acronym, every system needs a catchy gimmick or two.  (Coke and Pepsi will tell you that when you have a product that doesn’t really differentiate itself much on quality, you better pay attention to marketing.)  I’ve got three.

The scouting report.  Maybe YAPS’s coolest gimmick is a "scouting report" for each pitcher it projects.  Here’s a sample for the Twins’ starters:

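| NAME | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|
| Francisco Liriano | 3.32 | 3.49 | 0.87 | 65 | 51 | 61 |
| Scott Baker | 3.81 | 3.98 | 0.87 | 59 | 63 | 45 |
| Brian Duensing | 4.17 | 4.33 | 0.97 | 47 | 64 | 59 |
| Carl Pavano | 4.25 | 4.43 | 0.87 | 42 | 70 | 44 |
| Kevin Slowey | 4.26 | 4.44 | 0.87 | 50 | 71 | 34 |
| Nick Blackburn | 4.63 | 4.80 | 0.87 | 32 | 66 | 40 |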

The scouting report is the last three columns.  "Stuff" is basically the ability to miss bats and get strikeouts.  "Control" is avoiding walks.  "Contact" then is a multifaceted measure of getting good results when the batter actually makes contact.

Based on the sum of the factors that go into each scouting category and their impact on YAPS’s ERA projection, I do a calculation that produces a grade on the 20-80 scouting scale.  Fifty is average and each 10 points up or down represents one standard deviation from the mean.  (Unlike the true scouting scale, YAPS occasionally will grade someone as more than three standard deviations away from the mean; Carlos Marmol and Dontrelle Willis, I’m looking at you.)

For each pitcher, the relevant average and standard deviation are for pitchers with similar MLB experience (e.g., all pitchers with 300+ IP in the last three years are compared to each other, but not to, say, relief pitchers).  There is also an adjustment that makes the scouting report team- and league-neutral, so you can compare pitchers on a basically even playing field.
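The grading arithmetic boils down to a z-score rescaled so that 50 is average and 10 points is one standard deviation. Here is a minimal sketch under some loud assumptions: each category is collapsed to a single number where higher is better, the cohort is just a list of those numbers for similarly experienced pitchers, and I use the population standard deviation. The function name and the cohort values are hypothetical, and the team and league adjustment is not shown.

```python
from statistics import mean, pstdev


def scouting_grade(value, cohort_values):
    """Convert a raw component score to the 20-80 scale.

    50 is the cohort average; every 10 points is one standard deviation.
    Grades are deliberately not capped at 20 or 80.
    """
    mu = mean(cohort_values)
    sigma = pstdev(cohort_values)
    return round(50 + 10 * (value - mu) / sigma)


# Hypothetical raw "stuff" scores for a cohort of five similar pitchers.
cohort = [1.0, 2.0, 3.0, 4.0, 5.0]
print([scouting_grade(v, cohort) for v in cohort])
```

Because nothing is capped, an extreme pitcher can land above 80 or below 20, which is how grades like 90 or 11 can show up.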

There are elements that go into YAPS that aren’t represented in the scouting report (age adjustment, for instance), but the scouting report gives you a good look at the main things that are driving it.  The three components are not given equal weight, but none of the three overwhelms the others in impact.  Generally, stuff is the most important, followed by contact and then control; but again, all three are important for all groups of pitchers.

Error.  Another feature is an "error" column that gives an estimate of the sort of standard error you should expect.  On average, YAPS will be about as wrong as that error number.  Error basically relates to MLB experience—more experience, a better projection.  If you look at the error numbers, it might seem like YAPS is fairly unsure of itself, but it’s basically just being transparent.  I’m pretty sure no other system is doing much better.

Neutral ERA.  Finally, YAPS gives a "Neutral ERA."  Neutral ERA is what the ERA projection would be if everyone played for the same team.  It’s a good way to compare pitchers who play for different teams on a talent basis.  Take a look at:

| NAME | Team | ERA | Neutral ERA |
|---|---|---|---|
| Brian Duensing | twins | 4.33 | 4.17 |
| Brian Matusz | orioles | 4.32 | 3.99 |
| R.A. Dickey | mets | 4.24 | 4.27 |
| Vicente Padilla | dodgers | 4.26 | 4.39 |

The Neutral ERA numbers tell you that YAPS thinks Duensing and Matusz are better pitchers than Dickey and Padilla, but it doesn’t think they’ll put up better ERAs given the teams they play for.  Kind of cool, eh?
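Here is a small sketch of that comparison using the four pitchers above. It only sorts the two projected columns and subtracts one from the other; it says nothing about how YAPS actually computes the team adjustment, and the variable names are mine.

```python
# (name, team, projected ERA, Neutral ERA) for the four pitchers above
pitchers = [
    ("Brian Duensing", "twins", 4.33, 4.17),
    ("Brian Matusz", "orioles", 4.32, 3.99),
    ("R.A. Dickey", "mets", 4.24, 4.27),
    ("Vicente Padilla", "dodgers", 4.26, 4.39),
]

# Rank by talent (Neutral ERA) and by expected results (ERA).
by_talent = [p[0] for p in sorted(pitchers, key=lambda p: p[3])]
by_results = [p[0] for p in sorted(pitchers, key=lambda p: p[2])]

# Positive gap: the team context is projected to cost the pitcher runs.
context_gap = {name: round(era - neutral, 2) for name, _, era, neutral in pitchers}

print(by_talent)
print(by_results)
print(context_gap)
```

The two orderings are nearly reversed: Matusz and Duensing lead on talent, while Dickey and Padilla lead on projected ERA.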

Another thing to keep in mind when comparing pitchers on a talent basis: YAPS doesn’t know anything about 2011 depth charts.  It will more or less assume that starters will keep starting and relievers will keep relieving and pitchers with a mixed record of starting and relieving will keep doing the same.  For example, YAPS thinks Phil Coke will put up a 3.89 ERA, but it has no idea the Tigers are thinking of making him a starter.

My projections.  Here, first, are YAPS’s Top 26 (not 25, since that wouldn’t get Scott Baker in there) "established" starting pitchers by neutral ERA:

| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Felix Hernandez | SEA | 2.97 | 3.19 | 0.87 | 60 | 56 | 68 |
| Roy Halladay | PHI | 3.10 | 2.99 | 0.87 | 58 | 76 | 62 |
| Josh Johnson | FLO | 3.10 | 3.04 | 0.87 | 65 | 53 | 67 |
| Clayton Kershaw | LAN | 3.11 | 2.98 | 0.87 | 67 | 32 | 73 |
| Justin Verlander | DET | 3.12 | 3.27 | 0.87 | 67 | 54 | 64 |
| Cliff Lee | PHI | 3.28 | 3.18 | 0.87 | 55 | 77 | 58 |
| Jon Lester | BOS | 3.30 | 3.56 | 0.87 | 66 | 48 | 63 |
| Francisco Liriano | MIN | 3.32 | 3.49 | 0.87 | 65 | 51 | 61 |
| Jered Weaver | ANA | 3.34 | 3.50 | 0.87 | 75 | 60 | 47 |
| Zack Greinke | MIL | 3.36 | 3.13 | 0.87 | 56 | 62 | 60 |
| Ricky Romero | TOR | 3.37 | 3.70 | 0.87 | 53 | 45 | 68 |
| David Price | TBA | 3.38 | 3.64 | 0.87 | 66 | 48 | 57 |
| Clay Buchholz | BOS | 3.40 | 3.66 | 0.87 | 59 | 46 | 63 |
| Adam Wainwright | SLN | 3.49 | 3.28 | 0.87 | 53 | 53 | 64 |
| John Danks | CHA | 3.49 | 3.62 | 0.87 | 58 | 51 | 57 |
| Ubaldo Jimenez | COL | 3.52 | 3.30 | 0.87 | 57 | 32 | 72 |
| Tim Lincecum | SFN | 3.53 | 3.38 | 0.87 | 64 | 42 | 61 |
| Hiroki Kuroda | LAN | 3.58 | 3.45 | 0.87 | 49 | 58 | 66 |
| CC Sabathia | NYA | 3.63 | 3.82 | 0.87 | 58 | 52 | 61 |
| Cole Hamels | PHI | 3.70 | 3.60 | 0.87 | 66 | 53 | 47 |
| Max Scherzer | DET | 3.73 | 3.88 | 0.87 | 65 | 44 | 50 |
| Chad Billingsley | LAN | 3.74 | 3.60 | 0.87 | 51 | 39 | 64 |
| Johan Santana | NYN | 3.74 | 3.71 | 0.87 | 57 | 55 | 56 |
| Dallas Braden | OAK | 3.75 | 3.89 | 0.87 | 48 | 62 | 57 |
| Tommy Hanson | ATL | 3.80 | 3.70 | 0.87 | 52 | 49 | 54 |
| Scott Baker | MIN | 3.81 | 3.98 | 0.87 | 59 | 63 | 45 |

Maybe not a lot of real shockers there.  YAPS likes Liriano and Romero a good deal more than ZIPS and PECOTA and similarly dislikes Sabathia and Lincecum.  This group of pitchers is the easiest to project that I looked at, but still, the error is considerable.  YAPS thinks there’s about a 50% chance that Cliff Lee’s ERA will be between 2.31 and 4.05, but then, there’s an equal chance it won’t be.  Again, the idea is to be honest about the fact that projecting pitchers is a really dicey affair.
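The Cliff Lee range is just his projected ERA plus or minus the error column. A tiny sketch of that arithmetic (the function name is mine; the "about 50%" coverage is the claim made above, not something this code checks):

```python
def error_band(projected_era, error):
    """Return the (low, high) range one 'error' either side of the projection."""
    return round(projected_era - error, 2), round(projected_era + error, 2)


# Cliff Lee: projected ERA 3.18, error 0.87
low, high = error_band(3.18, 0.87)
print(low, high)  # 2.31 4.05
```

Run the same thing on a reliever with an error of 1.15 or 1.31 and the band gets noticeably wider.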

Now for the not-so-established starters.  These guys are both harder to project and also produce more variation between projection systems.

| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Brandon Morrow | TOR | 3.28 | 3.59 | 0.97 | 81 | 45 | 49 |
| Joba Chamberlain | NYA | 3.35 | 3.53 | 0.97 | 68 | 54 | 58 |
| Marc Rzepczynski | TOR | 3.38 | 3.69 | 0.97 | 63 | 49 | 67 |
| Jhoulys Chacin | COL | 3.53 | 3.32 | 0.97 | 66 | 32 | 69 |
| Brett Anderson | OAK | 3.57 | 3.71 | 0.97 | 49 | 65 | 66 |
| Tim Stauffer | SDN | 3.63 | 3.52 | 0.97 | 53 | 52 | 73 |
| Mat Latos | SDN | 3.63 | 3.53 | 0.97 | 67 | 50 | 52 |
| Madison Bumgarner | SFN | 3.81 | 3.66 | 0.97 | 53 | 64 | 53 |
| Chad Gaudin | nya | 3.84 | 4.02 | 0.97 | 60 | 52 | 55 |
| Jose Contreras | PHI | 3.84 | 3.74 | 0.97 | 59 | 56 | 58 |
| Jaime Garcia | SLN | 3.85 | 3.66 | 0.97 | 52 | 40 | 70 |
| Phil Hughes | NYA | 3.89 | 4.07 | 0.97 | 63 | 55 | 43 |
| Derek Holland | TEX | 3.91 | 3.85 | 0.97 | 59 | 54 | 49 |
| Brett Cecil | TOR | 3.91 | 4.22 | 0.97 | 52 | 60 | 54 |
| C.J. Wilson | TEX | 3.93 | 3.88 | 0.97 | 57 | 38 | 69 |
| Jonathon Niese | NYN | 3.96 | 3.94 | 0.97 | 54 | 50 | 56 |
| Kris Medlen | ATL | 3.98 | 3.89 | 0.97 | 55 | 53 | 51 |
| Brian Matusz | BAL | 3.99 | 4.32 | 0.97 | 59 | 57 | 43 |
| Jordan Zimmermann | WAS | 4.01 | 3.79 | 0.97 | 60 | 50 | 46 |
| Daniel Hudson | ARI | 4.06 | 3.88 | 0.97 | 61 | 48 | 43 |
| Ian Kennedy | ARI | 4.06 | 3.88 | 0.97 | 59 | 47 | 47 |

There are a lot of starters in this group that YAPS thinks are OK or even very good, but other systems just hate: Rzepczynski, Chacin, Dontrelle Willis (4.59 ERA by YAPS, although he’s an odd duck with a control score of just 11!), Dana Eveland (4.37), Carlos Silva (4.26), and Glen Perkins (4.70), just for example.  A lot of that is disagreement on talent level, but YAPS might also be taking advantage of selection bias more than other systems: if they’re really bad, they just won’t play.*   YAPS also tends to think that marginal starters may end up relieving, which would tend to deflate their ERAs.

I should point out that I don’t know exactly what the other projection systems are trying to project: what a pitcher would do if they played in MLB or MLB stats given that the player is good enough to get MLB playing time.  My goal is to project MLB stats.  YAPS effectively assumes that the players it’s projecting are good enough to play in MLB and this is exactly as I intend it.  Other systems may be aiming at a different goal and if so, maybe comparing them isn’t fair.  But you also have to ask, if they aren’t trying to project MLB stats, how can you ever assess whether they made good projections?

And here are the top relievers (regardless of MLB experience):

| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |
| Mariano Rivera | NYA | 2.14 | 2.25 | 1.15 | 63 | 73 | 72 |
| Matt Thornton | CHA | 2.47 | 2.55 | 1.15 | 58 | 58 | 63 |
| Rafael Betancourt | COL | 2.72 | 2.59 | 1.15 | 61 | 68 | 45 |
| Joakim Soria | KCA | 2.74 | 2.91 | 1.15 | 57 | 64 | 57 |
| Joel Hanrahan | PIT | 2.78 | 2.72 | 1.15 | 72 | 50 | 50 |
| Randy Choate | flo | 2.84 | 2.80 | 1.23 | 66 | 68 | 79 |
| Jonathan Papelbon | BOS | 2.89 | 3.05 | 1.15 | 67 | 55 | 53 |
| Francisco Rodriguez | NYN | 2.91 | 2.89 | 1.15 | 55 | 45 | 68 |
| Carlos Villanueva | tor | 2.93 | 3.13 | 1.15 | 51 | 51 | 40 |
| Darren Oliver | TEX | 2.94 | 2.91 | 1.15 | 67 | 61 | 50 |
| Jonathan Broxton | LAN | 2.95 | 2.86 | 1.15 | 67 | 47 | 51 |
| Hong-Chih Kuo | LAN | 2.96 | 2.93 | 1.31 | 70 | 62 | 56 |
| Brian Wilson | SFN | 2.96 | 2.87 | 1.15 | 60 | 52 | 55 |
| Mike Adams | SDN | 3.03 | 3.01 | 1.31 | 65 | 60 | 59 |
| Heath Bell | SDN | 3.05 | 2.98 | 1.15 | 67 | 50 | 49 |
| Scott Downs | ana | 3.10 | 3.20 | 1.15 | 63 | 62 | 54 |
| Billy Wagner | ATL | 3.10 | 3.08 | 1.31 | 68 | 57 | 51 |
| Andrew Bailey | OAK | 3.11 | 3.14 | 1.31 | 65 | 65 | 55 |
| Takashi Saito | mil | 3.12 | 3.08 | 1.31 | 68 | 59 | 51 |
| Clay Hensley | FLO | 3.12 | 3.11 | 1.31 | 58 | 47 | 67 |
| Nick Masset | CIN | 3.14 | 3.04 | 1.15 | 66 | 48 | 52 |
| David Aardsma | SEA | 3.16 | 3.20 | 1.31 | 63 | 46 | 65 |
| Jose Valverde | DET | 3.17 | 3.26 | 1.15 | 55 | 43 | 65 |
| Tyler Clippard | WAS | 3.17 | 3.13 | 1.31 | 78 | 43 | 35 |
| Ryan Madson | PHI | 3.19 | 3.12 | 1.15 | 47 | 62 | 53 |
| Joe Nathan | MIN | 3.19 | 3.23 | 1.31 | 68 | 67 | 48 |
| Sergio Romo | SFN | 3.20 | 3.17 | 1.31 | 67 | 65 | 44 |
| Joe Thatcher | SDN | 3.22 | 3.20 | 1.31 | 62 | 70 | 56 |
| Luke Gregerson | SDN | 3.23 | 3.20 | 1.31 | 60 | 62 | 56 |
| Koji Uehara | BAL | 3.25 | 3.50 | 1.23 | 76 | 79 | 57 |

Some interesting stuff here.  Carlos Marmol is breaking the system.  He’s literally off the scouting scale for contact, nearly off it for control, and he’s over 2 standard deviations over the mean on stuff.  The end result is that YAPS really likes him—basically his only worry is walking in runs.  YAPS is a lot higher on a lot of these guys than PECOTA and ZIPS—Marmol, Choate, Hanrahan, Hensley, and K-Rod, for example.  On the other hand, it’s not a huge fan of our top Twins, with both Nathan and Capps (ERA 4.33) scoring a good deal lower than ZIPS and PECOTA grade them.

By R squared, YAPS is actually less like ZIPS and PECOTA for 2011 than they are like each other, although they’re not really very much like each other either.  For starters, the average ERA difference ranges from 0.34 to 0.40 when comparing ZIPS, PECOTA, and YAPS.  Again, you can throw very different projections out there and end up with similar results.

I’m very interested in your feedback, especially since this is the first time I’ve done something like this.  Maybe if I feel up to it I’ll update YAPS throughout the season.

One final look: all starters and experienced relievers who scored at least a 70 on one of the three scouting categories and their counterparts on the unhappy side of the scouting scale:

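| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Brandon Morrow | TOR | 3.28 | 3.59 | 0.97 | 81 | 45 | 49 |
| Jered Weaver | ANA | 3.34 | 3.50 | 0.87 | 75 | 60 | 47 |
| Joel Hanrahan | PIT | 2.78 | 2.72 | 1.15 | 72 | 50 | 50 |
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |

| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Cliff Lee | PHI | 3.28 | 3.18 | 0.87 | 55 | 77 | 58 |
| Roy Halladay | PHI | 3.10 | 2.99 | 0.87 | 58 | 76 | 62 |
| Edward Mujica | flo | 3.50 | 3.46 | 1.15 | 49 | 74 | 31 |
| Mariano Rivera | NYA | 2.14 | 2.25 | 1.15 | 63 | 73 | 72 |
| Douglas Fister | SEA | 4.24 | 4.46 | 0.97 | 41 | 71 | 52 |
| Kevin Slowey | MIN | 4.26 | 4.44 | 0.87 | 50 | 71 | 34 |
| Carl Pavano | MIN | 4.25 | 4.43 | 0.87 | 42 | 70 | 44 |

| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |
| Tim Stauffer | SDN | 3.63 | 3.52 | 0.97 | 53 | 52 | 73 |
| Clayton Kershaw | LAN | 3.11 | 2.98 | 0.87 | 67 | 32 | 73 |
| Mariano Rivera | NYA | 2.14 | 2.25 | 1.15 | 63 | 73 | 72 |
| Ubaldo Jimenez | COL | 3.52 | 3.30 | 0.87 | 57 | 32 | 72 |
| Jake Westbrook | SLN | 4.33 | 4.13 | 0.97 | 40 | 48 | 70 |
| Jaime Garcia | SLN | 3.85 | 3.66 | 0.97 | 52 | 40 | 70 |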

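| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Jeff Suppan | sfn | 5.74 | 5.59 | 0.87 | 27 | 36 | 34 |
| Kyle Kendrick | PHI | 4.99 | 4.89 | 0.87 | 28 | 53 | 36 |

| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Dontrelle Willis | cin | 4.74 | 4.59 | 0.97 | 44 | 11 | 69 |
| Oliver Perez | NYN | 5.16 | 5.13 | 0.97 | 53 | 16 | 37 |
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |
| Edinson Volquez | CIN | 3.99 | 3.84 | 0.87 | 66 | 21 | 60 |
| Rich Harden | oak | 4.59 | 4.74 | 0.87 | 68 | 24 | 39 |
| Jonathan Sanchez | SFN | 4.36 | 4.21 | 0.87 | 63 | 24 | 48 |
| Doug Davis | MIL | 5.15 | 4.93 | 0.87 | 41 | 27 | 46 |
| Carlos Zambrano | CHN | 4.33 | 4.13 | 0.87 | 48 | 27 | 64 |
| Manny Parra | MIL | 4.98 | 4.75 | 0.87 | 54 | 28 | 40 |
| Sean Gallagher | pit | 4.60 | 4.52 | 0.97 | 50 | 28 | 50 |
| Todd Wellemeyer | chn | 5.50 | 5.29 | 0.87 | 40 | 29 | 36 |
| Jorge De La Rosa | COL | 4.55 | 4.33 | 0.87 | 56 | 30 | 51 |

| NAME | Team | Neutral ERA | ERA | Error | stuff | control | contact |
|---|---|---|---|---|---|---|---|
| Aaron Harang | sdn | 5.11 | 5.00 | 0.87 | 48 | 51 | 28 |
| Chad Qualls | sdn | 4.21 | 4.14 | 1.15 | 53 | 61 | 28 |
| Dave Bush | tex | 5.43 | 5.38 | 0.87 | 38 | 45 | 28 |
| David Hernandez | ari | 4.47 | 4.29 | 0.97 | 60 | 46 | 30 |
| Ted Lilly | LAN | 4.71 | 4.58 | 0.87 | 55 | 54 | 30 |
| Matt Capps | MIN | 4.23 | 4.33 | 1.15 | 46 | 65 | 30 |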
