FanPost

I'm just projecting

(I originally posted this at Twinkietown, but I thought the subject matter would be interesting to all y’all here.  Btw, I should give some props to my good friend Erik, who helped me whenever the statistics were getting over my head.  He also convinced me it was a good idea to be a big nerd about baseball stats, although I’m not sure I should thank him for that.)

So I did some pitcher projections (spreadsheet link here).  A little project that turned into a little obsession.

I projected every pitcher with at least 50 IP in the last 3 years.  Maybe I’ll get around to doing the rest, but so far I haven’t had the initiative to dig up minor league stats, and frankly, projecting those guys is a crap shoot anyway.

Actually, all pitcher projections are sort of a crap shoot.  They’re  probably worse than you think, as I try to show below.  If you’re not interested in that, skip ahead to Part II where I get into the nuts and bolts of my projection system: YAPS!

PART I—REFLECTIONS ON PROJECTIONS (WITH DIGRESSIONS ON REGRESSIONS)

I started on this little adventure wondering how hard it was to approximate the results of the major systems’ projections using a pretty low-IQ approach.  With hitting it’s shockingly easy.  For hitters, and especially for hitters with a modicum of MLB experience, the projection systems are fun and all, but they basically provide no added value over a really simple, intuitive projection system—the sort of thing you’d do in your head in about 10 seconds looking at the back of a baseball card.  Tom Tango’s MARCEL is designed to be exactly this sort of "dumb" system, and after reading this post by Tango, it’s hard to avoid the conclusion that none of the "smart" systems really provide meaningful separation from MARCEL when projecting hitters.  Here are Tango’s results for the average error the systems produced on wOBA, a measure of overall offensive performance:

0.0272 Chone 
0.0273 Oliver 
0.0277 Zips 
0.0278 Marcel 
0.0280 Pecota

You’ll notice that there isn’t that much action in those numbers at all until you get to the fourth decimal place.  Nobody in normal usage, as far as I know, rounds wOBA or any other baseball stat to the fourth decimal place.  Comparables, BABIP analysis, park factors, and all that other fancy jazz they throw into the stew—it gives you varied and maybe "funner" projections, but when it comes to actually predicting the future, the value added is microscopic to nonexistent.

Is it any different for pitchers?  Let’s examine a group that’s relatively easy to project:  established MLB starters, defined as having at least 300 IP as a starter over the previous 3 years.  For my convenience, we’re just going to look at predicting 2009 and 2010 results.  I’m also going to limit the data to pitchers who ended up pitching at least 65 innings in the year we’re trying to predict, since we’re not really interested in explaining the oddities of pitcher ERAs in small sample sizes.  (This also adds selection bias and I’m fine with that: the goal here is to predict the ERAs of pitchers that actually end up pitching in MLB, not how well pitchers would have pitched if they had been in MLB.)  It gives us a group of 164 pitcher seasons.

Here is a comparison of various methods of predicting pitcher ERA.  They are measured by average error—basically how far off the mark the ERA projections are on average.  You might be surprised at how wrong they tend to be.

Previous year’s ERA: 0.91

3-year weighted ERA with 4,3,2 weighting: 0.90

PECOTA: 0.89

MARCEL: 0.89

CHONE: 0.88

ZIPS: 0.87

 

(Some systems have several versions that all come out before the start of the year and in those cases I tried to get the latest version I could find.  Some error on my part in assessing the systems is also entirely possible.)

In terms of "R squared"—basically how much variance in player ERA the system explained with its projections—the systems range from about 0.10 for previous-year ERA to 0.17 for ZIPS.  So a pitcher’s previous year’s ERA explains about 10% of the variance in their ERA for the following year while MARCEL explains about 15%, and ZIPS 17%.
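For anyone who wants to run this kind of comparison themselves, here is a rough sketch of the two yardsticks.  It assumes "average error" means mean absolute error and "R squared" means the squared correlation between projected and actual ERA; both readings are my assumptions, the function names are made up for illustration, and the sample numbers are invented rather than real projections.

```python
def mean_absolute_error(projected, actual):
    """Average size of the miss, ignoring direction (assumed meaning of 'average error')."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)


def r_squared(projected, actual):
    """Squared Pearson correlation between projections and results (assumed meaning of 'R squared')."""
    n = len(actual)
    mean_p = sum(projected) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


# Made-up ERAs for illustration only, not real projections or results.
projected = [3.50, 4.00, 4.50, 5.00]
actual = [3.10, 4.90, 4.20, 4.60]
print(round(mean_absolute_error(projected, actual), 2))
print(round(r_squared(projected, actual), 2))
```

Run each system's projections through both functions against the same set of actual ERAs and you get a list like the one above.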

This is not a very high degree of precision, folks.  And no one is really blowing MARCEL and the basic ERA predictions out of the water.  If you guessed that every one of these pitchers would simply match their previous year’s ERA you’d probably be labeled a complete baseball-stat knuckle dragger, but on average, you’d only be 0.04 ERA wrong-er than the best of these projection systems, ZIPS.  You’d probably have a worse projection than ZIPS for 55% of pitchers or so, but hell, for the rest you’d actually be closer!

It’s something to think about whenever you get your lunch money taken by a stat-head bully.  As far as I can see, the sabermetric revolution just hasn’t revolutionized our ability to predict the baseball future.  MARCEL is pretty close to the leaders and MARCEL is supposed to be stupid.  It’s basically the 4,3,2 weighted-ERA method but with some added regression to the mean and an age adjustment.  It’s not designed to be a good system.  But the systems that are designed to be good are only ever-so-slightly better...and occasionally worse!
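To make the "stupid" approach concrete, here is a minimal sketch of a 4,3,2 weighted ERA with a dash of regression to the mean.  This is not Tango's actual MARCEL formula: the plain per-season weighting, the `regression` amount, and the function names are all assumptions for illustration, and the age adjustment is left out entirely.

```python
WEIGHTS = (4, 3, 2)  # most recent season first, as in the 4,3,2 method


def weighted_era(eras):
    """Weighted average of the last three season ERAs, newest first."""
    if len(eras) != len(WEIGHTS):
        raise ValueError("expected exactly three seasons of ERA")
    return sum(w * e for w, e in zip(WEIGHTS, eras)) / sum(WEIGHTS)


def regress_to_mean(era, league_era, regression=0.3):
    """Pull an ERA part of the way toward league average.

    `regression` is a hypothetical fraction between 0 (no pull)
    and 1 (everyone gets the league-average ERA).
    """
    return (1 - regression) * era + regression * league_era


# Invented pitcher: 3.20 ERA last year, 4.10 the year before, 3.90 before that.
base = weighted_era([3.20, 4.10, 3.90])
print(round(regress_to_mean(base, league_era=4.20), 2))
```

A real system would also weight by innings pitched and adjust for age, but even this much captures the back-of-the-baseball-card spirit.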

Another thing that strikes me:  You can throw out all sorts of very different projections and still come up with similar results on average.  Just look at the variety in individual projections for ZIPS, CHONE, MARCEL, PECOTA, or previous-year’s ERA for that matter.  CHONE hated Jon Garland in 2010.  ZIPS liked him.  ZIPS wasn’t really sold on Jon Lester in 2010.  CHONE was.  CHONE thought Kevin Millwood was garbage in 2009 and ZIPS thought he was better than average.  And on and on.  They miss or score on individuals, but on average, they all end up about equally right (and wrong).  Heck, you might as well shop around for the system that paints your favorite players in a positive light: it’s probably almost exactly as good as the systems that don’t give them proper respect.

My system does not alter these overall insights, but I humbly suspect it might be a little better than the rest.  I was able to get to 0.85 error and an R squared of 0.21 for the same starters over the same time span.  I tried to make it a "fair fight."  I developed my model using only pre-2009 data so as to be on equal footing with the other systems and then compared my 2009 and 2010 "predictions" to the others.  I didn’t incorporate anyone else’s projections or projection averages into my model.  It’s all from basic stats available on Fangraphs and Baseball Prospectus.  Unless I made an error somewhere—which I should emphasize is a not-entirely unlikely possibility given all the data manipulation I did—I think I beat them more or less fair and square.

My results for the established-starter group were exciting enough (to me at least) that I went ahead and did all pitchers with at least 50 IP over the last three years.  I was generally able to get similar separation from ZIPS in other cohorts.

But this sort of talk is cheap.  Anyone can claim they shuffled numbers around until they beat projections of years past.  Part of posting this is that I wanted my numbers to be out there so I could take on the other systems on a truly even playing field.  I might get my hat handed to me by the pros and semi-pros out there, but frankly, I don’t expect to.  If I’m successful this year, maybe I’ll expand this to a full projection next offseason.

PART II—THE "YAPS" PROJECTIONS

Every system needs a catchy acronym, so mine’s "YAPS" for Yet Another Projection System.  And even more than an acronym, every system needs a catchy gimmick or two.  (Coke and Pepsi will tell you that when you have a product that doesn’t really differentiate itself much on quality, you better pay attention to marketing.)  I’ve got three.

The scouting report.  Maybe YAPS’s coolest gimmick is a "scouting report" for each pitcher it projects.  Here’s a sample for the Twins’ starters:

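| Name | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|
| Francisco Liriano | 3.32 | 3.49 | 0.87 | 65 | 51 | 61 |
| Scott Baker | 3.81 | 3.98 | 0.87 | 59 | 63 | 45 |
| Brian Duensing | 4.17 | 4.33 | 0.97 | 47 | 64 | 59 |
| Carl Pavano | 4.25 | 4.43 | 0.87 | 42 | 70 | 44 |
| Kevin Slowey | 4.26 | 4.44 | 0.87 | 50 | 71 | 34 |
| Nick Blackburn | 4.63 | 4.80 | 0.87 | 32 | 66 | 40 |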

 

The scouting report is the last three columns.  "Stuff" is basically the ability to miss bats and get strikeouts.  "Control" is avoiding walks.  "Contact" then is a multifaceted measure of getting good results when the batter actually makes contact.

Based on the sum of the factors that go into each scouting category and their impact on YAPS’s ERA projection, I do a calculation that produces a grade on the 20-80 scouting scale.  Fifty is average and each 10 points up or down represents one standard deviation from the mean.  (Unlike the true scouting scale, YAPS occasionally will grade someone as more than three standard deviations away from the mean; Carlos Marmol and Dontrelle Willis, I’m looking at you.)

For each pitcher, the relevant average and standard deviation are for pitchers with similar MLB experience (e.g., all pitchers with 300+ IP in the last three years are compared to each other, but not to, say, relief pitchers).  There is also an adjustment that makes the scouting report team- and league-neutral, so you can compare pitchers on a basically even playing field.
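The 20-80 conversion itself is simple enough to sketch.  The version below assumes you already have one composite number per pitcher for a category (the `score` argument, a hypothetical stand-in for the sum of factors described above) plus the same numbers for that pitcher's comparison group; the function name and the `higher_is_better` flag are made up for illustration.

```python
from statistics import mean, pstdev


def scouting_grade(score, cohort_scores, higher_is_better=True):
    """Map a composite score onto the 20-80 scale.

    50 is the cohort average and every 10 points is one standard
    deviation. Grades are deliberately not clamped to 20-80, so an
    extreme pitcher can land outside the usual range.
    """
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort has no spread, so grades are undefined")
    z = (score - mu) / sigma
    if not higher_is_better:
        z = -z  # e.g. a walk rate, where a lower number is the good one
    return round(50 + 10 * z)


# Invented cohort of composite scores, for illustration only.
cohort = [5.5, 6.0, 7.0, 7.5, 9.0]
print(scouting_grade(9.0, cohort))
```

Swapping in a different `cohort_scores` list is all it takes to grade starters against starters and relievers against relievers.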

There are elements that go into YAPS that aren’t represented in the scouting report (age adjustment, for instance), but the scouting report gives you a good look at the main things that are driving it.  The three components are not given equal weight, but none of the three overwhelms the others in impact.  Generally, stuff is the most important followed by contact and then control; but again, all three are important for all groups of pitchers.

Error.  Another feature is an "error" column that gives an estimate of the sort of standard error you should expect.  On average, YAPS will be about as wrong as that error number.  Error basically relates to MLB experience—more experience, a better projection.  If you look at the error numbers, it might seem like YAPS is fairly unsure of itself, but it’s basically just being transparent.  I’m pretty sure no other system is doing much better.

Neutral ERA.  Finally, YAPS gives a "Neutral ERA."  Neutral ERA is what the ERA projection would be if everyone played for the same team.  It’s a good way to compare pitchers who play for different teams on a talent basis.  Take a look at:

| Name | Team | ERA | Neutral ERA |
|---|---|---|---|
| Brian Duensing | Twins | 4.33 | 4.17 |
| Brian Matusz | Orioles | 4.32 | 3.99 |
| R.A. Dickey | Mets | 4.24 | 4.27 |
| Vicente Padilla | Dodgers | 4.26 | 4.39 |

 

The Neutral ERA numbers tell you that YAPS thinks Duensing and Matusz are better pitchers than Dickey and Padilla, but it doesn’t think they’ll put up better ERAs given the teams they play for.  Kind of cool, eh?

Another thing to keep in mind when comparing pitchers on a talent basis: YAPS doesn’t know anything about 2011 depth charts.  It will more or less assume that starters will keep starting and relievers will keep relieving and pitchers with a mixed record of starting and relieving will keep doing the same.  For example, YAPS thinks Phil Coke will put up a 3.89 ERA, but it has no idea the Tigers are thinking of making him a starter.

My projections.  Here, first, are YAPS’s Top 26 (not 25, since that wouldn’t get Scott Baker in there) "established" starting pitchers by neutral ERA:

| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Felix Hernandez | SEA | 2.97 | 3.19 | 0.87 | 60 | 56 | 68 |
| Roy Halladay | PHI | 3.10 | 2.99 | 0.87 | 58 | 76 | 62 |
| Josh Johnson | FLO | 3.10 | 3.04 | 0.87 | 65 | 53 | 67 |
| Clayton Kershaw | LAN | 3.11 | 2.98 | 0.87 | 67 | 32 | 73 |
| Justin Verlander | DET | 3.12 | 3.27 | 0.87 | 67 | 54 | 64 |
| Cliff Lee | PHI | 3.28 | 3.18 | 0.87 | 55 | 77 | 58 |
| Jon Lester | BOS | 3.30 | 3.56 | 0.87 | 66 | 48 | 63 |
| Francisco Liriano | MIN | 3.32 | 3.49 | 0.87 | 65 | 51 | 61 |
| Jered Weaver | ANA | 3.34 | 3.50 | 0.87 | 75 | 60 | 47 |
| Zack Greinke | MIL | 3.36 | 3.13 | 0.87 | 56 | 62 | 60 |
| Ricky Romero | TOR | 3.37 | 3.70 | 0.87 | 53 | 45 | 68 |
| David Price | TBA | 3.38 | 3.64 | 0.87 | 66 | 48 | 57 |
| Clay Buchholz | BOS | 3.40 | 3.66 | 0.87 | 59 | 46 | 63 |
| Adam Wainwright | SLN | 3.49 | 3.28 | 0.87 | 53 | 53 | 64 |
| John Danks | CHA | 3.49 | 3.62 | 0.87 | 58 | 51 | 57 |
| Ubaldo Jimenez | COL | 3.52 | 3.30 | 0.87 | 57 | 32 | 72 |
| Tim Lincecum | SFN | 3.53 | 3.38 | 0.87 | 64 | 42 | 61 |
| Hiroki Kuroda | LAN | 3.58 | 3.45 | 0.87 | 49 | 58 | 66 |
| CC Sabathia | NYA | 3.63 | 3.82 | 0.87 | 58 | 52 | 61 |
| Cole Hamels | PHI | 3.70 | 3.60 | 0.87 | 66 | 53 | 47 |
| Max Scherzer | DET | 3.73 | 3.88 | 0.87 | 65 | 44 | 50 |
| Chad Billingsley | LAN | 3.74 | 3.60 | 0.87 | 51 | 39 | 64 |
| Johan Santana | NYN | 3.74 | 3.71 | 0.87 | 57 | 55 | 56 |
| Dallas Braden | OAK | 3.75 | 3.89 | 0.87 | 48 | 62 | 57 |
| Tommy Hanson | ATL | 3.80 | 3.70 | 0.87 | 52 | 49 | 54 |
| Scott Baker | MIN | 3.81 | 3.98 | 0.87 | 59 | 63 | 45 |

 

Maybe not a lot of real shockers there.  YAPS likes Liriano and Romero a good deal more than ZIPS and PECOTA and similarly dislikes Sabathia and Lincecum.  This group of pitchers is the easiest to project that I looked at, but still, the error is considerable.  YAPS thinks there’s about a 50% chance that Cliff Lee’s ERA will be between 2.31 and 4.05, but then, there’s an equal chance it won’t be.  Again, the idea is to be honest about the fact that projecting pitchers is a really dicey affair. 
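The Cliff Lee range above is just the projected ERA plus or minus the Error column.  Here is that arithmetic as a tiny sketch; the function name is made up, and treating the band as a roughly 50% range simply follows the description above rather than any distributional claim of mine.

```python
def era_band(projected_era, error):
    """Return (low, high): the projection minus and plus its error estimate."""
    return projected_era - error, projected_era + error


# Cliff Lee's row from the table: ERA 3.18, Error 0.87.
low, high = era_band(3.18, 0.87)
print(f"{low:.2f} to {high:.2f}")  # prints 2.31 to 4.05
```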

Now for the not-so-established starters.  These guys are both harder to project and also produce more variation between projection systems.

| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Brandon Morrow | TOR | 3.28 | 3.59 | 0.97 | 81 | 45 | 49 |
| Joba Chamberlain | NYA | 3.35 | 3.53 | 0.97 | 68 | 54 | 58 |
| Marc Rzepczynski | TOR | 3.38 | 3.69 | 0.97 | 63 | 49 | 67 |
| Jhoulys Chacin | COL | 3.53 | 3.32 | 0.97 | 66 | 32 | 69 |
| Brett Anderson | OAK | 3.57 | 3.71 | 0.97 | 49 | 65 | 66 |
| Tim Stauffer | SDN | 3.63 | 3.52 | 0.97 | 53 | 52 | 73 |
| Mat Latos | SDN | 3.63 | 3.53 | 0.97 | 67 | 50 | 52 |
| Madison Bumgarner | SFN | 3.81 | 3.66 | 0.97 | 53 | 64 | 53 |
| Chad Gaudin | nya | 3.84 | 4.02 | 0.97 | 60 | 52 | 55 |
| Jose Contreras | PHI | 3.84 | 3.74 | 0.97 | 59 | 56 | 58 |
| Jaime Garcia | SLN | 3.85 | 3.66 | 0.97 | 52 | 40 | 70 |
| Phil Hughes | NYA | 3.89 | 4.07 | 0.97 | 63 | 55 | 43 |
| Derek Holland | TEX | 3.91 | 3.85 | 0.97 | 59 | 54 | 49 |
| Brett Cecil | TOR | 3.91 | 4.22 | 0.97 | 52 | 60 | 54 |
| C.J. Wilson | TEX | 3.93 | 3.88 | 0.97 | 57 | 38 | 69 |
| Jonathon Niese | NYN | 3.96 | 3.94 | 0.97 | 54 | 50 | 56 |
| Kris Medlen | ATL | 3.98 | 3.89 | 0.97 | 55 | 53 | 51 |
| Brian Matusz | BAL | 3.99 | 4.32 | 0.97 | 59 | 57 | 43 |
| Jordan Zimmermann | WAS | 4.01 | 3.79 | 0.97 | 60 | 50 | 46 |
| Daniel Hudson | ARI | 4.06 | 3.88 | 0.97 | 61 | 48 | 43 |
| Ian Kennedy | ARI | 4.06 | 3.88 | 0.97 | 59 | 47 | 47 |

 

There are a lot of starters in this group that YAPS thinks are OK or even very good, but other systems just hate: Rzepczynski, Chacin, Dontrelle Willis (4.59 ERA by YAPS, although he’s an odd duck with a control score of just 11!), Dana Eveland (4.37), Carlos Silva (4.26), and Glen Perkins (4.70), just for example.  A lot of that is disagreement on talent level, but YAPS might also be taking advantage of selection bias more than other systems; if they’re really bad, they just won’t play.*   YAPS also tends to think that marginal starters may end up relieving, which would tend to deflate their ERAs.

I should point out that I don’t know exactly what the other projection systems are trying to project: what a pitcher would do if they played in MLB or MLB stats given that the player is good enough to get MLB playing time.  My goal is to project MLB stats.  YAPS effectively assumes that the players it’s projecting are good enough to play in MLB and this is exactly as I intend it.  Other systems may be aiming at a different goal and if so, maybe comparing them isn’t fair.  But you also have to ask, if they aren’t trying to project MLB stats, how can you ever assess whether they made good projections?

And here are the top relievers (regardless of MLB experience):

| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |
| Mariano Rivera | NYA | 2.14 | 2.25 | 1.15 | 63 | 73 | 72 |
| Matt Thornton | CHA | 2.47 | 2.55 | 1.15 | 58 | 58 | 63 |
| Rafael Betancourt | COL | 2.72 | 2.59 | 1.15 | 61 | 68 | 45 |
| Joakim Soria | KCA | 2.74 | 2.91 | 1.15 | 57 | 64 | 57 |
| Joel Hanrahan | PIT | 2.78 | 2.72 | 1.15 | 72 | 50 | 50 |
| Randy Choate | flo | 2.84 | 2.80 | 1.23 | 66 | 68 | 79 |
| Jonathan Papelbon | BOS | 2.89 | 3.05 | 1.15 | 67 | 55 | 53 |
| Francisco Rodriguez | NYN | 2.91 | 2.89 | 1.15 | 55 | 45 | 68 |
| Carlos Villanueva | tor | 2.93 | 3.13 | 1.15 | 51 | 51 | 40 |
| Darren Oliver | TEX | 2.94 | 2.91 | 1.15 | 67 | 61 | 50 |
| Jonathan Broxton | LAN | 2.95 | 2.86 | 1.15 | 67 | 47 | 51 |
| Hong-Chih Kuo | LAN | 2.96 | 2.93 | 1.31 | 70 | 62 | 56 |
| Brian Wilson | SFN | 2.96 | 2.87 | 1.15 | 60 | 52 | 55 |
| Mike Adams | SDN | 3.03 | 3.01 | 1.31 | 65 | 60 | 59 |
| Heath Bell | SDN | 3.05 | 2.98 | 1.15 | 67 | 50 | 49 |
| Scott Downs | ana | 3.10 | 3.20 | 1.15 | 63 | 62 | 54 |
| Billy Wagner | ATL | 3.10 | 3.08 | 1.31 | 68 | 57 | 51 |
| Andrew Bailey | OAK | 3.11 | 3.14 | 1.31 | 65 | 65 | 55 |
| Takashi Saito | mil | 3.12 | 3.08 | 1.31 | 68 | 59 | 51 |
| Clay Hensley | FLO | 3.12 | 3.11 | 1.31 | 58 | 47 | 67 |
| Nick Masset | CIN | 3.14 | 3.04 | 1.15 | 66 | 48 | 52 |
| David Aardsma | SEA | 3.16 | 3.20 | 1.31 | 63 | 46 | 65 |
| Jose Valverde | DET | 3.17 | 3.26 | 1.15 | 55 | 43 | 65 |
| Tyler Clippard | WAS | 3.17 | 3.13 | 1.31 | 78 | 43 | 35 |
| Ryan Madson | PHI | 3.19 | 3.12 | 1.15 | 47 | 62 | 53 |
| Joe Nathan | MIN | 3.19 | 3.23 | 1.31 | 68 | 67 | 48 |
| Sergio Romo | SFN | 3.20 | 3.17 | 1.31 | 67 | 65 | 44 |
| Joe Thatcher | SDN | 3.22 | 3.20 | 1.31 | 62 | 70 | 56 |
| Luke Gregerson | SDN | 3.23 | 3.20 | 1.31 | 60 | 62 | 56 |
| Koji Uehara | BAL | 3.25 | 3.50 | 1.23 | 76 | 79 | 57 |

 

Some interesting stuff here.  Carlos Marmol is breaking the system.  He’s literally off the scouting scale for contact, nearly off it for control, and he’s over 2 standard deviations over the mean on stuff.  The end result is that YAPS really likes him—basically his only worry is walking in runs.  YAPS is a lot higher on a lot of these guys than PECOTA and ZIPS—Marmol, Choate, Hanrahan, Hensley, and K-Rod, for example.  On the other hand, it’s not a huge fan of our top Twins, with both Nathan and Capps (ERA 4.33) scoring a good deal lower than ZIPS and PECOTA grade them.

By R squared, YAPS is actually less like ZIPS and PECOTA for 2011 than they are like each other, although they’re not really very much like each other either.  For starters, the average ERA difference ranges from 0.34 to 0.40 when comparing ZIPS, PECOTA, and YAPS.  Again, you can throw very different projections out there and end up with similar results.

I’m very interested in your feedback, especially since this is the first time I’ve done something like this.  Maybe if I feel up to it I’ll update YAPS throughout the season.

One final look: all starters and experienced relievers who scored at least a 70 on one of the three scouting categories and their counterparts on the unhappy side of the scouting scale:

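| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Brandon Morrow | TOR | 3.28 | 3.59 | 0.97 | 81 | 45 | 49 |
| Jered Weaver | ANA | 3.34 | 3.50 | 0.87 | 75 | 60 | 47 |
| Joel Hanrahan | PIT | 2.78 | 2.72 | 1.15 | 72 | 50 | 50 |
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |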

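| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Cliff Lee | PHI | 3.28 | 3.18 | 0.87 | 55 | 77 | 58 |
| Roy Halladay | PHI | 3.10 | 2.99 | 0.87 | 58 | 76 | 62 |
| Edward Mujica | flo | 3.50 | 3.46 | 1.15 | 49 | 74 | 31 |
| Mariano Rivera | NYA | 2.14 | 2.25 | 1.15 | 63 | 73 | 72 |
| Douglas Fister | SEA | 4.24 | 4.46 | 0.97 | 41 | 71 | 52 |
| Kevin Slowey | MIN | 4.26 | 4.44 | 0.87 | 50 | 71 | 34 |
| Carl Pavano | MIN | 4.25 | 4.43 | 0.87 | 42 | 70 | 44 |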

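| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |
| Tim Stauffer | SDN | 3.63 | 3.52 | 0.97 | 53 | 52 | 73 |
| Clayton Kershaw | LAN | 3.11 | 2.98 | 0.87 | 67 | 32 | 73 |
| Mariano Rivera | NYA | 2.14 | 2.25 | 1.15 | 63 | 73 | 72 |
| Ubaldo Jimenez | COL | 3.52 | 3.30 | 0.87 | 57 | 32 | 72 |
| Jake Westbrook | SLN | 4.33 | 4.13 | 0.97 | 40 | 48 | 70 |
| Jaime Garcia | SLN | 3.85 | 3.66 | 0.97 | 52 | 40 | 70 |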

 

| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Jeff Suppan | sfn | 5.74 | 5.59 | 0.87 | 27 | 36 | 34 |
| Kyle Kendrick | PHI | 4.99 | 4.89 | 0.87 | 28 | 53 | 36 |

| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Dontrelle Willis | cin | 4.74 | 4.59 | 0.97 | 44 | 11 | 69 |
| Oliver Perez | NYN | 5.16 | 5.13 | 0.97 | 53 | 16 | 37 |
| Carlos Marmol | CHN | 1.99 | 1.86 | 1.15 | 72 | 20 | 90 |
| Edinson Volquez | CIN | 3.99 | 3.84 | 0.87 | 66 | 21 | 60 |
| Rich Harden | oak | 4.59 | 4.74 | 0.87 | 68 | 24 | 39 |
| Jonathan Sanchez | SFN | 4.36 | 4.21 | 0.87 | 63 | 24 | 48 |
| Doug Davis | MIL | 5.15 | 4.93 | 0.87 | 41 | 27 | 46 |
| Carlos Zambrano | CHN | 4.33 | 4.13 | 0.87 | 48 | 27 | 64 |
| Manny Parra | MIL | 4.98 | 4.75 | 0.87 | 54 | 28 | 40 |
| Sean Gallagher | pit | 4.60 | 4.52 | 0.97 | 50 | 28 | 50 |
| Todd Wellemeyer | chn | 5.50 | 5.29 | 0.87 | 40 | 29 | 36 |
| Jorge De La Rosa | COL | 4.55 | 4.33 | 0.87 | 56 | 30 | 51 |

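| Name | Team | Neutral ERA | ERA | Error | Stuff | Control | Contact |
|---|---|---|---|---|---|---|---|
| Aaron Harang | sdn | 5.11 | 5.00 | 0.87 | 48 | 51 | 28 |
| Chad Qualls | sdn | 4.21 | 4.14 | 1.15 | 53 | 61 | 28 |
| Dave Bush | tex | 5.43 | 5.38 | 0.87 | 38 | 45 | 28 |
| David Hernandez | ari | 4.47 | 4.29 | 0.97 | 60 | 46 | 30 |
| Ted Lilly | LAN | 4.71 | 4.58 | 0.87 | 55 | 54 | 30 |
| Matt Capps | MIN | 4.23 | 4.33 | 1.15 | 46 | 65 | 30 |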
