Beyond the Box Score: An SB Nation Community


Marcel Projects The 2009 Leader Boards

Colin Wyers has posted a set of Marcel projections, and well, I just had to slice and dice them.  Marcel is Tom Tango's extremely basic forecasting system named after the monkey from Friends.  It sets the bar for what other, more professional forecasting systems should strive to surpass.  The system consists of:

  • A weighted average of the past three seasons' stats.
  • An age adjustment.
  • Some regression.

That's it.  No park-adjustments.  No similarity scores.  No DIPS analysis.  No lineup analysis.  No nothing.  Even so, Marcel doesn't finish far behind more advanced systems like CHONE and PECOTA in year-end analyses.

Now on to the fun.  Here's how Colin's Marcels foresee the 2009 MLB leader boards in a number of popular hitting and pitching categories:

Player HR
Howard 37
Dunn 30
Arod 29
Pujols 29
Fielder 29
Braun 29
Pena 28
Dye 27
Thome 27
Teixeira 26
Delgado 26
Manny 26

 

Player SB
Reyes 47
Taveras 35
HanRam 32
Ellsbury 32
Bourn 31
Figgins 31
Pierre 31
Roberts 29
Crawford 29
Suzuki 28

 

Player wOBA
Pujols .438
Jones .412
Ortiz .407
Arod .407
Holliday .403
Manny .401
Wright .400
Cabrera .400
Berkman .399
Howard .396


Player PA
Reyes 581
Suzuki 574
Sizemore 572
Wright 567
Cabrera 565
Pedroia 563
Morneau 556
Young 554
Beltran 553
Utley 553
Iwamura 553
Ibanez 553

 

Player AVG
Pujols .342
Jones .322
Holliday .320
Mauer .320
Cabrera .319
Guerrero .316
Suzuki .315
Wright .312
Ordonez .311
HanRam .311
Pedroia .311

 

Player OBP
Bonds .470
Pujols .467
Jones .428
Helton .426
Manny .419
Mauer .419
Njohnson .418
Berkman .417
Ortiz .413
Cabrera .410

 

Player SLG
Pujols .625
Howard .582
Ortiz .572
Arod .564
Braun .562
Cabrera .555
Holliday .552
Jones .551
Manny .548
Fielder .544

 

Player ISO
Howard .301
Pujols .283
Ortiz .279
Dunn .268
Arod .265
Braun .261
Pena .258
Fielder .255
Soriano .249

 

Player RAA
Pujols 33
Cabrera 18
Howard 18
Holliday 18
Utley 17
Arod 17
HanRam 17
Jones 16
Wright 16
Fielder 15
Manny 15
Braun 15

 

Pitcher IP
Sabathia 187
Halladay 182
Johan 177
Hamels 174
Webb 173
Lincecum 172
Lee 172
Burnett 170
Ervin 170

 

Pitcher ERA
Lincecum 3.61
Harden 3.70
Halladay 3.78
Sabathia 3.79
Peavy 3.80
Webb 3.81
Johan 3.82
Hamels 3.91
Cain 3.94
Haren 3.97

 

Pitcher SO
Lincecum 165
Sabathia 160
Johan 154
Burnett 150
Hamels 148
Ervin 141
Cain 139
Haren 139
Vazquez 139
Billingsley 137


Comments


Interesting

Does Marcel factor in the probability of a player strike/lockout? Most of these numbers seem low by 20% or so.

by Eric Simon on Nov 8, 2008 12:36 PM EST

In other words

these are kind of useless.

by Daniel Berlyn on Nov 8, 2008 12:52 PM EST

In other words, they are heavily regressed.

Marcel isn’t all that smart about playing time. It doesn’t know when players switch teams, get injured, get benched, or whatever. All it knows is how much a player has played in the past. And given only that input, it maximizes its accuracy by heavily regressing playing time.

When I have time later I’ll find last year’s Marcels and post the leaders for PAs, HRs, etc. I think we’d all be surprised how much the sum of predicted totals of the top ten leaders in PAs matches their actual total of PAs (probably 2/3 of them would have 100 PAs higher with 1/3 at 200 PAs lower or something like that).

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 8, 2008 3:39 PM EST

Interesting. From Tango's 2008 Marcels, here are the projected PA leaders for this past season:

665 Rollins
653 Reyes
649 Sizemore
643 Suzuki
640 Pierre
632 Uggla
629 Zimmerman
628 Jeter
623 Holliday
623 Gonzalez

And the 2008 projected IP leaders:

202 Webb
200 Sabathia
199 Harang
195 Halladay
194 Lackey
194 Haren
194 Hudson
194 Blanton
193 Santana
192 Peavy

Those are MUCH higher than Colin’s Marcels’ PA and IP leaders for 2009, so I’m suspecting there’s something weird with what Colin did. I’ll bring it to his attention.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 9, 2008 11:24 AM EST

There's a bug in the PA/IP code.

Or rather there was – it was fixed in a later version and I simply didn’t publish those yet. I’m still revising the Marcels code that I have – all rate stats should be unaffected. I’ve been putting off publishing revisions until the Baseball Databank gets released, but since these are getting as much play as they are I’ll see about publishing the revised version later this afternoon.

by cwyers on Nov 9, 2008 2:03 PM EST

Thanks Colin, I'd love to use revised IP numbers for an article tomorrow.

Do you mind letting us know when they’re updated?

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 9, 2008 3:40 PM EST

Bonds atop the OBP leaderboard? someone sign that guy, quick!

'That's something we do...thirteen hits and not score'-Terrence Long

by DyeLongJustice on Nov 8, 2008 2:16 PM EST

What Terrence Long Said

Though how he does that without appearing on any other list is left as an exercise.

by klhoughton on Nov 8, 2008 11:33 PM EST

Since Bonds missed all of 2008...

…he is further regressed to the mean than the other Big Damn Sluggers on the list. I don’t know if that’s necessarily correct in Bonds’ case, but it’s such an unusual situation that I don’t think it matters for the vast majority of players.

by cwyers on Nov 9, 2008 2:01 PM EST

You just mean further regressed because he doesn't have 2008 data to "pull" his projection up from league average, right?

His great seasons are weighted 4 and 3 without any PAs weighted at 5, right?

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 9, 2008 3:41 PM EST

Right.

Here’s how Marcels regression works:

x/(x+214)

Where x is number of PAs in sample (in this case weighted PAs). Instead of 5/4/3 weights I use equivalent fractions, like 1/.8/.6. Keeps things a lot tidier for me. (It all works out the same in the wash.)

So for a player with 300 PAs in sample, you get:

300/(300+214) = .58

Which means that he gets regressed to the mean 42% (1-.58). For a player with 1600 in-sample PAs (a full-time starter, in other words), you get:

1600/(1600+214) = .88

So only 12% regression to the mean (1-.88).

That is, in fact, reflected in the “R” column in the spreadsheet – that lets you know exactly how far each player’s stats were regressed. Bonds missed a full season, and the one that’s weighted the most heavily, so he gets regressed more than someone like, say, Pujols.
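For anyone who wants to try the regression step described above, here is a minimal Python sketch. The function names are made up for illustration, and blending the player's weighted rate with the league rate in `regress` is an assumption about how the reliability figure gets applied, not Colin's actual code. The season PA totals are hypothetical.

```python
# Season weights, most recent first: the 1/.8/.6 equivalents of 5/4/3.
WEIGHTS = (1.0, 0.8, 0.6)
REGRESSION_PA = 214  # constant from the x/(x+214) formula


def weighted_pa(pa_by_season):
    """Weighted PA total; pa_by_season is ordered most recent season first."""
    return sum(w * pa for w, pa in zip(WEIGHTS, pa_by_season))


def reliability(x):
    """Share of the projection that comes from the player's own record."""
    return x / (x + REGRESSION_PA)


def regress(player_rate, league_rate, x):
    """Blend player and league rates by reliability (assumed application)."""
    r = reliability(x)
    return r * player_rate + (1 - r) * league_rate


print(round(reliability(300), 2))   # 0.58
print(round(reliability(1600), 2))  # 0.88

# Hypothetical players: one with three full seasons, one who missed
# the most recent (most heavily weighted) season.
full = weighted_pa((600, 600, 600))
missed = weighted_pa((0, 600, 600))
print(reliability(missed) < reliability(full))  # True
```

The last line shows the Bonds effect in miniature: losing the season with the biggest weight shrinks the weighted sample, so more of the projection comes from the league average.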

Marcels more drastically regresses playing time, and it only uses two years of data, like so:

.5 * PA1 + .1* PA2 + 200

Where PA1 is PAs in 2008 and PA2 is PAs in 2007. That really lowers the playing time forecast for someone who didn’t play at all in 2008.
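The playing-time formula is simple enough to sketch in a few lines of Python. The function name and the sample PA totals below are hypothetical, chosen only to show how the formula behaves.

```python
def project_pa(pa_last, pa_prior):
    """Projected PA from the two most recent seasons, per the formula above.

    pa_last is the most recent season (PA1), pa_prior the one before (PA2).
    """
    return 0.5 * pa_last + 0.1 * pa_prior + 200


# Hypothetical everyday player with 700 PA in each of the last two seasons.
print(round(project_pa(700, 700)))  # 620

# Hypothetical player who sat out the most recent season entirely.
print(round(project_pa(0, 500)))    # 250
```

Even a player with back-to-back 700 PA seasons only projects to about 620, which is the heavy regression on playing time at work.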

[As a final aside – any forecasting system is likely to do better with projecting rates of performance than projecting playing time. The optimal approach involves combining the rate stats from a projection system with a better projection of playing time based upon depth charts and such.]

by cwyers on Nov 9, 2008 9:26 PM EST

That's phenomenal, thanks Colin.

Next thing you know, everyone will be spitting out projections willy-nilly. We can only hope…

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 9, 2008 9:58 PM EST

Is it easy enough to download the spreadsheets from

“baseball databank?”

BTW, do either of you (or anyone) know of or have a generic spreadsheet “plug in” where you can simply input the last three seasons of data and the player’s age and get a quasi-Marcels projection? The Marcels “in-season projector” from THT doesn’t quite do the trick at this point, and the “CAIRO” one I found includes all sorts of other data, unless I’m doing it wrong.

Or, I could just go with my generic 5-4-3 or 5-4-3-2 (with some league average regression optional) pseudo-projections using linear weights type stats (since I’m going for WAR anyway, and they wrap up rate and playing time data all in one — which works OK for veterans who have been playing full-time, not so well for part-timers or younger guys).

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by devil_fingers on Nov 9, 2008 10:24 PM EST

I'm not familiar with BDB, but I assume it's straightforward.

The in-season projector will get the rate stats correct, but the playing time needs to be done separately. As Colin showed, it’s pretty straightforward.

A basic plug and chug projector would be nifty, but why not just use Colin’s Marcels which are done for you already?

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 9, 2008 10:39 PM EST

Yeah, i got'em again

I would just like to be able to do it myself, for fun.

Nice job, though, Colin. I take it that you generate the RAA stuff by getting the average from the projections and going from there?

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by devil_fingers on Nov 9, 2008 10:59 PM EST

Sorry, another braa question

It’s just

(wOBA – lgwOBA)*PA

Or is it

[(wOBA – lgwOBA)/1.15]*PA

and then to add in SB/CS to that figure, you do bRAA + SB*0.17 -CS*.033? Or have you already added SB/CS into the wOBA figure?

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by devil_fingers on Nov 9, 2008 11:04 PM EST

sorry

I take it that “RAA” and “SB_RAA” can be added together to get the total projected “linear weight” run production of the player?

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by devil_fingers on Nov 9, 2008 11:09 PM EST

If you’re using wOBA, [(wOBA – lgwOBA)/1.15]*PA is the correct way to figure RAA. I figured RAA (and SB_RAA) separately, using my reference set of LWTS from 1993-2007.
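As a rough illustration of the wOBA version of the formula, here is a small Python sketch. This is only the wOBA shortcut, not the linear-weights calculation behind the published RAA column. The function name is made up, and the league wOBA of .330 and the 600 PA are placeholder values, not figures from the projections.

```python
WOBA_SCALE = 1.15  # divisor from the formula above


def batting_raa(woba, league_woba, pa):
    """Batting runs above average from wOBA, league wOBA, and PA."""
    return (woba - league_woba) / WOBA_SCALE * pa


# Hypothetical hitter: .400 wOBA over 600 PA, assumed league wOBA of .330.
print(round(batting_raa(0.400, 0.330, 600), 1))  # 36.5
```

A league-average hitter comes out at zero regardless of playing time, and more PA at the same wOBA scales the run total proportionally.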

by cwyers on Nov 10, 2008 10:58 AM EST

Sal's Marcels spreadsheet does it pretty much right...

…except for playing time. It’s an easy fix. Open up the hitter’s spreadsheet, go to Cell C13, and change the formula in there to:

=0.5*C12+0.1*C11+200

There’s no similar quick fix for the pitchers’ spreadsheet, unfortunately.

I learned much of what I needed to know for my Marcels from that spreadsheet. A person wanting to know more about projections systems could do worse than to poke around in that and try to figure out how it works.

The Baseball Databank itself is available either in CSV files or as a MySQL database. Excel can import CSV files.

My eventual plan is to (re)publish my Marcels code, along with documentation and a tutorial. That way, everyone has the full source to a basic projection system.

by cwyers on Nov 10, 2008 10:57 AM EST

awesome

that’s a great, great idea… thanks so much for all of your work on this. It really is a nice service to everyone, especially lazy dumbasses like me.

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by devil_fingers on Nov 10, 2008 12:14 PM EST

English Translation: Assumes the Giants are sane

Though it seems even more unlikely that someone will overwork Sabathia on signing him to a long-term contract.

(Mets management, otoh, will abuse Johan. They’ll have to if Reyes is putting up those hitting numbers and fielding the way he appears to have this year.)

by klhoughton on Nov 8, 2008 11:35 PM EST

I'm motivated enough to write an article on projecting/regression, even though I'm not an expert.

You cannot read that IP leader board as saying “no pitcher will throw 200IP in 2009.” You should read it as “No individual pitcher is expected to pitch 200IP in 2009.” SOMEONE (probably many pitchers) will do it, we just don’t know exactly who it is.

Here’s another example. Would you bet on the Miami Dolphins at 50/50 odds to win the Super Bowl? No. How about the Giants? Better idea, but no. The Patriots? Redskins? No, no, no. In fact, there’s no team with a better than 50% chance of winning the Super Bowl. Does that mean you think NOBODY is going to win the Super Bowl? Uh, obviously not.

(To be technical, the Marcel projections are means, not medians, but the point gets across.)
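Here is a quick Python sketch of the same idea in numbers. Everything in it is an assumption for illustration: each pitcher's innings are treated as an independent normal distribution, and the mean of 180 IP, the spread of 30 IP, and the group of ten pitchers are invented figures, not anything Marcel produces.

```python
import math


def prob_exceeds(threshold, mean, sd):
    """P(X > threshold) for a normal distribution with the given mean and sd."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


def prob_any_exceeds(threshold, mean, sd, n):
    """P(at least one of n independent, identical pitchers tops threshold)."""
    p = prob_exceeds(threshold, mean, sd)
    return 1 - (1 - p) ** n


# One hypothetical pitcher projected for 180 IP: an unlikely bet to reach 200.
print(prob_exceeds(200, 180, 30) < 0.5)          # True

# Ten such pitchers: very likely that at least one of them gets there.
print(prob_any_exceeds(200, 180, 30, 10) > 0.9)  # True
```

No single pitcher is a good bet to reach 200 IP, yet across the group it is close to a sure thing that somebody does.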

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 9, 2008 9:24 AM EST

Also, again, the Marcels don't know ANYTHING about playing time other than historical playing time.

They don’t know about injuries, organizational philosophies, pitch counts, rotation depth, bullpen depth, etc. Therefore, to maximize accuracy across the board, heavy regression is applied. Other systems, especially ones that assign playing time “manually,” do much better, and probably WOULD project Sabathia at 200 IP. But you’d also be surprised how heavily regressed (i.e. conservative) those projections are. Just not quite as much.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Nov 9, 2008 9:26 AM EST
