Regress, Regress I Say! (Complete with In-Season Batter Regression Tool)

[Chart: top-15 and bottom-15 batters by difference between actual and regressed wOBA]

(Thanks to J-Doug and commenter hairball for helping me find an error in the calculator. Data and post have been updated accordingly)

So much is driven early on by small sample sizes that it can be easy to get overly excited about a batter's hot or cold start. After only about 15-20 games, all sorts of weird stat lines can pop up. Maybe a player's performance reflects a change in approach (e.g. Jose Bautista). Or maybe it just reflects random chance.

To gain some perspective on the early returns this season, I regressed every batter with >=40 plate appearances to his 3-year average batting average on balls in play (BABIP) and his 3-year average HR/FB rate. The chart above presents the top 15 and bottom 15 batters by the difference between their actual wOBA this year and the wOBA we would expect given their 3-year averages in BABIP and HR/FB.

Not surprisingly, Russell Martin sits high on the list of beneficial differences, with a wOBA .148 higher than his regressed wOBA. Pat Burrell is off to a .391 wOBA start, but if we regress him he may be more of a .290 wOBA hitter given his 3-year BABIP and HR/FB rates.


Brad Hawpe, who has been off to just a miserable start, comes out with a respectable .321 regressed wOBA, a difference of -.169 (the second-largest negative difference of all players). Hanley Ramirez has the fourth-largest negative difference, with an expected .399 wOBA (-.142 difference).

Now, just because regressing a batter to his 3-year averages shows his performance so far to be very high or low, we can't just assume it's luck. For example, Pablo Sandoval should technically be in the .319 wOBA range, but we all know that Pablo this year is drastically different from a year ago. Will he end the year with a .400 wOBA? That I can't say, but we have to interpret the data in context, and it's pretty likely he'll be a better-than-league-average hitter by the end (say, .340-.360).

I put together an In-Season Batter Regression Calculator for all to use throughout the season (you can find it here). It will likely be more useful early on until player performance stabilizes, but it will work at any point in the season.

[Screenshot: In-Season Batter Regression Calculator]

Simply type in the first and last name of the player (all batters with >=40 plate appearances for whom I have 3-year average data are included) as well as a few bits of data (basically the FanGraphs standard dashboard plus FB%). The calculator will compute their adjusted actual wOBA (recalculated with my own coefficients, for consistency's sake) as well as the wOBA we would expect based on BABIP and HR/FB over the past three seasons.

The tool simply takes what a hitter has done to this point and recalculates their performance based on how many non-HR hits they should have (based on 3YR BABIP) and how many home runs they should have (based on 3YR average HR/FB ratio). It doesn't necessarily predict where a player will finish the year, but it does give some sense of how much a player's performance is out of whack relative to those averages.
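For readers who want to see the mechanics, here is a minimal Python sketch of that kind of recalculation. It is not the calculator itself. The linear weights are placeholder values (the post does not list the coefficients actually used), wOBA is simplified to weighted events divided by plate appearances, balls in play are assumed to stay at the batter's actual total, and the field names and the sample stat line are hypothetical.

```python
from dataclasses import dataclass

# Placeholder linear weights for illustration only; NOT the author's
# coefficients, which the post does not list.
WEIGHTS = {"bb": 0.69, "hbp": 0.72, "1b": 0.89, "2b": 1.27, "3b": 1.61, "hr": 2.09}


@dataclass
class BatterLine:
    """Season-to-date counting stats (hypothetical field names)."""
    pa: int
    ab: int
    bb: int
    hbp: int
    so: int
    sf: int
    singles: int
    doubles: int
    triples: int
    hr: int
    fly_balls: int


def woba(bb, hbp, singles, doubles, triples, hr, pa):
    """Simplified wOBA: weighted events divided by plate appearances."""
    total = (WEIGHTS["bb"] * bb + WEIGHTS["hbp"] * hbp
             + WEIGHTS["1b"] * singles + WEIGHTS["2b"] * doubles
             + WEIGHTS["3b"] * triples + WEIGHTS["hr"] * hr)
    return total / pa


def regress(line, babip_3yr, hr_per_fb_3yr):
    """Return (actual wOBA, regressed wOBA) for one batter."""
    # Assumption: balls in play are held at this season's actual total.
    balls_in_play = line.ab - line.so - line.hr + line.sf
    non_hr_hits = line.singles + line.doubles + line.triples

    # Expected non-HR hits from the 3-year BABIP, split into 1B/2B/3B in the
    # same proportions the batter has shown so far this year.
    exp_non_hr = babip_3yr * balls_in_play
    scale = exp_non_hr / non_hr_hits if non_hr_hits else 0.0

    # Expected home runs from the 3-year HR/FB rate and this year's fly balls.
    exp_hr = hr_per_fb_3yr * line.fly_balls

    actual = woba(line.bb, line.hbp, line.singles, line.doubles,
                  line.triples, line.hr, line.pa)
    expected = woba(line.bb, line.hbp, line.singles * scale,
                    line.doubles * scale, line.triples * scale,
                    exp_hr, line.pa)
    return actual, expected


# Hypothetical early-season line, not any real player's stats.
sample = BatterLine(pa=83, ab=75, bb=7, hbp=1, so=13, sf=0,
                    singles=14, doubles=4, triples=1, hr=5, fly_balls=24)
actual, expected = regress(sample, babip_3yr=0.300, hr_per_fb_3yr=0.10)
print(f"actual {actual:.3f}, regressed {expected:.3f}, "
      f"difference {actual - expected:+.3f}")
```

Note that in this sketch any home runs removed by the HR/FB adjustment simply disappear from the line rather than becoming other hits or outs on balls in play. That is a simplification, and a different choice there would change the regressed number.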

I'll be doing more with this tool in later posts. For now, let me know what you think and any suggested tweaks.

Comments

If Sandoval's expected wOBA is .272

Then your numbers are wrong.

"Today I flew the most poorly dressed bad-ass that has ever entered my jet. And he borrowed my pen to do a cross word puzzle." - robotsapproach on Brian Wilson.

by hairball on Apr 27, 2011 12:29 PM EDT

For reference purposes (from FG)

Sandoval’s career wOBA: .359
Sandoval’s wOBA in his “disaster” 2010: .314
Sandoval’s ZiPS (R) for 2011: .357
Sandoval’s ZiPS (U) for 2011: .364

by hairball on Apr 27, 2011 12:36 PM EDT

It's a straight calculation based on 3YR average BABIP and HR/FB

And it’s based on how many fly balls he’s hit this year so far and the distribution of singles, doubles, and triples to date.

And as I mentioned, it doesn’t mean he should have a .272 wOBA, just that if you apply his 3-year average BABIP and HR/FB to his plate appearances so far it would look like this. But it doesn’t provide any context, which is why he likely isn’t actually a .272 wOBA guy overall.

by Bill Petti on Apr 27, 2011 12:40 PM EDT

.272 is nearly .100 points off from ZiPS

and is .42 belos the lowest wOBA he’s ever posted. Seems like useless, faulty data in predicting regression. Either that, or use different terminology, because your “wOBA” and the one ZiPS and FG are using are not apples to apples.

by hairball on Apr 27, 2011 12:43 PM EDT

*below

by hairball on Apr 27, 2011 12:43 PM EDT

Also, meant .042

by hairball on Apr 27, 2011 1:24 PM EDT

It's just a different method for getting a handle on what a guy might look like right now if certain rates are different

And wOBA is calculated the same except that the coefficients are a bit different, but not so much you’d notice.

by Bill Petti on Apr 27, 2011 12:46 PM EDT

.100 points of predicted wOBA difference is QUITE noticeable

by hairball on Apr 27, 2011 12:46 PM EDT

He’s only played 2 full seasons though. Is your projection assuming that his production will keep plummeting like it did last year?

by boonitez on Apr 27, 2011 3:49 PM EDT

Nope

Since he only had two seasons his BABIP and HR/FB average are just based on those two years (this is a straightforward average calculated by FanGraphs—no weighting, etc).

That being said, that is why I mentioned context above: his numbers will likely be better, if for no other reason than that last year was likely a fluke compared to his true talent.

It’s also why this isn’t meant to be a fully predictive tool. It just offers some context based on what hitters’ peripheral numbers have been historically and what they are to this point this year.

by Bill Petti on Apr 27, 2011 3:54 PM EDT

To be more specific

What’s driving it right now is that his 3-year HR/FB ratio is about 10% but so far this year he’s at 21%. That’s a difference of about 3 home runs. His BABIP is 15 points higher, but generally he’s had a high BABIP (.325).

by Bill Petti on Apr 27, 2011 12:44 PM EDT
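
The home run arithmetic in the comment above can be checked with a few lines of Python. This is only a rough sketch: the 5 home runs come from the stat line posted later in the thread, both HR/FB rates are the rounded figures quoted in the comment, and the fly ball total is backed out from those two numbers rather than taken from FanGraphs.

```python
actual_hr = 5            # Sandoval's HR to date, per the line posted later in the thread
hr_per_fb_2011 = 0.21    # "so far this year he's at 21%" (rounded)
hr_per_fb_3yr = 0.10     # "his 3-year HR/FB ratio is about 10%" (rounded)

# Assumption: fly balls are implied by HR and this year's HR/FB rate.
implied_fly_balls = actual_hr / hr_per_fb_2011
expected_hr = hr_per_fb_3yr * implied_fly_balls

print(round(implied_fly_balls, 1))        # 23.8
print(round(expected_hr, 1))              # 2.4
print(round(actual_hr - expected_hr, 1))  # 2.6, i.e. "about 3 home runs"
```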

You're messing up somewhere
"Pablo Sandoval should technically be in the .270 wOBA range"

That’s just wrong, and should be taken out of the article.

by hairball on Apr 27, 2011 12:46 PM EDT

Think you are taking this a bit too much to heart

Just based on the straight calculation, yes, that’s where he should be. Like I said, his HR/FB rate seems to be driving it right now.

by Bill Petti on Apr 27, 2011 12:48 PM EDT

Then the straight calculation you are using is not wOBA-related

There is no realistic scenario where a sub-.300 wOBA is “where he should be.” This is not me being a Pablo fan or a Giants homer. This is a stats discussion where I’m wanting to correct a glaring error.

You are using your numbers in comparison to wOBA. If your numbers are not the same wOBA formula, then it’s not apples to apples. If your numbers ARE essentially the same, then you calculated something wrong. It’s simple.

by hairball on Apr 27, 2011 12:53 PM EDT

It is the same wOBA formula

What’s different is how many singles, doubles, triples, and home runs he has versus how many he likely would have given his 3-year BABIP and 3-year HR/FB rate at this point in the season.

by Bill Petti on Apr 27, 2011 12:55 PM EDT

Just please do me a favor and check your numbers that you used on all of those with him

by hairball on Apr 27, 2011 12:58 PM EDT

yes, please, i know it would help make hairball's year

"If we hit that bull's eye, the rest of the dominoes will fall like a house of cards. Checkmate"

by Ivdown on Apr 30, 2011 3:34 AM EDT

How about you calm down a little

And let him write his articles however he pleases with his methods. For reference, how would you like it if someone who didn’t know you walked up to you at work and told you to change something in your work because they thought it was wrong? It would be stupid.

by Ivdown on Apr 30, 2011 3:33 AM EDT

I have to agree with hairball

In that I don’t quite understand how his expected wOBA could be .270 regardless of how you calculate it. His career wOBA is .359. League average is usually around .335. His regressed wOBA should be in the low .340s. However you calculate it, how do you come up with a difference of .07?

Blogger and Editor, Rational Pastime Blog. Twitter: @RationalPastime.

by J-Doug on Apr 27, 2011 1:10 PM EDT

J-Doug and hairball: Sure thing, happy to explain

This assumes the same distribution of non-HR’s per hit as he has so far this year.

Actual (through Monday night):
H: 22
1B: 15
2B: 2
3B: 0
HR: 5

Estimated based on 3-year BABIP (.325) and HR/FB (10%) (from FanGraphs):
H: 15
1B: 11
2B: 2
3B: 0
HR: 2

The difference in non-HR hits makes sense given his .340 vs. .325 BABIP, as does the 5 vs. 2 HRs.

This is just one way to approach it, though. I am sure there are others and, again, this isn’t necessarily predicting what the end of the year will look like but rather what we might expect the batter’s line to look like if we applied the 3YR BABIP and HR/FB to what the batter has done so far this year.

by Bill Petti on Apr 27, 2011 1:23 PM EDT

So, because his HR/FB rate is higher you're not counting 3 of those 5 HRs as expected hits?

It makes sense that you wouldn’t count them as HRs, but it doesn’t make sense you wouldn’t expect them to be hits. If your HR/FB rate goes down it could be due to homers turning into outs, but also doubles.

I can’t see any reason why regressing a .340 BABIP to .325 would take away 7 of 22 hits.

by J-Doug on Apr 27, 2011 1:29 PM EDT

For simplicity's sake, that's what I did

I suppose one way to handle it is to take half the difference in HR’s and treat them as hits, but that strikes me as just as bad or worse.

I did find an error—thanks to you both for pushing me—and with the adjustment Sandoval has an adjusted wOBA of .318.

by Bill Petti on Apr 27, 2011 1:42 PM EDT

Glad you're receptive to constructive criticism

Even if I disagree with the methodology

by hairball on Apr 27, 2011 1:45 PM EDT

In other words, I think you're discounting hitting events twice

By adjusting for both HR/FB and BABIP. I hope you’re not treating those two stats as independent of one another.

by J-Doug on Apr 27, 2011 1:31 PM EDT
