Beyond the Box Score: An SB Nation Community

Some odd results from a multivariate regression

After I wrote this post on my blog regarding FIP and its correlation to UZR, I got the idea to run a multivariate regression using each position independently (excluding C and P, since there's no UZR data for them). I got some interesting results, but I'm not really sure whether (A) I can trust the data or (B) I'm interpreting it correctly, so I thought I'd post it here.

Dependent Variable: TotalRunDiff (or TRD) = (IP/9) * (FIP - ERA)

This is the difference in earned runs projected by FIP and actual earned runs.
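As a quick illustration of the definition above, here is a minimal Python sketch. The function name `total_run_diff` and the sample team line are made up for illustration and are not real 2008 data.

```python
def total_run_diff(ip: float, fip: float, era: float) -> float:
    """TRD = (IP/9) * (FIP - ERA).

    Earned runs projected by FIP minus actual earned runs. A positive value
    means the team allowed fewer earned runs than FIP projected.
    """
    return (ip / 9.0) * (fip - era)


# Hypothetical team line (not real data): 1440 IP, 4.20 FIP, 4.00 ERA.
example = total_run_diff(1440.0, 4.20, 4.00)
print(round(example, 1))  # 160 nine-inning games * 0.20 runs per game = 32.0
```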

Independent variables: 1Buzr, 2Buzr, 3Buzr, SSuzr, LFuzr, CFuzr, RFuzr

The UZR for each team by position.

I input data for all 30 teams in 2008.   Here's the equation the regression analysis spit out:

TRD = .048 + 2.12*1Buzr + (-.10)*2Buzr + 1.60*3Buzr + .70*SSuzr + .02*LFuzr + 1.66*CFuzr + .60*RFuzr

The correlation was pretty strong; r = .8063.

This seems to imply that the most important positions, in order, are 1B, CF, 3B, SS, RF, LF, and 2B, with good defense at 2B actually having a slightly negative effect on a team (which doesn't make any sense, but this is why I will run more regressions on other seasons besides 2008).
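For anyone who wants to play with the fitted equation, here is a small Python sketch that evaluates it and sorts the positions by coefficient. The coefficients are copied from the equation in the post; the names `predict_trd`, `COEFFS`, and the sample team are assumptions for illustration only.

```python
# Coefficients copied from the fitted equation in the post (2008, 30 teams).
INTERCEPT = 0.048
COEFFS = {
    "1B": 2.12, "2B": -0.10, "3B": 1.60, "SS": 0.70,
    "LF": 0.02, "CF": 1.66, "RF": 0.60,
}


def predict_trd(uzr_by_position):
    """Evaluate the fitted equation for one team's UZR at each position."""
    return INTERCEPT + sum(COEFFS[pos] * uzr_by_position[pos] for pos in COEFFS)


# Positions ordered by coefficient, largest first.
ranking = sorted(COEFFS, key=COEFFS.get, reverse=True)
print(ranking)

# Hypothetical team (not real data): +5 UZR at first base, average elsewhere.
team = {pos: 0.0 for pos in COEFFS}
team["1B"] = 5.0
print(round(predict_trd(team), 3))
```

Sorting by raw coefficient reproduces the 1B, CF, 3B, SS, RF, LF, 2B order. Note that this ranks the size of each multiplier only; it says nothing about how precisely each one is estimated.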

Just wondering if anybody had any input on this.


Comments

The second base data could have been thrown off...

If teams with good 2B UZR in 2008 all had bad TRD, it could adversely affect your data. Of course, this would mean that most of the bad TRD teams had good UZR, and that good TRD teams had bad UZR. I don’t think it accurately reflects 2B’s impact on TRD.

A larger sample size could fix this problem.

by NoNameOnCard on Mar 4, 2009 2:54 PM EST

Where did you get your positional team data?

The FanGraphs leaderboards don’t divide up production between teams for a given player. Did you go team by team?

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Mar 4, 2009 2:59 PM EST

I went to fangraphs => teams => fielders => position

I guess that might mess things up a little bit, but I don’t think it would be that large of an issue, would it?

---
Juuuust a bit outside!!
http://www.rightfieldbleachers.com

by Jack Moore on Mar 4, 2009 3:03 PM EST

So you downloaded seven sets of data?

Yeah, that should work just fine.

Multiple seasons would be good, obviously.

by Sky Kalkman on Mar 4, 2009 3:27 PM EST

One thing to look at is the Standard Deviations between datasets (2B UZR, etc)

For example, if the 2nd base numbers are all near 0, it could mean that all teams are getting the same play from their 2nd basemen, so its value doesn’t really matter.

I am wondering how well the positional S.D. correlates with the positional multiplier in your equation.

by Jeff Zimmerman (TucsonRoyal) on Mar 4, 2009 3:34 PM EST

SDs by position:

1B   | 2B   | 3B  | SS   | LF    | CF    | RF
5.68 | 8.48 | 9.8 | 9.83 | 11.22 | 10.61 | 15.63

by Jack Moore on Mar 4, 2009 6:03 PM EST
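One way to put the SDs and the multipliers side by side is sketched below in Python. It uses only the seven SDs and seven coefficients quoted in this thread. The `pearson` helper and the "per-SD effect" (coefficient times SD, i.e. the change in TRD the equation implies for a one-SD change in UZR at that position) are illustrative choices, not something the original analysis reported.

```python
import math

# Coefficients from the fitted equation and SDs by position, both from the thread.
COEFFS = {"1B": 2.12, "2B": -0.10, "3B": 1.60, "SS": 0.70,
          "LF": 0.02, "CF": 1.66, "RF": 0.60}
SDS = {"1B": 5.68, "2B": 8.48, "3B": 9.8, "SS": 9.83,
       "LF": 11.22, "CF": 10.61, "RF": 15.63}


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


positions = list(COEFFS)
r_sd_coef = pearson([SDS[p] for p in positions], [COEFFS[p] for p in positions])
print(f"correlation between SD and coefficient: {r_sd_coef:.2f}")

# Change in TRD implied by a one-SD change in UZR at each position.
per_sd = {p: COEFFS[p] * SDS[p] for p in positions}
order = sorted(per_sd, key=per_sd.get, reverse=True)
for p in order:
    print(f"{p}: {per_sd[p]:6.2f}")
```

On these numbers the per-SD ordering comes out CF, 3B, 1B, RF, SS, LF, 2B, so first base drops behind center field and third base once its small spread is taken into account. With only seven points, the correlation itself should not be leaned on.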

Just nothing there

I was also thinking the left side might be more important since most people are right-handed, but that only applies to the infield, not the outfield.

I also looked at chances, and that doesn’t help explain 2nd base:
“first and third baseman get around 1.5 chances per game, CF, 2B and SS, 2.5, and RF and LF, 2.0.” -MGL

If you remove 2nd base from the regression, what happens to r-squared?

by Jeff Zimmerman (TucsonRoyal) on Mar 4, 2009 6:41 PM EST
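For reference when comparing the seven-variable fit against one with 2B dropped, here is a rough Python sketch that converts the reported r into r-squared and adjusted r-squared. It assumes the r = .8063 in the post is the multiple correlation coefficient from the regression output, which the post does not state explicitly. The function name is made up for illustration.

```python
def adjusted_r_squared(r: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    r2 = r * r
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


r = 0.8063          # reported in the post; assumed to be the multiple R
n_obs = 30          # 30 teams in 2008
n_predictors = 7    # seven positional UZR variables

print(round(r * r, 4))                                       # 0.6501
print(round(adjusted_r_squared(r, n_obs, n_predictors), 4))  # 0.5388
```

Plain r-squared can never go up when a variable is removed, so adjusted r-squared is the fairer number to compare between the two fits.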

Sometimes multiple regression is simply wrong.

It’s a very crude tool.

Some suggestions, however:

  • Use multiple years of data.
  • Look at all runs, not just earned runs.
  • Consider removing the constant.

by cwyers on Mar 4, 2009 8:43 PM EST

p-values

You might consider double-checking the p-values of each individual term to see if any (i.e. 2B) could be considered insignificant contributors to the dependent variable. Just a thought…

by jrfischer on Mar 5, 2009 11:53 AM EST

wait

did you run a regression on 7 independent variables using 30 observations?

by Matt Swartz on Mar 5, 2009 9:00 PM EST

hello?

just to clarify, running a regression with seven independent variables and only thirty observations is useless. if that’s what you did, it’s not even worth analyzing this. you might as well just summarize the individual players. for seven regressors, you should have 150-250 observations to be safe, i’d say. nothing much short of that.

by Matt Swartz on Mar 7, 2009 10:13 AM EST

So, 5-8 seasons' worth?

UZR’s available for seven at Fangraphs, right?

by Sky Kalkman on Mar 7, 2009 10:32 AM EST
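The arithmetic behind the sample-size point can be sketched in a few lines of Python. The 150-250 target is the rule of thumb suggested in the comment above, not a formal requirement, and the helper name `seasons_needed` is made up for illustration.

```python
import math

TEAMS_PER_SEASON = 30   # one observation per team per season
REGRESSORS = 7          # seven positional UZR variables


def seasons_needed(target_obs: int, per_season: int = TEAMS_PER_SEASON) -> int:
    """Whole seasons required to reach at least target_obs team-seasons."""
    return math.ceil(target_obs / per_season)


# One season: observations per regressor and residual degrees of freedom
# (30 observations minus 7 slopes minus 1 intercept).
print(round(TEAMS_PER_SEASON / REGRESSORS, 1))   # 4.3
print(TEAMS_PER_SEASON - REGRESSORS - 1)         # 22

# Seasons needed to hit the suggested 150-250 observations.
print(seasons_needed(150), seasons_needed(250))  # 5 9
```

Eight seasons gives 240 team-seasons, just short of the upper figure, which is where the 5-8 estimate comes from.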

OK, when I get a chance I’ll add the other seasons. Might not be for a bit as I have a packed week coming up.

by Jack Moore on Mar 7, 2009 1:31 PM EST

datum

It looks to me more like a measure of the variability in quality of the defender between teams at that position – rather than importance of the position.

1B can be slow-footed non-athletes, or extremely athletic. >>> 2B are remarkably similar, athletic, good glove – average number of plays handled.

Go away! Guys, you're gonna wake up my Mom!

by David Howards Legacy on Mar 6, 2009 4:15 PM EST

Actually

Looking at the spread of UZR talent by position, OF is by far the most varying.

vivaelbeñsheets

by vivaelpujols on Mar 9, 2009 10:27 PM EDT
