After I wrote this post on my blog about FIP and its correlation to UZR, I got the idea to run a multivariate regression treating each position independently (excluding C and P, since there's no UZR data for them). I got some interesting results, but I'm not sure whether (A) I can trust the data or (B) I'm interpreting it correctly, so I thought I'd post it here.
Dependent Variable: TotalRunDiff (or TRD) = (IP/9) * (FIP - ERA)
This is the difference between the earned runs projected by FIP and the actual earned runs allowed. A positive TRD means the team allowed fewer earned runs than FIP projected.
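For anyone who wants to check the arithmetic, here is a minimal sketch of the TRD formula above. The function name `total_run_diff` and the sample inputs are made up for illustration; they are not real 2008 team numbers.

```python
def total_run_diff(ip: float, fip: float, era: float) -> float:
    """TRD = (IP / 9) * (FIP - ERA), as defined in the post.

    ip  : team innings pitched (decimal innings, not the .1/.2 box-score notation)
    fip : team FIP
    era : team ERA
    """
    return (ip / 9.0) * (fip - era)


# Hypothetical team: 1440 IP, 4.00 FIP, 3.50 ERA (illustrative values only).
example_trd = total_run_diff(1440.0, 4.00, 3.50)
print(example_trd)  # 80.0 earned runs fewer than FIP projected
```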
Independent variables: 1Buzr, 2Buzr, 3Buzr, SSuzr, LFuzr, CFuzr, RFuzr
These are each team's UZR totals by position.
I input data for all 30 teams in 2008. Here's the equation the regression analysis spit out:
TRD = .048 + 2.12*1Buzr + (-.10)*2Buzr + 1.60*3Buzr + .70*SSuzr + .02*LFuzr + 1.66*CFuzr + .60*RFuzr
The correlation was pretty strong; r = .8063.
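In case it helps anyone reproduce this kind of fit, here is a rough sketch of an ordinary least squares regression with an intercept, using NumPy. I'm assuming that is what the regression tool did; the function name `fit_trd_regression` is hypothetical, and the data below is random placeholder data, NOT the 2008 team numbers, so the output will not match the equation above.

```python
import numpy as np

# Column order matches the independent variables listed in the post.
POSITIONS = ["1Buzr", "2Buzr", "3Buzr", "SSuzr", "LFuzr", "CFuzr", "RFuzr"]


def fit_trd_regression(uzr, trd):
    """Fit TRD = b0 + sum(b_i * UZR_i) by ordinary least squares.

    uzr : array of shape (n_teams, 7), one column per position in POSITIONS
    trd : array of shape (n_teams,)
    Returns (intercept, {position: coefficient}, multiple correlation r).
    """
    uzr = np.asarray(uzr, dtype=float)
    trd = np.asarray(trd, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(uzr)), uzr])
    coef, *_ = np.linalg.lstsq(X, trd, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((trd - fitted) ** 2)
    ss_tot = np.sum((trd - trd.mean()) ** 2)
    # Multiple r is the square root of R^2; clamp at 0 for float safety.
    r = float(np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)))
    return float(coef[0]), dict(zip(POSITIONS, coef[1:])), r


# Placeholder inputs only: 30 fake "teams" with random UZR and TRD values.
rng = np.random.default_rng(0)
fake_uzr = rng.normal(0.0, 5.0, size=(30, 7))
fake_trd = fake_uzr @ np.ones(7) + rng.normal(0.0, 10.0, size=30)

intercept, slopes, r = fit_trd_regression(fake_uzr, fake_trd)
print(f"intercept = {intercept:.3f}, r = {r:.4f}")
for name, value in slopes.items():
    print(f"{name}: {value:.2f}")
```

To use it on real data, replace `fake_uzr` and `fake_trd` with the actual 30-by-7 table of team UZR values and the 30 TRD values.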
This seems to imply that the most important positions, in order, are 1B, CF, 3B, SS, RF, LF, and 2B, with good defense at 2B actually having a slightly negative effect on a team (which doesn't make any sense, but this is why I will run more regressions on other seasons besides 2008).
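As a quick sanity check on that ordering, this small sketch sorts the coefficients from the equation above from largest to smallest. The variable names are just for illustration, and it only reproduces the ranking read off the raw coefficients; it says nothing about whether the differences are statistically meaningful.

```python
# Coefficients copied from the regression equation in the post.
coefficients = {
    "1B": 2.12,
    "2B": -0.10,
    "3B": 1.60,
    "SS": 0.70,
    "LF": 0.02,
    "CF": 1.66,
    "RF": 0.60,
}

# Sort position labels by coefficient, largest first.
ranked = sorted(coefficients, key=coefficients.get, reverse=True)
print(", ".join(ranked))  # 1B, CF, 3B, SS, RF, LF, 2B
```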
Just wondering if anybody had any input on this.