Beyond the Box Score: An SB Nation Community



Value of OBP and SLG by Lineup Position

There is an updated version of this at Value of OBP and SLG by Lineup Position, Part 2.

One question that often comes up is "what is the relative value of on-base percentage (OBP) and slugging percentage (SLG)?" Is OBP 50% more important than SLG? Or 60%? Or something else? A stat called OPS simply adds the two, giving them equal weight. But maybe the weight should not be equal. For example, here is the regression equation of team runs per game for the years 2001-03:

R/G = 17.11*OBP + 11.13*SLG - 5.66
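As a quick check, the equation can be evaluated directly. This is only a sketch: the .340 OBP and .420 SLG below are a hypothetical team line chosen for illustration, not data from the study, and the function name is made up here.

```python
def runs_per_game(obp: float, slg: float) -> float:
    """Team runs per game from the 2001-03 regression quoted above."""
    return 17.11 * obp + 11.13 * slg - 5.66


# Hypothetical team line, used only to show the equation at work.
example = runs_per_game(0.340, 0.420)
print(f"{example:.2f} runs per game")

# Relative weight of OBP versus SLG implied by the two coefficients.
weight_ratio = 17.11 / 11.13
print(f"OBP weight / SLG weight = {weight_ratio:.3f}")
```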

This makes OBP about 54% more important than SLG (17.11/11.13 ≈ 1.54), a fairly typical result. But OBP might matter more at certain positions in the lineup, such as leadoff, and SLG might matter more for the cleanup hitter. To check this, I ran a regression in which team runs per game was the dependent variable (DV) and the OBP and SLG of each lineup slot were the independent variables (IVs). OBP1 means the OBP of the leadoff batter, SLG3 means the SLG of the third-place hitter, and so on. I used data from Retrosheet, which shows each team's stats by lineup position, for the 1989-2002 seasons. Below are the coefficient values for the IVs.

There is quite a bit of variation. A point of OBP from the leadoff man is worth about .003 runs per game, so a .021 increase in the leadoff OBP would be worth about .063 more runs per game, or about 10 over a whole season, which usually means about 1 win. The value of OBP is much less for the number 8 man. For the leadoff man, OBP is three times as important as SLG. For the cleanup hitter, they are almost the same. So this analysis suggests that the relative values of OBP and SLG could differ depending on the batter's lineup position.
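The leadoff arithmetic above can be written out step by step. This is a sketch under stated assumptions: the .003 figure is the rounded leadoff value from the text, 162 is the standard MLB schedule, and 10 runs per win is the rule of thumb the text implies. The constant and function names are invented for the example.

```python
RUNS_PER_GAME_PER_OBP_POINT = 0.003  # leadoff slot, rounded value from the text
GAMES_PER_SEASON = 162               # standard MLB schedule
RUNS_PER_WIN = 10                    # rule of thumb implied by the text


def season_runs_added(obp_gain: float) -> float:
    """Extra runs over a season from raising the leadoff OBP by obp_gain."""
    points = obp_gain * 1000  # one "point" of OBP is .001
    return points * RUNS_PER_GAME_PER_OBP_POINT * GAMES_PER_SEASON


runs = season_runs_added(0.021)
wins = runs / RUNS_PER_WIN
print(f"{runs:.1f} runs, about {wins:.1f} wins")
```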

Mark Pankin has already looked at this issue using a tool called Markov chains. He presented his results at the SABR convention in 2004. His study is online at:

http://www.pankin.com/sabr34.pdf

There could be multicollinearity in my analysis, meaning that the coefficient estimates are less reliable than they could be because the IVs are highly correlated with each other. I discuss what I did to detect multicollinearity below. In case it was a problem, I also tried a different but similar model in which the IVs would likely be less correlated with each other.

Each lineup slot had 3 variables: walk percentage, hit percentage and extra-base percentage. For walks, hits, and extra bases, the denominator was plate appearances (PAs). This is a little different from comparing OBP and SLG, since OBP has PAs as the denominator and SLG has at-bats (ABs). Also, using extra bases makes the measure a little like isolated power. SLG is not always a good measure of power because a guy who hits a single drives up his SLG. Isolated power is SLG - AVG, or extra bases divided by ABs. Here, of course, I am using PAs instead. H1 is the hit% of the leadoff man, W1 is the walk% of the leadoff man, XB1 is the extra-base% of the leadoff man, etc. Here are the coefficient estimates:
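For concreteness, here is one way the three per-PA rates could be computed from a lineup slot's counting stats. It is a sketch with assumptions: the stat line is hypothetical, the class and field names are invented, extra bases are taken as total bases minus hits, and the walk rate counts walks only (the text does not say how hit-by-pitches were handled).

```python
from dataclasses import dataclass


@dataclass
class SlotLine:
    """Season totals for one lineup slot (hypothetical field names)."""
    pa: int
    ab: int
    hits: int
    doubles: int
    triples: int
    homers: int
    walks: int

    def extra_bases(self) -> int:
        # Bases beyond first: total bases minus hits.
        return self.doubles + 2 * self.triples + 3 * self.homers

    def rates(self) -> dict:
        # All three rates use plate appearances as the denominator.
        return {
            "W": self.walks / self.pa,
            "H": self.hits / self.pa,
            "XB": self.extra_bases() / self.pa,
        }


# Made-up leadoff line, for illustration only.
leadoff = SlotLine(pa=700, ab=620, hits=180, doubles=30,
                   triples=5, homers=15, walks=70)
rates = leadoff.rates()

# Isolated power uses at-bats instead: SLG - AVG = extra bases / AB.
avg = leadoff.hits / leadoff.ab
slg = (leadoff.hits + leadoff.extra_bases()) / leadoff.ab
iso = leadoff.extra_bases() / leadoff.ab
print(rates, round(iso, 3))
```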

Again, there are some big differences. The value of a walk to the leadoff man is twice what it is for the number 6 man. The cleanup hitter has the highest extra-base value.

I did try some other variables. I had SBs and CS per game in the first model with OBP and SLG. Things were generally fine there except that in a couple of cases, the value of a CS was positive and in one case the value of a SB was negative. Why some lineup slots would have negative values for SBs or positive values for CS is not clear. I tried one regression with just the AL since they have the DH and a regular player bats ninth. The results seemed about the same. Email me if you want those.

Multicollinearity. In the first model with OBP and SLG, most of the correlations between the IVs were under .5. The ones that were higher were all between the OBP and SLG of the same lineup position. Those correlations ranged from .596 (OBP1 and SLG1) to .739, except for OBP9 and SLG9, which was very high, at .897. But in the second model, only one correlation between IVs was over .5, and that was H9 and XB9 at .648. The vast majority of the others were under .2.
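The pairwise check described above is a plain Pearson correlation between two IV columns. A minimal sketch follows. The five team-season values are made up for illustration and are not Retrosheet data, and the .5 cutoff simply mirrors the threshold used in the text.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical leadoff OBP and SLG for five team-seasons.
obp1 = [0.340, 0.355, 0.325, 0.360, 0.335]
slg1 = [0.400, 0.430, 0.380, 0.410, 0.395]

r = pearson(obp1, slg1)
flagged = abs(r) > 0.5  # same cutoff the text uses
print(f"r = {r:.3f}, flagged = {flagged}")
```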

Another way to check for multicollinearity is to run regressions in which each IV in turn is regressed on all of the other IVs. In the first model with OBP and SLG, that meant 18 regressions, and the r-squared was generally in the .5-.6 range. R-squared tells us what percentage of the variation in the DV is explained by the model. There is a stat called the "variance inflation factor" or VIF. It is 1/(1 - r-squared). So if r-squared is .5, then 1 - .5 = .5 and 1/.5 = 2. A couple of sources I looked at suggested that if the VIF is under 10, multicollinearity is not a problem. Most of these were about 2. One got close to 6 (that was SLG9). I did come across one source that said there is no firm rule relating the value of VIF to multicollinearity.
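The VIF formula is simple enough to check by hand, but a small sketch makes the relationship easy to explore. The function names are invented here, and the inputs are the r-squared and VIF values mentioned in this post.

```python
def vif(r_squared: float) -> float:
    """Variance inflation factor for an auxiliary-regression r-squared."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return 1 / (1 - r_squared)


def r_squared_from_vif(v: float) -> float:
    """Invert the formula: the r-squared that produces a given VIF."""
    return 1 - 1 / v


for r2 in (0.2, 0.4, 0.5, 0.6):
    print(f"r-squared {r2:.1f} -> VIF {vif(r2):.2f}")

# A VIF close to 6 corresponds to an r-squared of roughly .83.
print(f"VIF 6 -> r-squared {r_squared_from_vif(6):.2f}")
```

Read this way, a VIF of 10, the cutoff some sources suggest, corresponds to an auxiliary r-squared of .9.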

For the second model, I only ran a couple of these regressions where one IV depended on all the others. The first one was W1 and the r-squared was only about .2. I tried XB9 (which corresponds a little to SLG9, the one that was closest to being a problem in the other model) and the r-squared was only about .4, which would mean a very low VIF of about 1.7.

Also, multicollinearity is supposed to show up as high standard errors for the coefficient estimates, which makes it hard for the estimates to be significant. But that was generally not the case here. One thing I don't know is whether there is some kind of joint test involving the VIFs. Maybe with a large number of IVs, it only takes a certain number of them having a VIF over 2, or something like that, for there to be a problem.

Comments

Additional Comments
Comments can be found at Baseball Think Factory

As always, thanks to Repoz for the link.

"I don't set the rosters, I just make fun of the guy who does" - Rob Neyer

by Marc Normandin on Feb 13, 2006 12:56 PM EST

Interesting
So if the leadoff hitter's OBP is more important than his slugging, did the Red Sox make a mistake by using Damon as the leadoff man, with his home run power and league average-ish OBP? I'm not sure there was another viable option in the lineup, but work with the scenario.

by Marc Normandin on Feb 13, 2006 12:57 PM EST

Not as much as they are about to
assuming the media gets its way with batting Crisp leadoff.

Loretta and Youk would make a much better 1/2 combo than Crisp and either.

The Sox's biggest mistake last year was batting Renteria second, when he had the lowest OBP on the team and one of the highest GIDP totals. Damon, while not one of the team leaders in OBP, was among the lowest of the regulars in SLG, so it wasn't much of a waste there. I expect that having a strong 1-9 lineup also increases the value of SLG from the #1 slot.

by cdamon on Feb 13, 2006 2:25 PM EST

Damon
You raise an interesting point. To whatever extent the values of OBP and SLG that this method found are true, can they be used to improve actual lineups? I don't know off the top of my head. Maybe it would be simple, like just finding which guy is the best in each slot based on their OBP and SLG. Plug the numbers in for the leadoff spot. The guy who comes out highest should bat first. But what if he also comes out highest at another position?

Or you could check each guy and see what his best spot is. But what if two guys both have cleanup as their best slot? Maybe this would have to be done by trial and error after some initial calculations. Or maybe there is some kind of program or equation or algorithm that would do it. I certainly don't know right now. I'll have to think about it.

by Cyril Morong on Feb 13, 2006 5:10 PM EST

straightforward optimization problem
Each player has a value for each lineup spot. Even the brute force solution for this only has 9! lineup possibilities to check, meaning it can be done on a computer in less than a second.

by cdamon on Feb 13, 2006 9:09 PM EST
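The brute-force search described in the comment above can be sketched as follows. This is not anyone's actual tool. It assumes a lineup's value is just the sum of each player's value in his assigned slot, and the 3-player value table is made up to keep the example small. A real lineup would use a 9x9 table, which gives 9! = 362,880 orderings to check.

```python
from itertools import permutations


def best_lineup(values):
    """Try every ordering; values[p][s] is player p's value in slot s.

    Returns (order, total), where order[s] is the player batting in slot s.
    """
    n = len(values)
    best_total = float("-inf")
    best_order = None
    for order in permutations(range(n)):
        total = sum(values[player][slot] for slot, player in enumerate(order))
        if total > best_total:
            best_total, best_order = total, order
    return best_order, best_total


# Hypothetical values: players 0 and 1 both do best in slot 0,
# so the search has to decide who actually gets it.
values = [
    [5, 4, 1],  # player 0
    [6, 2, 3],  # player 1
    [2, 5, 4],  # player 2
]
order, total = best_lineup(values)
print(order, total)
```

Under the same additive assumption this is the classic assignment problem, so a dedicated solver could replace the exhaustive loop, but at nine players brute force is already cheap.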

Maybe...
Sal can run his simulator (cough, cough) using some of the options. Unless he's too busy with academia, which is entirely possible.

by Marc Normandin on Feb 13, 2006 10:58 PM EST

straightforward optimization problem
Is that something that needs to be programmed or can it be set up in a spreadsheet?

by Cyril Morong on Feb 13, 2006 11:05 PM EST

Programmed
unless someone who knows Excel much better than I knows some trick.

I can probably write you a tool to do it if you are seriously interested and don't have the expertise.

by cdamon on Feb 14, 2006 9:06 AM EST

I'll see what my
sim can do, but the generally accepted result, and the one my prelim results suggest, is that batting order doesn't matter all that much.

by salb918 on Feb 16, 2006 12:05 AM EST
