Beyond the Box Score: An SB Nation Community


Value of OBP and SLG by Lineup Position, Part 2

UPDATE ON FEB. 22: I CAN SEND MY DATA TO ANYONE WHO WANTS IT. JUST SEND ME AN EMAIL (cyrilmorong@sbcglobal.net).

Last week I posted some regression results in which team runs per game was the dependent variable (DV) and the OBP and SLG of each lineup position were the independent variables (IVs). In some cases, the coefficient values were very different from one lineup position to another. I only looked at teams from 1989-2002. Retrosheet has data on how each team's lineup positions did from 1959-2004 (but not the NL in 1959). So I went back and used all of the data that they have.
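For readers who want to try this kind of regression themselves, here is a minimal sketch in Python with NumPy. It is not the program used for this post. The function name `fit_lineup_regression` is hypothetical, and the data are synthetic numbers with made-up coefficients, not Retrosheet data. It only shows the setup: one OBP column and one SLG column per lineup slot, plus an intercept, fit by ordinary least squares.

```python
import numpy as np

def fit_lineup_regression(obp, slg, runs_per_game):
    """OLS of team runs per game on OBP and SLG for each of the 9 lineup slots.

    obp, slg: arrays of shape (n_teams, 9), one column per lineup slot.
    runs_per_game: array of shape (n_teams,).
    Returns (intercept, obp_coefs, slg_coefs).
    """
    obp = np.asarray(obp, dtype=float)
    slg = np.asarray(slg, dtype=float)
    y = np.asarray(runs_per_game, dtype=float)
    # Design matrix: intercept column, then OBP1..OBP9, then SLG1..SLG9.
    X = np.column_stack([np.ones(len(y)), obp, slg])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0], beta[1:10], beta[10:19]

# Synthetic demonstration data (an assumption for illustration only).
rng = np.random.default_rng(0)
n_teams = 200
obp = rng.uniform(0.28, 0.38, size=(n_teams, 9))
slg = rng.uniform(0.33, 0.50, size=(n_teams, 9))
true_obp = np.linspace(2.0, 1.0, 9)  # made-up coefficients
true_slg = np.linspace(1.0, 1.5, 9)  # made-up coefficients
# Noiseless runs per game, so the fit should recover the made-up coefficients.
runs = -5.0 + obp @ true_obp + slg @ true_slg

intercept, obp_coefs, slg_coefs = fit_lineup_regression(obp, slg, runs)
print(np.round(obp_coefs, 3))
print(np.round(slg_coefs, 3))
```

With real team seasons the runs column would be noisy and the slots would be correlated with each other, so the estimates would be far less tidy than in this toy case.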

Here are the results:

Again, the coefficient values are not the same and vary quite a bit. I also ran the regression using only teams that had the DH (yes, sometimes the NL gets to use the DH; I mean AL teams only, from 1973-2004). Here are those results:

You can judge for yourself if there are any differences when using only DH teams (where a real hitter bats 9th and not a pitcher).

Then I went back to all teams and put in SB and CS per game. Here are the results:

I know, in some cases things don't make sense. We see some negative values for SBs and positive values for CSs. I mentioned this last week. But I just wanted people to get a chance to look at this.

Then I did the SB/CS regression for DH teams only. Here are the results:

Again, judge for yourself if the differences are meaningful.

Now there could be collinearity between the IVs. I discussed this a little last week. I have not run any test for it yet this time. If I do, I will update this story. To avoid or lessen this problem, I ran a regression with some different variables.

Each lineup slot had 3 variables: walk percentage, hit percentage and extra-base percentage. For walks, hits and extra bases, the denominator was plate appearances (PAs). This is a little different from comparing OBP and SLG, since OBP has (essentially) PAs as the denominator and SLG has ABs. Also, using extra bases is a little like using isolated power. SLG is not always a good measure of power because a guy who hits a single drives up his SLG. Isolated power is SLG - AVG, or extra bases divided by ABs. Of course, here I am using PAs. H1 is the hit% of the leadoff man, W1 is the walk% of the leadoff man, XB1 is the extra-base% of the leadoff man, etc. Here are the coefficient estimates:

Then I added SB and CS in. Here are those results:
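To make the hit%, walk% and extra-base% definitions above concrete, here is a small Python sketch. The function name `slot_rates` and the season totals are hypothetical, and it assumes walk% counts walks only (no hit-by-pitch). It also computes conventional isolated power, so you can see that SLG - AVG equals extra bases divided by ABs, and that adding a single raises SLG but not the extra-base count.

```python
def slot_rates(pa, ab, h, bb, tb):
    """Per-PA rates for one lineup slot, plus AB-based stats for comparison."""
    xb = tb - h  # extra bases: total bases beyond one base per hit
    return {
        "H": h / pa,     # hit%
        "W": bb / pa,    # walk% (walks only; an assumption)
        "XB": xb / pa,   # extra-base%
        "AVG": h / ab,
        "SLG": tb / ab,
        "ISO": xb / ab,  # isolated power, equal to SLG - AVG
    }

# Hypothetical season totals for a leadoff slot (not real data).
leadoff = slot_rates(pa=750, ab=660, h=185, bb=75, tb=270)

# The same slot with one extra single: one more PA, AB, hit and total base.
with_single = slot_rates(pa=751, ab=661, h=186, bb=75, tb=271)

print(leadoff["SLG"], with_single["SLG"])  # SLG goes up
print(leadoff["ISO"], with_single["ISO"])  # ISO does not
```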

Comments

Hmm
As always, I can't tell if that is sarcasm or not.
"I don't set the rosters, I just make fun of the guy who does" - Rob Neyer

by Marc Normandin on Feb 19, 2006 10:11 PM EST

My guess is sarcasm...
But I dunno.

by Dan Scotto on Feb 19, 2006 10:14 PM EST

Power problems (the stats kind)
This is really interesting stuff.
One concern: given your sample size (big), small variances may have a tendency to come out as more significant than they really are. This does not appear to be a problem for your H, W, XB terms, but I would take a hard look at the SB and CS results.
I'm just speculating, but maybe stolen bases shouldn't be assigned to batting order positions.  I see the theoretical basis for doing so in the H, W, XB case (if only because your leadoff guy is more likely to be followed by a power threat); do you have a reason to think that a stolen base by a given batting order position would be worth more?
Have you tried using a more aggregate SB term?

by sunandrain on Feb 20, 2006 11:11 AM EST

SBs, etc
Thanks for your comments.

I only put the SB/CS results there in case anyone was interested. I really don't know what to make of them.

It is possible that a SB in front of a guy who gets a lot of hits (or just more singles, which I did not include) is more valuable than a SB in front of a guy who walks a lot. That is the theoretical justification. Maybe it is a weak one.

Can you give me an idea of an aggregate SB term to try? Should I just try total SB/CS for each team instead of each lineup slot? Or maybe for the top 3, the middle 3 and last three batters?

by Cyril Morong on Feb 20, 2006 11:39 AM EST

This may be a repost - if so, ignore
Well, hmm...
If I were to test your theory, I'd do it a little differently.  The question is whether an attempted steal changes the number of outs produced by the baserunner/batter combination, right?  We can assume that fewer outs = more runs.  Could your dataset isolate this kind of combination (OBP of batter at the time of attempted steal, and outcome)?  That would be kind of cool to know.
In terms of an aggregate, I had thought of team aggregate and top and bottom of lineup.  Thirds as you suggest make more sense.  Basically, this becomes a control variable for what you demonstrate with the other variables.
Incidentally, your blog is really great.  How long have you been doing this?

by sunandrain on Feb 20, 2006 1:02 PM EST

Stealing
Actually, this blog was started and is run by Marc Normandin. I agree he does a great job.

I am not trying to see whether an attempted steal changes the number of outs produced by the baserunner/batter combination. Some of my earlier posts touch on that. Three people who have done good work on this are Tom Ruane, Ted Turocy and Mark Pankin. Also, I think this issue is discussed in "The Book" that is coming out soon by Tangotiger and others (mine has not come yet). This data set will not isolate this kind of combination (OBP of batter at the time of attempted steal, and outcome). But you should search for those three guys on Google to see what they have come up with.

by Cyril Morong on Feb 20, 2006 3:01 PM EST

The Book
I need to order a copy of that next paycheck. Thanks for the compliment Cy and Sunandrain. Always glad to see new readers and users around these parts.

by Marc Normandin on Feb 20, 2006 3:08 PM EST

Quick request
Cy, I hate to send you back to look at more work, but what does it look like for just the past 25 years? I'm currently reading Dayn Perry's new book, and he analyzes the playoff teams from the past quarter century to do his analysis on what exactly makes a team great nowadays. Just thought I'd ask, to see how different the data may look from the other two presented samples.

by Marc Normandin on Feb 20, 2006 2:59 PM EST

Quick request
That probably should not take too long once I get home. Just a matter of sorting, copying, pasting and telling the program to run the regression.

What is this new book?

by Cyril Morong on Feb 20, 2006 3:02 PM EST

Re: book
"Winners: How Good Baseball Teams Become Great Ones (and It's Not the Way You Think)"

It has been really interesting thus far, and I'll most likely throw an advertisent on the sidebar for it when I am finished. Dayn has some interesting ideas that he backs up with plenty of stat analysis, and he's entertaining just speaking in terms of his writing style as well.

by Marc Normandin on Feb 20, 2006 3:06 PM EST

Advertisement*

by Marc Normandin on Feb 20, 2006 3:07 PM EST

DH SLG3?
Is the SLG3 of the DH-only leagues really only 0.83?  That seems intuitively wrong to me.

I ran this on the A's using my script over on Catfish Stew, and it produces a lineup that consistently puts Mark Kotsay batting third.  That doesn't seem right.

by kenarneson on Feb 21, 2006 4:39 PM EST

This is pretty radical, actually.
If these numbers are correct, the #3 slot is less important than the #7 and #8 slots in the order when there's a DH.

Here are the types it suggests are ideal.  I'll add up the OBP and SLG scores, just as a quick and dirty way to compare each slot's relative importance using this data:

#1: High OBP, Low SLG (3.55)
#2: Mid OBP, High SLG (3.34)
#3: Mid OBP, Low SLG (2.48)
#4: High OBP, Mid SLG (3.38)
#5: Low OBP, High SLG (2.44)
#6: Low OBP, High SLG (2.41)
#7: High OBP, Low SLG (3.16)
#8: High OBP, Low SLG (3.06)
#9: Mid OBP, Low SLG (2.22)

by kenarneson on Feb 21, 2006 7:21 PM EST

DH SLG3?
Ken, you raise a good point and I don't really have an answer. I can run the numbers again when I get home. It does seem like a strange result. Maybe it is because the cleanup batter usually has a high SLG, too. So if the third guy hits a single instead of a double and the cleanup guy hits a HR, the third place guy scores no matter what. That's the best I can come up with right now.

My main concern was seeing if the values for OBP and SLG would differ with lineup position. I think this shows that they do even if what I found are not the "true" values. You might take a look at what Mark Pankin found on this using Markov chains.

by Cyril Morong on Feb 22, 2006 11:58 AM EST

DH SLG3?
I reran the numbers after I made some corrections in the data (some team runs per game figures were misaligned). The value of SLG for the third-place hitter changed in the DH-only cases. It went from about .8 to 1.3. Sorry for any trouble this caused.

by Cyril Morong on Feb 22, 2006 10:44 PM EST

Check your e-mail

by Marc Normandin on Feb 25, 2006 1:24 PM EST
