Playing in Parks: Component Park Factors 2006-2010

Park Factors are numbers we use to understand, and sometimes also to adjust for, the effects of parks on hitter or pitcher statistics.  Much of the time, we tend to be interested in park factors at the level of runs per game: hitters' parks will produce more runs per game than pitchers' parks, given the same players.  When we're just looking at player value, runs park factors really are the most important and useful.  They allow us to understand a player in the context of his run environment.

If we really want to understand a park, however--or how a hitter will fare in a given park--we need to look at component park factors.  They allow us to answer this question: what effect do parks have on individual events like singles, doubles, strikeouts, etc?  This offseason, I was tasked with writing an article on Great American Ball Park for a Reds annual that will be published this spring (by Maple Street Press--watch for it!), and so I decided to take my first stab at component factors.  The data that follow are the result.  To my knowledge, the only other comparable, updated source for these kinds of park factors is Statcorner.  I don't know how they calculate theirs, so they might be better or they might be worse--they do give you splits by LHB and RHB, which is helpful.  But I have fun doing things myself now and then.  So here they are:

The table is sortable if you click in the header, so you can look at park rankings by each of these park factors.

Team Years R/G PA/G SO/PA BB/PA 1B/BIP 2B/BIP 3B/BIP HR/BIP SB/G CS/G TB/BIP BIP/PA
COL 5 1.09 1.01 0.95 0.98 1.02 1.04 1.18 1.08 0.98 0.99 1.05 1.02
ARI 5 1.06 1.00 0.97 1.00 1.00 1.06 1.33 1.04 0.95 1.07 1.04 1.01
CHC 5 1.06 1.00 1.01 1.00 1.01 1.03 1.00 1.06 0.99 1.02 1.03 1.00
CHW 5 1.04 1.00 1.01 1.04 0.98 0.98 0.82 1.14 1.00 1.00 1.02 0.99
BOS 5 1.04 1.00 0.98 0.99 1.00 1.16 0.95 0.93 0.95 0.97 1.01 1.01
TEX 5 1.03 1.01 0.99 1.00 1.00 1.01 1.14 1.05 1.01 1.04 1.02 1.00
NYY 2 1.03 1.00 0.99 1.02 1.00 0.96 0.81 1.15 1.06 1.02 1.03 1.00
CIN 5 1.03 1.00 1.01 1.01 0.99 1.00 0.94 1.11 0.99 1.01 1.03 0.99
BAL 5 1.03 1.01 0.97 0.97 1.02 0.97 0.89 1.09 1.00 1.06 1.02 1.01
KCR 5 1.02 1.01 0.96 1.01 1.01 1.06 1.12 0.93 0.93 1.00 1.00 1.01
PHI 5 1.01 0.99 1.01 0.99 1.00 1.00 0.90 1.07 0.99 0.92 1.02 1.00
DET 5 1.01 1.00 0.97 0.98 1.00 0.97 1.08 0.99 0.97 0.97 1.00 1.01
FLA 5 1.01 1.01 1.05 1.04 1.01 1.03 1.06 0.98 0.99 1.10 1.01 0.98
WSN 3 1.00 1.01 0.98 0.98 0.99 1.00 0.99 0.98 0.98 1.02 0.99 1.01
SFG 5 1.00 1.00 0.98 0.98 1.01 1.02 1.08 0.93 1.06 1.00 0.99 1.01
TOR 5 1.00 1.00 1.03 1.00 0.97 1.01 1.19 1.08 0.99 0.90 1.02 0.99
LAA 5 0.99 1.00 0.99 0.98 1.02 0.99 0.84 0.96 0.99 0.98 0.99 1.00
PIT 5 0.99 1.01 0.96 0.97 1.00 1.02 0.92 0.91 0.95 1.06 0.98 1.01
MIL 5 0.99 1.00 1.03 1.03 0.98 1.01 0.97 1.04 1.00 1.04 1.00 0.99
CLE 5 0.98 1.00 1.01 1.02 1.01 1.00 0.87 0.94 1.03 1.04 0.98 1.00
ATL 5 0.98 1.00 1.02 1.01 1.01 0.96 0.99 0.98 1.04 1.03 0.99 0.99
MIN 1 0.98 0.99 0.96 1.05 1.01 1.05 1.08 0.83 0.92 0.93 0.97 1.01
HOU 5 0.98 1.00 1.03 0.99 0.99 0.99 1.07 1.06 0.99 0.96 1.01 0.99
LAD 5 0.98 1.00 1.02 0.99 1.01 0.98 0.76 1.01 1.13 1.00 0.99 1.00
STL 5 0.97 1.00 0.98 1.01 1.01 0.96 0.94 0.91 1.02 0.98 0.97 1.00
TBR 5 0.97 0.99 1.01 1.01 0.99 0.96 1.15 1.00 1.05 1.04 0.99 1.00
OAK 5 0.96 0.99 0.99 1.00 0.98 0.97 1.01 0.93 1.02 0.91 0.96 1.00
NYM 2 0.96 0.99 1.01 1.01 0.99 0.97 1.13 0.95 1.03 0.93 0.98 1.00
SEA 5 0.96 1.00 1.04 1.03 1.00 0.98 0.90 0.95 1.02 1.09 0.98 0.99
SDP 5 0.91 0.99 1.05 1.03 0.98 0.90 1.02 0.92 0.97 1.00 0.95 0.98

*BIP = Ball In Play, includes all balls hit into play INCLUDING home runs, or AB-K+SF
*BB = Non-intentional Bases on Balls

Some words on the methods and rationale...

These are calculated as Patriot describes in his post, minus the regression.  At its heart, the Runs park factor is basically just:

[Runs Per Game at Home] / [Runs Per Game in League as Whole]

The denominator is estimated primarily by runs per game in away games, though there's an adjustment that includes the home team as part of the league as a whole.  The raw ratios' deviations from 1.00 are also halved (a team plays only half its games at home), so you can apply them to season data and not just home splits.  Again, I highly recommend Patriot's article for discussion of these issues.
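
For anyone who wants to play along at home, here is a minimal Python sketch of that calculation. It assumes the league adjustment simply counts the home park as one of `n_teams` parks in the league average, and that the last step averages the raw ratio with 1.00. The function and argument names are mine, the inputs are made up, and Patriot's article remains the place to go for the actual details.

```python
def runs_park_factor(home_rpg: float, road_rpg: float, n_teams: int) -> float:
    """Sketch of a runs park factor from home/road splits.

    home_rpg: runs per game (both teams combined) in the team's home games
    road_rpg: runs per game (both teams combined) in the team's road games
    n_teams:  number of teams (parks) in the league
    """
    # Assumption: road games stand in for the other n_teams - 1 parks, and the
    # home park counts as one park in the league-wide average.
    league_rpg = (home_rpg + (n_teams - 1) * road_rpg) / n_teams
    raw_ratio = home_rpg / league_rpg
    # Assumption: halve the deviation from 1.00, since only half of a team's
    # games are played at home.
    return (raw_ratio + 1.0) / 2.0


# Hypothetical inputs, not real data.
print(round(runs_park_factor(home_rpg=10.0, road_rpg=9.0, n_teams=16), 3))
```

A park where home and road scoring match comes out at exactly 1.00, which is a handy sanity check.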

The one place where I extend beyond Patriot's article here is that I'm doing park factors not just for runs, but also for most of the important events that occur in a ballgame: singles, doubles, homers, walks, strikeouts, etc.  One thing that you will note is that I don't always do everything "per game" as I do with runs.  The park factor for singles, for example, is based on singles per ball in play.  Why?  If we just did everything per game, we would allow other events to influence our estimate of how a park affects our focal event.  Say, for example, that we were looking at home runs in a park that is neutral for home runs, but is otherwise a hitter's park (permissive to singles, doubles, etc).  Because it's a hitter's park, there will be more plate appearances than average at home, as outs happen less often per PA.  And because of those extra PA's, you will get more home runs in the park--but it's not because of an effect on homers, per se, it's because you get more opportunities to hit one.  By looking at home runs per ball hit into play, I'm focusing specifically on the effects of a park on balls that are struck and hit "fair."  Ideally, I might use only air balls--and perhaps only air balls hit by a left-handed batter vs. a right-handed batter--but I'm not there yet.
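
To make the denominator argument concrete, here is a small sketch with made-up totals. The `Split` class and `raw_ratio` helper are hypothetical names, the numbers are invented so that home runs per ball in play are identical at home and on the road, and the ratio is the bare home/road ratio without the league or half-season adjustments.

```python
from dataclasses import dataclass


@dataclass
class Split:
    """Totals for all batters (home and visiting) in one set of games."""
    games: int
    ab: int
    so: int
    sf: int
    hr: int

    @property
    def bip(self) -> int:
        # Balls in play as defined in the table footnote: AB - K + SF.
        return self.ab - self.so + self.sf


def raw_ratio(home_rate: float, road_rate: float) -> float:
    """Unadjusted home/road ratio (no league or half-season adjustment)."""
    return home_rate / road_rate


# Hypothetical totals: same HR per ball in play, but more balls in play at home.
home = Split(games=81, ab=5700, so=1100, sf=50, hr=186)
road = Split(games=81, ab=5500, so=1100, sf=50, hr=178)

hr_per_game = raw_ratio(home.hr / home.games, road.hr / road.games)
hr_per_bip = raw_ratio(home.hr / home.bip, road.hr / road.bip)
print(f"HR/G ratio:   {hr_per_game:.3f}")
print(f"HR/BIP ratio: {hr_per_bip:.3f}")
```

In this toy case the per-game ratio sits above 1 even though the park does nothing to a struck ball, which is the confound the per-BIP denominator is meant to remove.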

So, if you were going to use these data on a player (e.g. to figure out how Adrian Beltre might hit in TEX vs. BOS), you'd first want to adjust PA's, then balls in play per PA, and then finally adjust home runs per ball in play.  Most of the time, it probably won't give you a different answer than home runs per game.  But sometimes it will.
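
Here is a rough sketch of that chain for the BOS-to-TEX case. It assumes the three factors simply multiply, uses the unregressed values from the table, and ignores handedness and any change in league quality. The 28-homer season is a hypothetical input, and `translate_hr` is just an illustrative name.

```python
# Unregressed factors copied from the table above: (PA/G, BIP/PA, HR/BIP).
FACTORS = {
    "BOS": (1.00, 1.01, 0.93),
    "TEX": (1.01, 1.00, 1.05),
}


def translate_hr(hr: float, old_park: str, new_park: str) -> float:
    """Rescale a season HR total from one home park to another."""
    scale = 1.0
    for old, new in zip(FACTORS[old_park], FACTORS[new_park]):
        # Divide out the old park's effect, then apply the new park's.
        scale *= new / old
    return hr * scale


# Hypothetical season total for a hitter whose home park was Fenway.
print(round(translate_hr(28, "BOS", "TEX"), 1))
```

Because the factors are already scaled for full-season use, the function is applied to season totals rather than home splits.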

There are more complicated ways of calculating park factors, like that used by baseball-reference to calculate runs park factors.  Most of the time, I think the approach I'm using works fine.  The cases I worry most about are those in the NL West, where you have the most extreme pitchers' park in the same division as two of the most extreme hitters' parks.  Thanks to the unbalanced schedule, my guess is that this causes SDP, COL, and ARI to look slightly more extreme than they really are.  I don't know how important this is.

Finally, as I mentioned above, these data are not regressed.  This means that you should be much more skeptical of the park factors for Minnesota's Target Field than, for example, those for Cincinnati's Great American Ball Park.  Patriot used what seem like pretty arbitrary (though reasonable) values he got from MGL to regress runs park factors.  You can apply those coefficients to my numbers above and get values that match his 2010 factors exactly (I've done this)--and feel free to do it, folks.  But I decided not to do that to these data.  In a future post, my plan is to look at year to year correlations (and hopefully intra-class correlation coefficients if I can get access to SAS again, or figure out how to do it in R) for each event.  This will hopefully provide more useful data to help us to understand how much to regress each component.  I'm sure that some park factors (triples, for example) are more volatile from year to year than others (PA/G, maybe?), and so different coefficients would be needed.
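
If you do want to regress, the mechanics are simple: keep some fraction of each factor's deviation from 1.00, with a smaller fraction for parks with fewer years of data. The sketch below shows only the mechanics. The weights in it are placeholders I made up for illustration, not the coefficients from Patriot's post, and `regress_to_neutral` is a hypothetical name.

```python
def regress_to_neutral(park_factor: float, weight: float) -> float:
    """Shrink a park factor toward 1.00 by keeping `weight` of its deviation."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return 1.0 + weight * (park_factor - 1.0)


# Hypothetical weights by years of data; NOT the coefficients from Patriot's post.
HYPOTHETICAL_WEIGHTS = {1: 0.5, 5: 0.9}

# HR/BIP factors from the table: MIN has 1 year of data, CIN has 5.
print(round(regress_to_neutral(0.83, HYPOTHETICAL_WEIGHTS[1]), 3))
print(round(regress_to_neutral(1.11, HYPOTHETICAL_WEIGHTS[5]), 3))
```

Once year-to-year correlations are available for each component, those could replace the single placeholder weight per sample size used here.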

But for now, that's a wrap!  Hope you enjoy them and find them useful.

Comments

Why the BB's and K's per PA?

If all else is equal, should I really expect Miguel Cabrera’s K’s to increase by going to Houston? All of these are influenced to a degree by the players, obviously, but I feel like K’s and BB’s are pretty “park-neutral.” Is this assumption wrong or misguided?

Just glancing at the 2010 stats, if you break them down by division, it’s as follows in the AL:

AL East: 18.6% K, 8.6% BB
AL Central: 16.9%, 8.%
AL West: 17.7%, 8.5%

So the Central’s pitchers (these are just the pitching stats, by the way) struck out almost two percent fewer hitters – substantial considering AL Central teams combined to face 31041 hitters vs. the AL East teams’ sum of 30899.

So AL Central teams faced 142 more hitters and managed to strike out 486 fewer.

Even subjectively, the AL East has better pitching (and teams) than the Central, so I’m not sure I understand why you’re including K’s and BB’s. Does Patriot cover it in his article? I haven’t read it but you guys must know something I don’t considering both you and Statcorner cover it.

My Michigan State (and Big Ten) Baseball Blog.

Like music? See what I'm listening to at my Last.fm account.

by Mike Rogers on Jan 5, 2011 11:19 PM EST

Responses

The alternative to BB/PA is to calculate it as BB/G. Since PA/G fluctuates with the run environment, if you used BB/G and K/G you could have a hitters’ park showing more K’s and BB’s than average, simply because there were more PA’s. Therefore, doing it as BB/PA and K/PA should help control for these confounds.

I’m not the first to report substantial park effects for walks and strikeouts. Gassko showed it a while back here:
http://www.hardballtimes.com/main/article/batted-balls-and-park-effects/
with some discussion of the phenomena.

I don’t know why some parks would be better for strikeouts or walks than others. Maybe the batter’s eye is better at some parks than others. Maybe the humidity and temperature of the air affect how balls break enough that it matters. But it’s a sizable effect, and seems somewhat consistent in where parks group out (Gassko’s data was 2003-2007, mine is 2006-2010, so admittedly there’s some overlap, but also some differences).

Now, one thing I definitely do not do here is regress to the mean. I’m not sure how volatile these data are—that’s really my next step with this stuff. Gassko said he did regress (though he gave no details), though, so the effect seems to hold up. I’d need to know that before I try to start saying what some player might do if he moved from one park to another—especially across leagues, where you’d encounter league quality differences!!

As for your stuff about K% and BB% in the different divisions…I don’t think much can be said about it. You’ve got both different parks and different players in those comparisons. The beauty of the park factor is that you’re (ideally) getting an apples to apples comparison of how the same players do in one park vs. in other parks.
-j

by JinAZ on Jan 5, 2011 11:45 PM EST

Shoot, that should have been a response to Mike.

Also, Patriot didn’t cover K and BB Park factors, but he did discuss using G’s vs. PA’s vs. other numbers in a section titled “what should be in the denominator?” I sort of followed that logic in posting these data.
-j

by JinAZ on Jan 5, 2011 11:47 PM EST

I guess I don't find it terribly useful.

Maybe I’m just grossly underestimating it. Like I said, you and Statcorner and apparently Gassko all do it.

While the batter’s eye and other things can affect it (as they can for defense; I remember Curtis Granderson saying early on that Comerica was one of the toughest parks in baseball to play CF in for some reason), I guess I don’t see K’s and BB’s as being grossly bothered by the park.

This, by the way, was why I was so interested in your component park factors article. I was wondering if you used K’s and BB’s like I’ve seen elsewhere. I think it’s worth a look at how much stadiums change over the years in K’s/BB’s.

by Mike Rogers on Jan 6, 2011 2:01 AM EST

I don't really see what your objection is

…except that it seems less intuitive than something like a home run park factor. The data show there is an effect (historically, at least), and it seems to be consistent enough that Gassko and I came up with very similar rankings for best/worst parks for K’s and BB’s (with a two-year overlap in our 5-year samples). Gassko’s data, at least, are regressed to account for volatility in the data, and there still was an effect on par with the size of the effect I report here.

As I said, the next step is to look at year to year correlations and such so we can get an idea of how consistent these effects are. Initially, my impression is that K and BB park effects are as consistent as most other factors.
-j

by JinAZ on Jan 6, 2011 8:34 AM EST

The way I look at it...

is that a park’s other, more prominent effects trickle down to BBs and SOs. If putting the ball in play is more valuable to a hitter, he’ll hack more. Or a pitcher might throw him fewer strikes, for example.

by Sky Kalkman on Jan 6, 2011 9:32 AM EST

The use is finally for doing projections

as Patriot points out in the linked article: you want component park factors (properly regressed, of course, so these are a “component” of that process) for estimating skills; you want run factors for estimating value.

Making watching baseball as fun as doing your taxes.
My Twitter feed.

by Matt Klaassen on Jan 6, 2011 4:00 PM EST

So if the player behavior idea is true, we'd expect negative correlations between R/G and K/PA or BB/PA

Pearson correlations between these components and R/G

Component Correlation Significance
PA/G 0.53 #
SO/PA -0.46 #
BB/PA -0.27
1B/BIP 0.32
2B/BIP 0.58 ##
3B/BIP 0.17
HR/BIP 0.57 ##
SB/G -0.24
CS/G 0.17
TB/BIP 0.90 ###
BIP/PA 0.46 #
The #’s indicate significance: # = P < 0.05, ## = P < 0.001, ### = P < 0.0001
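
For anyone who wants to check one of these, here is a small self-contained sketch that computes the Pearson correlation between the R/G and SO/PA columns, typed in from the table in the post. The `pearson` helper is a plain textbook implementation, not necessarily what produced the numbers above, and because the table values are rounded to two decimals the result may differ a bit from the reported figure.

```python
from math import sqrt

# R/G and SO/PA columns from the article's table, in table order (COL ... SDP).
R_PER_G = [1.09, 1.06, 1.06, 1.04, 1.04, 1.03, 1.03, 1.03, 1.03, 1.02,
           1.01, 1.01, 1.01, 1.00, 1.00, 1.00, 0.99, 0.99, 0.99, 0.98,
           0.98, 0.98, 0.98, 0.98, 0.97, 0.97, 0.96, 0.96, 0.96, 0.91]
SO_PER_PA = [0.95, 0.97, 1.01, 1.01, 0.98, 0.99, 0.99, 1.01, 0.97, 0.96,
             1.01, 0.97, 1.05, 0.98, 0.98, 1.03, 0.99, 0.96, 1.03, 1.01,
             1.02, 0.96, 1.03, 1.02, 0.98, 1.01, 0.99, 1.01, 1.04, 1.05]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


print(f"r(R/G, SO/PA) = {pearson(R_PER_G, SO_PER_PA):.2f}")
```

The same helper works for any other column once its values are typed in the same team order.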

Yes for strikeouts. But then again, the alternative explanation is that the park causing fewer strikeouts results in higher runs per game. So I guess this isn’t really a strong test of the idea.

Walks are not significantly correlated, but the effect is negative. Perhaps what’s going on there is that players are “trying to” walk less in high run environments, but there are also park effects pushing that number up. Or something.

I still think the important question is whether there are consistent per-PA effects on walk and strikeout rates. I’m supposed to be working today, but I’ll try to tackle that if I can finish my work stuff.
-j

by JinAZ on Jan 6, 2011 10:34 AM EST

Great post, Justin

Love this stuff… have wanted to do it myself for a while. Like a lot of other stuff, that will probably never happen.

What is your source of data? Retrosheet?

by Matt Klaassen on Jan 6, 2011 4:01 PM EST

Thanks!

I just used b-ref splits home vs away. :)

by JinAZ on Jan 6, 2011 6:03 PM EST via mobile

LHB and RHB

Josh, in Patriot’s post he argues that we don’t have to, nay shouldn’t, calculate different values for left- and right-handed hitters in order to park-adjust a value stat.

This seems completely incorrect to me, as A) in asymmetrical parks (like Fenway) lefties and righties are essentially hitting in two different parks, and B) there is such a thing as a switch hitter. What is your take on this matter?

Blogger and Editor, Rational Pastime Blog. Twitter: @RationalPastime.

by J-Doug on Jan 6, 2011 4:10 PM EST

Reason

For value, he’s saying that you’re trying to measure value of one player vs other players. The point of the park factor in that case is to just adjust for the standard effect of the park across all players, so hitters don’t as a group get an advantage over others if they play in a hitters park. If one handedness is better suited to a particular park than another, that player has real value over other players because of his handedness.

This is a case where measuring value is different from measuring talent. For talent (which is what projections try to get at), you want to use component park factors like these…or probably those that are broken down by handedness.
-Justin

by JinAZ on Jan 6, 2011 6:11 PM EST via mobile

Well put.

My example is usually Boggs in Fenway. Sure, maybe he took advantage of the Monster more than the average player. But that had value to the Red Sox. They won more games because of him. For a value stat, I think you could skip the park factor applied to production and just use a different runs-to-wins conversion (which would be based on the runs park factor).

Once you start wondering how different players would have performed or will perform in other parks, then you want component factors, getting as granular as possible. Like Justin said, that’s a talent thing, not a value thing (well, it could be a “projecting value” thing, I suppose.)

by Sky Kalkman on Jan 6, 2011 9:14 PM EST

This is so frustrating to have to explain over and over again

although maybe if I explained it more clearly people wouldn’t get confused. Or maybe they just shouldn’t listen to my nonsense. Which they don’t. So I’ll stop typing now.

…but the interplay between what a player is projected to do in different parks and then what the value of those different projected lines would be is a fascinating subject.

by Matt Klaassen on Jan 6, 2011 10:20 PM EST

Comments For This Post Are Closed

