Beyond the Box Score: An SB Nation Community


What factors have an effect on runs scored at MLB Parks? Part 2

Question: What factors have an effect on runs scored at MLB parks?

Why I asked the question: I was trying to find out why Chase Field has had such a high Park Factor over the years, and the research expanded to all stadiums from there.

Analysis: I did some research on the effects at all MLB and MiLB stadiums (link to spreadsheet). After it was published, someone pointed out that Chase Field had a high Park Factor that elevation and temperature could not fully explain. Unable to tell whether the cause was the lower humidity (as some suggested) or the park size, I decided to run a multiple regression of the average park factors over the last 3 years (2006 to 2008) against elevation, average temperature, average humidity, park dimensions (RF, RC, CF, LC, LF), average wall height, errors per game, wind direction, surface type and foul territory area.

Explanation of source of data:

Park Factors – I originally used runs scored per game because the only Park Factors I could find were rounded to whole numbers. After my original article ran, I was flooded with many different sets of data, and I am now using Patriot's numbers from his website (http://gosu02.tripod.com/id103.html).

Elevation (ft) – Collected from the List of Major League Baseball stadiums on Wikipedia. Elevation is a major factor in determining the distance a ball travels and the time the defense has to react to the ball once it is in play.

Temperature (degrees F) – Collected from the Retrosheet database for the last 3 years, except that only 1 year's worth of data is available for Washington. The higher the temperature, the farther a ball will travel.

Relative Humidity (percentage) – Average values from April to September from the websites BBC.com and CityRating.com. Humidity is not supposed to have much of an effect on the distance a ball travels, but maybe small differences in humidity will explain some of the differences in scoring.

Dimensions (ft) – Taken from Wikipedia. Only 5 distances were used: LF, LC, CF, RC and RF. I originally used the total area of the ballpark, but I found that of two stadiums with about the same area, the one shaped like #1 below had the higher park factor:

Stadium #1 360-380-400-380-360

Stadium #2 380-380-380-380-380
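To illustrate why total area alone cannot separate these two shapes, here is a rough sketch that approximates fair territory as four triangles fanned out from home plate. It assumes the five distances are measured at evenly spaced angles between the foul lines, which is my simplification and not something taken from the source data; the function name is made up for illustration.

```python
import math

def approx_fair_area(distances_ft):
    """Approximate fair-territory area (sq ft) from LF, LC, CF, RC, RF distances.

    Assumption: the fences are measured at evenly spaced angles across the
    90 degrees between the foul lines. Real parks are not this regular.
    """
    step = math.radians(90.0 / (len(distances_ft) - 1))
    area = 0.0
    for near, far in zip(distances_ft, distances_ft[1:]):
        # Triangle with two sides meeting at home plate, separated by `step`.
        area += 0.5 * near * far * math.sin(step)
    return area

stadium_1 = [360, 380, 400, 380, 360]
stadium_2 = [380, 380, 380, 380, 380]
area_1 = approx_fair_area(stadium_1)
area_2 = approx_fair_area(stadium_2)
print(round(area_1), round(area_2))
```

Under this approximation the two layouts come out with the same area, which is why the regression uses the five distances separately instead of one area number.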

Park Foul Area (ft squared) – Areas were calculated by Mitchel Lichtman. The larger the foul area, the more foul balls will be caught, and therefore the fewer runs scored.

Wind strength and direction (mph) – I used Retrosheet data from the last 3 years (1 year for Washington). The Retrosheet data comes in the form of 8 different directions. From these directions, I created the following matrix:

Wind Direction    X component    Y component
To LF             -0.71          0.71
To CF             0              1
To RF             0.71           0.71
LF to RF          1              0
From LF           0.71           -0.71
From CF           0              -1
From RF           -0.71          -0.71
RF to LF          -1             0

I multiplied the X and Y values by the wind speed, added up all the wind values for each component and then divided by the number of games. A positive Y component is a wind blowing out to CF, while a positive X component is a wind blowing toward RF.
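The averaging step can be sketched as follows. The direction labels are the ones from the matrix above; the game records are made up for illustration and are not real Retrosheet rows, and treating a game with no recorded direction as zero wind is my assumption.

```python
# Unit vectors from the matrix above: X is positive toward RF, Y toward CF.
WIND_VECTORS = {
    "To LF": (-0.71, 0.71),
    "To CF": (0.0, 1.0),
    "To RF": (0.71, 0.71),
    "LF to RF": (1.0, 0.0),
    "From LF": (0.71, -0.71),
    "From CF": (0.0, -1.0),
    "From RF": (-0.71, -0.71),
    "RF to LF": (-1.0, 0.0),
}

def average_wind_components(games):
    """Return the mean (x, y) wind components over (direction, mph) pairs.

    Assumption: unknown directions add zero wind but still count as a game.
    """
    total_x = total_y = 0.0
    for direction, speed_mph in games:
        unit_x, unit_y = WIND_VECTORS.get(direction, (0.0, 0.0))
        total_x += unit_x * speed_mph
        total_y += unit_y * speed_mph
    count = len(games)
    if count == 0:
        return (0.0, 0.0)
    return (total_x / count, total_y / count)

# Hypothetical sample of four games.
sample_games = [("To CF", 10), ("From CF", 4), ("LF to RF", 6), ("To LF", 10)]
avg_x, avg_y = average_wind_components(sample_games)
print(avg_x, avg_y)
```

A positive average Y means the wind blew out more than in over the sample, and a negative average X means it leaned toward left field.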

Question: When collecting this data, I found there was no wind blowing in from right field, and I thought I had made a mistake somewhere. I searched the games database for that wind direction, and the most recent case was in 2003. Has the wind not blown in from RF for even 1 game in over 5 years? Is there some unspoken rule that the scorers don't mark it this way?

Opponents' Errors per game – I was looking for a way to measure how tough it is to play in a stadium (e.g. fly balls in the Metrodome). The best metric I could come up with is the average number of errors the opposing team commits per game.

Playing Surface – The three stadiums with turf were given a value of 1 and the rest 0. Because all three use the new FieldTurf, I wondered if runs scored might go down because batted balls would be slower than on "AstroTurf" and take fewer weird bounces.

Average Wall Height (ft) – Averaged the values from ballparks.com.

Now it is time for a few graphs that show the data collected. *

Note: Data on the Washington Nationals is only from 2008 since they just moved into a new park this last year.

The initial data for each of the major league parks is located in the following spreadsheet, which can be downloaded:

FactorsForParkFactors

I ran a regression analysis on the data to get an equation that uses the preceding data. The regression equation ended up having an R-squared of 0.714, and the standard deviation of the difference between the initial Park Factor and the final Park Factor was 0.0178. There were two problems with that initial equation:

  1. The coefficient for wind blowing to CF was negative, meaning the more the wind was blowing out, the less scoring there would be. That defies all logic, so I threw both wind components out for the next round of analysis.

  2. The coefficient for Wall Height was positive, meaning the higher the wall, the more runs are scored. A higher wall should turn some home runs into doubles, and home runs score more runs than doubles, so I decided to remove Wall Height as well.
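For readers who want to see what the fitting step involves, here is a minimal sketch of a multiple regression solved through the normal equations in plain Python. It is not the tool used for the article; the function name, the two-variable setup and the sample rows are all hypothetical, and the targets are generated from assumed coefficients so the fit can be checked.

```python
def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of feature lists. An intercept column of 1.0 is appended,
    so the returned coefficient list ends with the intercept.
    """
    design = [list(row) + [1.0] for row in rows]
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(x[i] * x[j] for x in design) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Hypothetical (elevation_ft, avg_temperature_f) rows, not real park data.
sample_rows = [(10, 60), (600, 75), (1100, 82), (5200, 65), (20, 70), (800, 68)]
# Targets built from assumed coefficients so the fit has a known answer.
sample_targets = [0.000021 * e + 0.00077 * t + 0.94 for e, t in sample_rows]

elevation_coef, temperature_coef, intercept = fit_least_squares(sample_rows, sample_targets)
print(elevation_coef, temperature_coef, intercept)
```

Dropping a variable, as was done with wind and wall height, amounts to removing its column from the rows and fitting again; the remaining coefficients will generally shift when that happens.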

After removing Wall Height and Wind and rerunning the regression, I got the following equation, with a standard deviation of 0.0184 and an R-squared of 0.692:

Park Factors = Away Teams Errors per game * (0.016) + % Relative Humidity * (-0.0012) + Foul Area * (-0.00000061) + Elevation * (0.000021) + Average Temperature * (0.00077) + Left Field * (-0.0010) + Left Center Field * (-0.00063) + Center Field * (-0.0010) + Right Center Field * (-0.00020) + Right Field * (0.0011) + 0.0090 (if Surface is turf) + 1.7056
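As a sketch of how the equation is applied, the snippet below plugs one park's inputs into the published coefficients. The coefficients are copied from the equation above as rounded; the dictionary keys and the example park are hypothetical names and values chosen for illustration, not rows from the spreadsheet.

```python
# Coefficients copied from the final equation (rounded as published).
COEFFICIENTS = {
    "away_errors_per_game": 0.016,
    "relative_humidity_pct": -0.0012,
    "foul_area_sqft": -0.00000061,
    "elevation_ft": 0.000021,
    "avg_temperature_f": 0.00077,
    "lf_ft": -0.0010,
    "lc_ft": -0.00063,
    "cf_ft": -0.0010,
    "rc_ft": -0.00020,
    "rf_ft": 0.0011,
    "is_turf": 0.0090,  # 1 for turf, 0 for grass
}
INTERCEPT = 1.7056

def predict_park_factor(park):
    """Sum coefficient * value over every input, plus the constant."""
    return INTERCEPT + sum(COEFFICIENTS[name] * park[name] for name in COEFFICIENTS)

# Hypothetical park: illustrative inputs only.
example_park = {
    "away_errors_per_game": 0.6,
    "relative_humidity_pct": 65,
    "foul_area_sqft": 25000,
    "elevation_ft": 500,
    "avg_temperature_f": 72,
    "lf_ft": 330,
    "lc_ft": 375,
    "cf_ft": 400,
    "rc_ft": 375,
    "rf_ft": 330,
    "is_turf": 0,
}
predicted = predict_park_factor(example_park)
print(round(predicted, 4))
```

Because the coefficients are rounded, results from this sketch can differ from the spreadsheet values in the last decimal or two.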

Here is a simple chart of the factors, for easy comparison of how much each one affects the park factor and the run scoring environment.

The effect each factor has on Park Factors and Runs Scored (9.54 runs per game was the average scored by both teams over the past 3 years):

Factor                                Change in Park Factor    Change in Runs Scored per game (9.54 runs per game)
10 degree F increase                  0.0077                   0.073
Increase in RH by 10%                 -0.0120                  -0.115
10,000 sq ft increase in foul area    -0.0061                  -0.058
Surface is Turf                       0.0090                   0.085
1000 ft increase in elevation         0.0206                   0.196
1 error per game for away team        0.0160                   0.150
10 ft increase in LF                  -0.0100                  -0.095
10 ft increase in LC                  -0.0063                  -0.060
10 ft increase in CF                  -0.0101                  -0.096
10 ft increase in RC                  -0.0020                  -0.019
10 ft increase in RF                  0.0106                   0.101

As can be seen, each factor can noticeably affect the runs scored.
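Each row of the chart is a coefficient times a step size, then scaled by the 9.54 runs per game average. A minimal sketch of that arithmetic, using the rounded coefficients from the equation (the function name is mine):

```python
LEAGUE_RUNS_PER_GAME = 9.54  # both teams combined, 2006 to 2008, per the article

def factor_effect(coefficient, step):
    """Return (change in park factor, change in runs per game) for one step."""
    pf_change = coefficient * step
    return pf_change, pf_change * LEAGUE_RUNS_PER_GAME

# 1000 ft of elevation, using the rounded coefficient 0.000021.
elevation_pf, elevation_runs = factor_effect(0.000021, 1000)
# 10 degrees F of temperature, using the rounded coefficient 0.00077.
temperature_pf, temperature_runs = factor_effect(0.00077, 10)
print(round(elevation_pf, 4), round(elevation_runs, 3))
print(round(temperature_pf, 4), round(temperature_runs, 3))
```

The chart was evidently built from unrounded coefficients, so a few rows (elevation, for example) differ from this sketch in the third decimal place.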

The following table shows the original and final numbers for each of the ballparks. I have also added a column of combined stadium attributes (Dimensions, Foul Area and Surface Type) added to the equation's constant value, to help show which stadium designs lead to more runs.

(Dim+Const = dimensions, foul area and surface type plus the constant; Diff = Original minus Projected. The first four number columns are park factors, the last four are runs per game.)

                                                                 Park factor                                 Runs per game
Team                           Park                              Original  Dim+Const  Projected       Diff   Original  Dim+Const  Projected       Diff
Arizona Diamondbacks           Chase Field                         1.0505     0.9930     1.0642    -0.0136    10.0214     9.4725    10.1512    -0.1298
Atlanta Braves                 Turner Field                        0.9934     0.9862     1.0090    -0.0157     9.4758     9.4076     9.6254    -0.1495
Baltimore Orioles              Oriole Park at Camden Yards         0.9946     0.9794     0.9875     0.0070     9.4874     9.3425     9.4202     0.0672
Boston Red Sox                 Fenway Park                         1.0313     1.0256     1.0183     0.0130     9.8382     9.7836     9.7137     0.1245
Chicago Cubs                   Wrigley Field                       1.0257     1.0059     1.0127     0.0130     9.7841     9.5951     9.6603     0.1238
Chicago White Sox              U.S. Cellular Field                 1.0285     0.9912     0.9958     0.0328     9.8115     9.4557     9.4987     0.3128
Cincinnati Reds                Great American Ball Park            1.0150     0.9986     1.0096     0.0054     9.6825     9.5258     9.6307     0.0518
Cleveland Indians              Progressive Field                   0.9819     1.0028     1.0087    -0.0268     9.3667     9.5659     9.6227    -0.2560
Colorado Rockies               Coors Field                         1.1066     0.9729     1.1056     0.0010    10.5559     9.2808    10.5461     0.0098
Detroit Tigers                 Comerica Park                       0.9838     0.9589     0.9702     0.0136     9.3846     9.1474     9.2551     0.1295
Florida Marlins                Dolphin Stadium                     0.9667     0.9863     0.9859    -0.0192     9.2217     9.4083     9.4045    -0.1828
Houston Astros                 Minute Maid Park                    1.0006     0.9983     0.9881     0.0125     9.5452     9.5230     9.4258     0.1194
Kansas City Royals             Kauffman Stadium                    1.0033     0.9874     1.0001     0.0033     9.5710     9.4195     9.5399     0.0312
Los Angeles Angels of Anaheim  Angel Stadium of Anaheim            0.9774     0.9960     0.9860    -0.0086     9.3238     9.5010     9.4059    -0.0821
Los Angeles Dodgers            Dodger Stadium                      0.9751     1.0065     1.0030    -0.0278     9.3021     9.6016     9.5677    -0.2655
Milwaukee Brewers              Miller Park                         1.0031     0.9947     0.9963     0.0068     9.5688     9.4882     9.5039     0.0649
Minnesota Twins                Hubert H. Humphrey Metrodome        0.9971     0.9839     0.9981    -0.0011     9.5111     9.3856     9.5214    -0.0103
New York Mets                  Shea Stadium                        0.9754     0.9900     0.9896    -0.0142     9.3046     9.4437     9.4396    -0.1350
New York Yankees               Yankee Stadium                      0.9879     0.9700     0.9704     0.0176     9.4243     9.2535     9.2568     0.1675
Oakland Athletics              Oakland-Alameda County Coliseum     0.9836     0.9982     0.9840    -0.0004     9.3831     9.5223     9.3865    -0.0034
Philadelphia Phillies          Citizens Bank Park                  1.0273     1.0128     1.0167     0.0105     9.7994     9.6613     9.6989     0.1006
Pittsburgh Pirates             PNC Park                            0.9913     0.9866     1.0026    -0.0113     9.4561     9.4113     9.5641    -0.1080
St. Louis Cardinals            Busch Stadium                       0.9816     0.9834     0.9951    -0.0135     9.3637     9.3811     9.4929    -0.1292
San Diego Padres               PETCO Park                          0.9248     0.9724     0.9539    -0.0291     8.8215     9.2756     9.0992    -0.2777
San Francisco Giants           AT&T Park                           0.9977     0.9894     0.9769     0.0208     9.5172     9.4380     9.3189     0.1983
Seattle Mariners               Safeco Field                        0.9632     0.9962     0.9925    -0.0293     9.1884     9.5027     9.4680    -0.2797
Tampa Bay Rays                 Tropicana Field                     0.9878     1.0118     1.0063    -0.0186     9.4225     9.6519     9.5997    -0.1772
Texas Rangers                  Rangers Ballpark in Arlington       1.0497     0.9946     1.0096     0.0401    10.0137     9.4877     9.6312     0.3825
Toronto Blue Jays              Rogers Centre                       1.0220     1.0022     1.0023     0.0197     9.7490     9.5604     9.5615     0.1875
Washington Nationals           Nationals Park                      1.0070     0.9928     0.9949     0.0120     9.6058     9.4707     9.4910     0.1148

The regression equation is able to predict some stadiums' run production quite well. Here is a list of stadiums where the regression predicted the Park Factor within 0.01:

Angel Stadium of Anaheim    -0.0086

Hubert H. Humphrey Metrodome    -0.0011

Oakland-Alameda County Coliseum    -0.0004

Coors Field    0.0010

Kauffman Stadium    0.0033

Great American Ball Park    0.0054

Miller Park    0.0068

Oriole Park at Camden Yards    0.0070

I grouped the parks whose difference exceeded the standard deviation of 0.0184. These are the stadiums where the factors I am using can't explain the runs scored:

Seattle Mariners    -0.0293

San Diego Padres    -0.0291

Los Angeles Dodgers    -0.0278

Cleveland Indians    -0.0268

Florida Marlins    -0.0192

Tampa Bay Rays    -0.0186

Toronto Blue Jays    0.0197

San Francisco Giants    0.0208

Chicago White Sox    0.0328

Texas Rangers    0.0401
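The fit statistics and both groupings can be checked against the table with a short script. This is a sketch that uses the rounded Original and Difference park factor columns from the table; the variable names are mine, and R-squared is computed as 1 minus the residual sum of squares over the total sum of squares.

```python
import statistics

# (park, original park factor, difference = original - projected), from the table.
PARKS = [
    ("Chase Field", 1.0505, -0.0136), ("Turner Field", 0.9934, -0.0157),
    ("Oriole Park at Camden Yards", 0.9946, 0.0070), ("Fenway Park", 1.0313, 0.0130),
    ("Wrigley Field", 1.0257, 0.0130), ("U.S. Cellular Field", 1.0285, 0.0328),
    ("Great American Ball Park", 1.0150, 0.0054), ("Progressive Field", 0.9819, -0.0268),
    ("Coors Field", 1.1066, 0.0010), ("Comerica Park", 0.9838, 0.0136),
    ("Dolphin Stadium", 0.9667, -0.0192), ("Minute Maid Park", 1.0006, 0.0125),
    ("Kauffman Stadium", 1.0033, 0.0033), ("Angel Stadium of Anaheim", 0.9774, -0.0086),
    ("Dodger Stadium", 0.9751, -0.0278), ("Miller Park", 1.0031, 0.0068),
    ("Hubert H. Humphrey Metrodome", 0.9971, -0.0011), ("Shea Stadium", 0.9754, -0.0142),
    ("Yankee Stadium", 0.9879, 0.0176), ("Oakland-Alameda County Coliseum", 0.9836, -0.0004),
    ("Citizens Bank Park", 1.0273, 0.0105), ("PNC Park", 0.9913, -0.0113),
    ("Busch Stadium", 0.9816, -0.0135), ("PETCO Park", 0.9248, -0.0291),
    ("AT&T Park", 0.9977, 0.0208), ("Safeco Field", 0.9632, -0.0293),
    ("Tropicana Field", 0.9878, -0.0186), ("Rangers Ballpark in Arlington", 1.0497, 0.0401),
    ("Rogers Centre", 1.0220, 0.0197), ("Nationals Park", 1.0070, 0.0120),
]

originals = [original for _, original, _ in PARKS]
differences = [diff for _, _, diff in PARKS]

# Sample standard deviation of the residuals (original minus projected).
residual_std = statistics.stdev(differences)

# R-squared = 1 - SS_residual / SS_total.
mean_original = statistics.mean(originals)
ss_total = sum((value - mean_original) ** 2 for value in originals)
ss_residual = sum(diff ** 2 for diff in differences)
r_squared = 1 - ss_residual / ss_total

well_predicted = [park for park, _, diff in PARKS if abs(diff) < 0.01]
poorly_predicted = [park for park, _, diff in PARKS if abs(diff) > 0.0184]

print(round(residual_std, 4), round(r_squared, 3))
print(len(well_predicted), len(poorly_predicted))
```

On the rounded table values, both statistics should land close to the quoted 0.0184 and 0.692. The 0.0184 threshold flags ten parks, including Dolphin Stadium (-0.0192) and Tropicana Field (-0.0186), which sit just outside it.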

Using the preceding data, we can analyze future parks. I will pick the Mets' new stadium, Citi Field. Most of the natural effects will be the same and the errors aren't known yet, but we can look at the dimensions and foul area to come to some conclusion.

Feature      Shea Stadium    Change in PF    Citi Field    Change in PF    Difference (Citi-Shea)
LF           338             -0.34           335           -0.33           0.010
LC           371             -0.24           379           -0.24           0.000
CF           410             -0.41           408           -0.41           0.000
RC           371             -0.07           383           -0.08           -0.010
RF           338             0.36            330           0.35            -0.010
Foul Area    25665           -0.02           20900         -0.01           -0.003
Total                                                                      -0.013

The new Mets stadium looks to allow fewer runs per game than the previous one. If the 9.54 runs per game environment is used, it would allow 0.12 fewer runs per game, or about 10 fewer runs over the 81 home games.
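Here is a sketch of the same comparison done directly from the rounded coefficients in the regression equation, along with the conversion from a park factor change to runs over a season. The function names are mine. Because the coefficients are rounded and the table takes differences of values already rounded to two decimals, the direct total comes out smaller than the table's -0.013, though with the same sign.

```python
# Rounded coefficients from the regression equation (per ft, or per sq ft for foul area).
COEFFICIENTS = {
    "LF": -0.0010, "LC": -0.00063, "CF": -0.0010,
    "RC": -0.00020, "RF": 0.0011, "Foul Area": -0.00000061,
}
SHEA = {"LF": 338, "LC": 371, "CF": 410, "RC": 371, "RF": 338, "Foul Area": 25665}
CITI = {"LF": 335, "LC": 379, "CF": 408, "RC": 383, "RF": 330, "Foul Area": 20900}

RUNS_PER_GAME = 9.54
HOME_GAMES = 81

def park_factor_change(old_park, new_park):
    """Sum of coefficient * (new - old) over the design features."""
    return sum(COEFFICIENTS[k] * (new_park[k] - old_park[k]) for k in COEFFICIENTS)

def season_runs_change(pf_change):
    """Convert a park factor change into runs over a full home schedule."""
    return pf_change * RUNS_PER_GAME * HOME_GAMES

direct_change = park_factor_change(SHEA, CITI)
article_runs = season_runs_change(-0.013)  # the table's total
direct_runs = season_runs_change(direct_change)
print(round(direct_change, 4), round(article_runs, 1), round(direct_runs, 1))
```

The table's total of -0.013 converts to about 0.12 runs per game and roughly 10 runs over 81 home games, matching the figures quoted above.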

I have had a lot of help putting this study together, and special thanks go to Mitchel Lichtman and Patriot for providing data and to Sky Kalkman for his many suggestions. I hope the data gives people more insight into the variables that go into a stadium and how much of an effect each one has on the run scoring environment.


Comments

There's a RFer Problem in MLB?

On a quick glance (it deserves more), the only number that has the right sign in the OF is RF (greater distance, greater runs scored). (Yes, I had this problem with the earlier data, but this revision lets us focus on the issue, since all the other signs are negative.)

Since it makes no sense that a size increase would decrease runs (OFer A can cover X sq. ft. in Y seconds. If the total area he needs to cover goes from Z to Z+10, more outs become hits, more singles become doubles, etc.), we have to conclude that LF and CF are played by people who have more ability than the positions generally require (they are suboptimally allocated, so they can still cover more space than they have to; let us take this as a good thing).

RF, otoh, has reached a talent peak and moved to the point where the sign is intuitively correct—RFers are playing at their peak and then some, and so cannot reach balls in that extra area.

(This is somewhat corroborated as well by the difference between LC and RC.)

Probably won’t make more mobile RFers the LHPs of the late Noughts, but does seem to indicate that a team with the option should move its second CFer to RF and see what happens.

by klhoughton on Jan 8, 2009 9:06 AM EST
