PITCHf/x is an amazing system. For all of the valuable data it provides, it is useful to keep in mind that the data we see and use contain inherent error, which stems from a number of sources.
First, there is measurement error. This includes inaccuracy in the camera registration in place when a given pitch is tracked, as well as error in extracting the center of the ball from the captured image. An additional point that makes perfect sense, but that we think about even less, is that measurement error in a given stadium is not constant throughout a season. Camera components may slowly sink or shift over time, or may be affected by climate conditions on game day. As I understand it, camera setups are typically recalibrated periodically throughout the season in each park, which can create discrete points in time where the measurement error changes. Mike Fast has shown that park effects are not constant throughout a season.
Another form of error is fitting error. From the set of positions measured during the flight of the pitch, a best-fit algorithm models the pitch in the measured space and then extrapolates to arrive at pitch characteristics such as the position where the pitch crosses home plate and the starting velocity at the initial location. The initial location is actually fixed at 50 feet from home plate, so it is not really a "release point"; the true release point would be further from home plate than this, even for Randy Johnson. The region of interest measured by the cameras is also not identical from park to park. This region is at least partly defined by the location of one of the high cameras in the stadium, which can be closer to home in some parks and closer to third in others. Expanding the region of interest would raise the ratio of measured space to extrapolated space, which I would expect to lead to more accurately reported values.
It is also commonly accepted that while the MLBAM pitch type classifications are always improving and very good, there are cases where pitches may be binned in the wrong group.
For all of the reasons stated above and more, what I'm about to present are very simple pitch velocity park effects witnessed in the 2013 season around the league. The method used to derive a park effect for a particular team X is simply to average the velocities of all pitches classified as a type of fastball by PITCHf/x for both teams in home games played by team X and then subtract the average velocity of all pitches classified as a form of fastball for both teams in away games played by team X.
Note that this simple method also does not control for pitchers, so individual pitchers included in this rollup will have thrown more fastballs at home than away, or vice versa.
Without further ado, here are the average park effects for fastballs thrown in the 2013 season (for all games up to August 20):
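The home-minus-away calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this analysis: the record layout (`home_team`, `away_team`, `pitch_type`, `mph`), the team labels, and the set of pitch-type codes treated as fastballs are all assumptions made for the example.

```python
from statistics import mean

# Assumed set of pitch-type codes counted as "a type of fastball";
# the exact grouping used for the table is not stated in the article.
FASTBALL_TYPES = {"FF", "FT", "FC", "SI"}


def park_effect(pitches, team):
    """Mean fastball mph (both teams) in `team`'s home games minus its away games."""
    fastballs = [p for p in pitches if p["pitch_type"] in FASTBALL_TYPES]
    home = [p["mph"] for p in fastballs if p["home_team"] == team]
    away = [p["mph"] for p in fastballs if p["away_team"] == team]
    return mean(home) - mean(away)


# Tiny made-up sample: two games involving a hypothetical team "AAA".
sample_pitches = [
    {"home_team": "AAA", "away_team": "BBB", "pitch_type": "FF", "mph": 93.0},
    {"home_team": "AAA", "away_team": "BBB", "pitch_type": "SI", "mph": 92.0},
    {"home_team": "AAA", "away_team": "BBB", "pitch_type": "CU", "mph": 78.0},
    {"home_team": "CCC", "away_team": "AAA", "pitch_type": "FF", "mph": 92.0},
    {"home_team": "CCC", "away_team": "AAA", "pitch_type": "FT", "mph": 91.0},
]

effect = park_effect(sample_pitches, "AAA")
print(f"AAA park effect: {effect:+.2f} mph")
```

Pitches from both teams count toward each average, and the curveball in the sample is ignored because its code is not in the assumed fastball set.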
Team | Average PITCHf/x Fastball Park Effect (mph) |
---|---|
St. Louis | 0.71 |
Atlanta | 0.58 |
Toronto | 0.58 |
Chicago (AL) | 0.54 |
Los Angeles (AL) | 0.47 |
Cleveland | 0.43 |
Washington | 0.41 |
Seattle | 0.35 |
Miami | 0.27 |
Philadelphia | 0.26 |
Los Angeles (NL) | 0.23 |
Arizona | 0.18 |
Tampa Bay | 0.15 |
New York (AL) | 0.15 |
Kansas City | 0.09 |
Houston | 0.01 |
Pittsburgh | -0.03 |
Milwaukee | -0.13 |
Texas | -0.15 |
New York (NL) | -0.24 |
Oakland | -0.26 |
Boston | -0.32 |
Cincinnati | -0.38 |
Minnesota | -0.39 |
Chicago (NL) | -0.39 |
San Diego | -0.45 |
Baltimore | -0.52 |
Colorado | -0.64 |
San Francisco | -0.64 |
Detroit | -0.89 |
PITCHf/x pitch velocity park effects, 2013 (all games up to August 20)
The results show a little more than a mile and a half per hour of difference between the fastest park, St. Louis (0.71 mph), and the slowest park, Detroit (-0.89 mph). After several seasons of leading the pack as the fastest stadium, Kansas City shows up close to level this season in this study.
As opposed to just accepting the results above, it would be an interesting exercise to consider why these park effects may be visible. After all, I have talked about a number of error sources, but are there other real factors that can influence reported velocity?
Consider temperature for a moment. Using the same technique as above to calculate the average difference in temperature between home and away games, I can see that exactly half the league has played in warmer games at home, while the other half has played in cooler games at home. Ten of the eleven slowest parks in the velocity list above (from the Mets down, with Baltimore being the outlier) are locations where the home games were cooler on average. In fact, San Francisco and Detroit showed the largest and second-largest negative temperature deltas at home, at -10.7F and -7.3F, respectively. Texas might as well have its home and away uniforms made out of different materials for comfort, as the team has played in temperatures averaging 15.0F warmer at home than on the road this year.
From Alan Nathan's wonderful Pitch Trajectory Calculator, it is possible to see that all other things equal, a pitch will travel faster in warmer air than cooler air. The physics behind this is that as temperature drops, air density rises, creating more drag on the ball to slow it down. The PITCHf/x system would capture this effect inherently in its pitch measurements.
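To get a feel for the size of the density change, here is a rough sketch using the ideal gas law for dry air. The choice of 60F and 80F as comparison temperatures, standard sea-level pressure, and the neglect of humidity are all assumptions for illustration; this is not a reproduction of the Pitch Trajectory Calculator.

```python
R_DRY_AIR = 287.05        # specific gas constant of dry air, J/(kg*K)
PRESSURE_PA = 101_325.0   # standard sea-level pressure; an assumed value


def fahrenheit_to_kelvin(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


def air_density(temp_f, pressure_pa=PRESSURE_PA):
    """Dry-air density in kg/m^3 from the ideal gas law, rho = P / (R * T)."""
    return pressure_pa / (R_DRY_AIR * fahrenheit_to_kelvin(temp_f))


cool, warm = air_density(60.0), air_density(80.0)
print(f"60F: {cool:.3f} kg/m^3, 80F: {warm:.3f} kg/m^3")
print(f"cooler air is {100.0 * (cool / warm - 1.0):.1f}% denser")
```

Under these assumptions the cooler air is a few percent denser, and since drag force scales linearly with air density at a given speed, the drag on the ball rises by about the same proportion.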
It is tempting to conclude from these observations that temperature is playing a role in these velocity differences around the league. The problem is that, while this is a nice narrative, it is not obvious why temperature should really affect these velocity measurements.
Remember that the velocities reported are meant to be initial velocities. If what is reported truly is the initial velocity of the pitched ball, the ball will not yet have traveled through the denser air that leads to this effect. It is known, however, that what is provided is not actually the release velocity: it is reported at a distance of 50 ft from home plate, which is not a true representation of where each pitcher releases the ball. The reported start velocity is also derived from a nine-parameter fitting model, which assumes that the acceleration of the ball is constant throughout its path to the plate. This assumption simplifies the algorithm and, while it works very well, it is still a simplification of the truth.
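Written out, a constant-acceleration model of this kind has three parameters per axis (initial position, initial velocity, and acceleration), which is where the count of nine comes from. The symbols below are illustrative notation, not taken from the PITCHf/x documentation:

```latex
\begin{aligned}
x(t) &= x_0 + v_{x0}\,t + \tfrac{1}{2}\,a_x\,t^2 \\
y(t) &= y_0 + v_{y0}\,t + \tfrac{1}{2}\,a_y\,t^2 \\
z(t) &= z_0 + v_{z0}\,t + \tfrac{1}{2}\,a_z\,t^2
\end{aligned}
```

In reality, drag grows roughly with the square of the ball's speed, so the deceleration is strongest early in flight and eases as the ball slows. A single constant acceleration per axis can only approximate that.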
Alan Nathan and Peter Jensen, both subject matter experts, discussed this topic in an earlier and absolutely fascinating discussion on The Book Blog. In one comment, Alan Nathan describes the effect of using a constant acceleration model as systematically underestimating the initial velocity of pitches. In another comment, he gives an example of the order of magnitude of this underestimation, which is quite small (less than 0.1 mph).
It seems to me that if two pitches were actually released at the exact same velocity, one at 80F and one at 60F, the pitch thrown in the cooler temperature would decelerate more quickly, so the average acceleration used in the nine-parameter fitting model would be lower (that is, a stronger deceleration). If that is true, then its extrapolated initial velocity would be underestimated to a greater degree, meaning that it would be reported with a lower velocity than the pitch released at the same speed in warmer weather. Of course, the magnitude of this difference would likely be even smaller than the 0.1 mph figure above, making this a fairly negligible factor at the level of these park effects.
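The reasoning above can be checked with a toy one-dimensional model. This is a sketch under loud assumptions, not PITCHf/x's actual algorithm: drag is the only force (no gravity or spin), the drag coefficient is a constant 0.35, air is dry at sea-level pressure, and the ball is sampled at 60 frames per second for 0.4 seconds starting at the point where the initial speed is reported. A quadratic (constant-acceleration) least-squares fit to the sampled distances then stands in for the fitting step.

```python
import math

MPH_PER_MPS = 2.23694
BALL_MASS_KG = 0.145      # regulation baseball, about 5.1 oz
BALL_RADIUS_M = 0.0369    # from a circumference of roughly 9.1 in
DRAG_COEFF = 0.35         # assumed constant drag coefficient
R_DRY_AIR = 287.05        # specific gas constant of dry air, J/(kg*K)
PRESSURE_PA = 101_325.0   # assumed sea-level pressure


def drag_constant(temp_f):
    """K in dv/dt = -K * v^2, using dry-air density from the ideal gas law."""
    temp_k = (temp_f - 32.0) * 5.0 / 9.0 + 273.15
    rho = PRESSURE_PA / (R_DRY_AIR * temp_k)
    area = math.pi * BALL_RADIUS_M ** 2
    return 0.5 * rho * DRAG_COEFF * area / BALL_MASS_KG


def solve_linear(a, b):
    """Solve the small system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fitted_initial_speed(times, distances):
    """Least-squares fit of d(t) = d0 + v0*t + 0.5*a*t^2; returns v0."""
    basis = [(1.0, t, 0.5 * t * t) for t in times]
    normal = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    rhs = [sum(b[i] * d for b, d in zip(basis, distances)) for i in range(3)]
    return solve_linear(normal, rhs)[1]


def underestimate_mph(temp_f, release_mph=92.0, flight_s=0.4, fps=60):
    """True initial speed minus the constant-acceleration fit, in mph."""
    k = drag_constant(temp_f)
    v0 = release_mph / MPH_PER_MPS
    times = [i / fps for i in range(round(flight_s * fps) + 1)]
    # Exact 1-D solution of dv/dt = -k v^2: distance d(t) = ln(1 + k v0 t) / k.
    distances = [math.log(1.0 + k * v0 * t) / k for t in times]
    return (v0 - fitted_initial_speed(times, distances)) * MPH_PER_MPS


for temp in (80.0, 60.0):
    print(f"{temp:.0f}F: fit underestimates by {underestimate_mph(temp):.3f} mph")
```

In this toy setup the fit underestimates the true initial speed at both temperatures, and slightly more so in the cooler air, which matches the direction argued above. The absolute numbers depend heavily on the assumed drag coefficient and sampling window, so they should not be read as the real system's error.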
So while there is some minor underestimation of start velocity on pitches as a result of fitting error due to an assumed constant acceleration model, it doesn't seem like these effects alone should make much of a dent in these velocity park effect numbers. The same can be said for other environmental conditions that affect air density such as elevation, air pressure and humidity.
Another potential impact temperature could have is by directly affecting the calculation of the position of the ball at any given time from the multiple camera images. I have worked with sets of cameras in the past in another setting where such positions needed to be measured, and certainly maintaining fixed relative camera positions after they have been registered and aligned with one another to form a common frame of reference is a challenge as temperature changes. This would be a fascinating area of study, but this level of raw data is not provided with the current set of PITCHf/x data released to the public.
So is there something about pitching in warmer conditions that actually helps a pitcher throw the ball harder right out of his hand? Is the pitcher's arm speed very slightly lower when temperature falls and air density rises? This is one interesting question that I have but cannot answer. Athletes stretch and warm up to improve flexibility and to prepare for game action. It seems reasonable that it would be easier to keep the body warm and loose if the ambient temperature were doing some of the warming for the athlete. Regressing fastball velocity on game temperature shows only a minuscule correlation, albeit a positive one. The data presented here are not sufficient to support a claim of this sort or to refute it. Certainly more seasons would have to be analyzed before even confirming that this is a trend worth further examination.
Prevailing wind directions are another potential source of park effects. While wind speeds and directions are recorded for each game, the reliability of these numbers at field level is not guaranteed, as I believe the wind information is measured closer to the top of the stadium. For the same reasons discussed above, wind should be expected to affect reported velocities only very subtly as well.
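A regression of this kind can be sketched with ordinary least squares. The per-game granularity and the numbers below are made up for illustration (they are not the 2013 data), so only the mechanics carry over.

```python
def linear_fit(xs, ys):
    """Ordinary least squares of y on x; returns (slope, intercept, pearson_r)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / (sxx * syy) ** 0.5


# Made-up per-game values, for illustration only.
game_temp_f = [55.0, 62.0, 70.0, 78.0, 85.0, 91.0]
mean_fastball_mph = [91.6, 92.1, 91.8, 92.0, 91.7, 92.3]

slope, intercept, r = linear_fit(game_temp_f, mean_fastball_mph)
print(f"slope = {slope:.4f} mph per degree F, r = {r:.2f}, r^2 = {r * r:.2f}")
```

A small positive slope with a low r squared is the pattern described above: the sign points in the expected direction, but temperature explains very little of the game-to-game variation.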
For most practical purposes, velocity differences of the magnitude discussed here are not relevant. When you watch your favorite pitcher throw a pitch recorded at 91.1mph, you probably don't care too much if it really was 91.3mph. In fact, you probably only saw on the screen that he threw 91mph anyway.
Where I could see correcting for fitting error and any other systematic park-induced effects becoming necessary is at the game-by-game level, for example when looking at velocity or release point variation for potential injury detection or prevention. Using an admittedly still simple technique, correcting PITCHf/x data within each game by a weighted average of all pitcher data from their away games over the course of a season, I have identified games in the past where reported values differed from expected norms by an abnormally large amount. These are the sorts of discrepancies that must be corrected before attempting fun things like injury detection or prevention.
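One possible reading of that per-game correction is sketched below. The details are assumptions, since the exact weighting is not spelled out here: each pitcher's in-game mean fastball velocity is compared with his season mean from away games, and the differences are averaged with pitch counts as weights. The function name and data layout are hypothetical.

```python
def game_velocity_offset(game_fastballs, away_baselines):
    """Pitch-count-weighted mean of (in-game mph - pitcher's away-game mph).

    game_fastballs: {pitcher: (n_fastballs, mean_mph_in_this_game)}
    away_baselines: {pitcher: season mean fastball mph in away games}
    Returns None when no pitcher in the game has a baseline.
    """
    weighted_sum = 0.0
    total_pitches = 0
    for pitcher, (count, mph) in game_fastballs.items():
        if pitcher not in away_baselines:
            continue  # no baseline, so this pitcher cannot inform the offset
        weighted_sum += count * (mph - away_baselines[pitcher])
        total_pitches += count
    return weighted_sum / total_pitches if total_pitches else None


# Hypothetical game: both pitchers read about 1 mph "hot" versus their norms.
game = {"starter": (60, 93.2), "reliever": (20, 96.0)}
baselines = {"starter": 92.2, "reliever": 95.0}

offset = game_velocity_offset(game, baselines)
corrected = {name: mph - offset for name, (_, mph) in game.items()}
print(f"game offset: {offset:+.2f} mph")
```

A game where every pitcher reads high or low by a similar amount points to a measurement issue in that game rather than a change in any one pitcher, which is what makes the offset worth removing before looking for individual anomalies.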
Credit and thanks to Baseball Heat Maps for PITCHf/x data upon which this analysis was based.
You can follow me on Twitter at @MLBPlayerAnalys.