When we last left the topic of hitter volatility, I provided folks with a leader board based on the estimated Volatility of batters in 2011. This was on the heels of providing some initial findings around year-to-year correlations for this measure.
The idea behind year-to-year correlation was to determine whether or not Volatility is a "skill", meaning: do some hitters possess the ability to perform more consistently day-to-day than others? Those initial results were encouraging, showing an R of .60, or .52 if we restricted it to just those players with >=300 PA's.
But I needed more data to truly confirm that there was something to this. So I got it.
Sean Forman of Baseball-Reference was kind enough to lend me a hand. As a result, the sample for the analysis now includes 1228 cases. I also altered how Volatility is calculated (more on that below--big old Tip O' the Hat to Tom Tango for some guidance here as well).
The correlation of Volatility year-to-year is now only .23. The data included the 2005 through 2010 seasons and only included batters that had >=300 PA's in the current and prior years.
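For anyone who wants to try this on their own data, here is a rough Python sketch of how a year-to-year correlation like this could be set up. The data layout, the function names, and the sample rows are all made up for illustration; only the >=300 PA filter in both seasons comes from the description above.

```python
from math import sqrt

MIN_PA = 300  # threshold described in the post: >=300 PA in both seasons


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_to_year_pairs(seasons, min_pa=MIN_PA):
    """Pair each hitter-season with the same hitter's prior season.

    `seasons` is a hypothetical layout: {(player, year): (pa, volatility)}.
    A pair is kept only if both seasons meet the PA minimum.
    """
    pairs = []
    for (player, year), (pa, vol) in seasons.items():
        prior = seasons.get((player, year - 1))
        if prior is None:
            continue  # no prior season on record for this hitter
        prior_pa, prior_vol = prior
        if pa >= min_pa and prior_pa >= min_pa:
            pairs.append((prior_vol, vol))
    return pairs


# Made-up rows, purely to show the shape of the input
seasons = {
    ("Hitter A", 2009): (500, 0.30), ("Hitter A", 2010): (520, 0.32),
    ("Hitter B", 2009): (450, 0.28), ("Hitter B", 2010): (250, 0.40),  # dropped: <300 PA
    ("Hitter C", 2009): (600, 0.35), ("Hitter C", 2010): (610, 0.33),
    ("Hitter D", 2009): (400, 0.31), ("Hitter D", 2010): (410, 0.34),
}

pairs = year_to_year_pairs(seasons)
prior_vols = [p for p, _ in pairs]
current_vols = [c for _, c in pairs]
r = pearson_r(prior_vols, current_vols)
```

Each qualifying hitter contributes one case per pair of consecutive seasons, so a hitter with several qualifying years shows up more than once.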
For some perspective, you typically want to see something with an R of .70 (R² of .49) in order to get real excited. The .60 and .52 were encouraging before, but at .23 I'd say there isn't much to it. We're actually below pitcher ERA territory here.
Part of this is gaining access to more data, but part of it is altering how Volatility is derived.
This time around I took some inspiration from Seth Samuels who published on this topic (really, streakiness, but it was all similar work) at the same time that I published Part I. Seth didn't just look at 10-game averages of wOBA, but looked at the difference between game wOBA's and the player's seasonal wOBA. The more I thought about this the more I liked the approach. For one, what I was doing was taking an average of an average and that's just not advisable mathematically.
So now Volatility is calculated by taking the standard deviation of the difference between a hitter's wOBA in each game and their seasonal wOBA. After some discussion with Tango I decided not to normalize Volatility in the way I did previously where the concern was higher wOBA hitters were coming out as more volatile. The correlation between seasonal wOBA and coefficient of variation in this data set is .14, whereas it is even lower for standard deviation (.05).
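Here is a minimal Python sketch of the calculation as described above. The function names and the game-level wOBAs are made up for illustration. Using the population rather than the sample standard deviation is an assumption, as is dividing by seasonal wOBA for the coefficient of variation.

```python
from statistics import pstdev


def volatility(game_wobas, season_woba):
    """Standard deviation of (game wOBA - seasonal wOBA) across a hitter's games."""
    diffs = [g - season_woba for g in game_wobas]
    # Assumption: population standard deviation; the post does not say which.
    # Note: season_woba is one constant per hitter-season, so subtracting it
    # shifts every value equally and leaves the spread unchanged.
    return pstdev(diffs)


def coefficient_of_variation(game_wobas, season_woba):
    """Standard deviation divided by the mean.

    Assumption: seasonal wOBA is used as the mean in the denominator.
    """
    return volatility(game_wobas, season_woba) / season_woba


# Made-up game-level wOBAs for one hypothetical hitter
games = [0.000, 0.450, 0.890, 0.225, 0.300]
season = 0.340  # hypothetical seasonal wOBA

print(round(volatility(games, season), 3))
print(round(coefficient_of_variation(games, season), 3))
```

Two hitters with the same seasonal wOBA can land far apart here: one who posts something near his average most nights scores low, while one who alternates big games with 0-fers scores high.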
So here is the actual Volatility leader board for 2011 (through August 23rd):
Michael Bourn has been the least volatile hitter so far this year. This means that his performance on any given day varies less (positively or negatively) from his overall average performance.
Gordon Beckham has been the most volatile. Not only does he sport a poor .284 wOBA, but with Beckham you see, on average, greater variation from his average on any given day.
Another finding so far is that the correlation between plate appearances and Volatility is only .08. This is interesting in that we would expect Volatility to increase as players come to the plate less frequently. However, across the data set the effect seems non-existent.
I hope folks are enjoying this series and finding it somewhat useful. I plan on re-running the analysis regarding what "causes" Volatility. Even if hitters aren't inherently more or less volatile, there may be certain attributes that contribute to one's in-season Volatility (e.g. a spike in K%, decrease in BB%, etc.), and that would be useful to know.
Additionally, even if Volatility in year one is not a good predictor of Volatility in year two, it's still interesting to know how consistent hitters are relative to their peers in a season. There are lots of metrics we use to measure a player's worth that aren't consistent year-to-year.
I plan on also tackling starting pitchers next, so stay tuned.