Using Plate Discipline to Estimate Walk and Strikeout Rate - Corrected

For a while, I have wondered whether pitch data could be used to estimate a player's walk and strikeout rates. Plate discipline data are available for each player as the percentage of pitches swung at, and the percentage of swings that make contact, both inside and outside the strike zone (O-Swing%, Z-Swing%, O-Contact%, Z-Contact%). Using multiple regression, I took these four variables (outside-zone swing rate, in-zone swing rate, outside-zone contact rate, in-zone contact rate) and regressed strikeout and walk percentages on them.
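
For readers who want to try the same kind of fit, here is a minimal sketch of multiple regression by ordinary least squares in plain Python. The post does not say which software or exact method was used, so the function name `fit_linear` and the normal-equations approach are assumptions for illustration only.

```python
def fit_linear(rows, targets):
    """Ordinary least squares: return [intercept, b1, ..., bk].

    rows    -- one list of predictor values per player (hypothetical layout)
    targets -- the observed rate (e.g. strikeout rate) for each player
    """
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])

    # Build the normal equations (X^T X) b = X^T y as an augmented matrix.
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    # Each row now has a single non-zero coefficient on the diagonal.
    return [A[i][k] / A[i][i] for i in range(k)]
```

Each row would hold one hitter's four rates and `targets` his strikeout or walk rate. The first returned value is the constant term, and the rest are the coefficients in the order the predictors were given.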

For this first run, I looked at all the qualified hitters (500 PAs) from 2009. For strikeout percentage, I got an r-squared of 0.89 and a standard deviation of 2.0% on the difference between the projected and final values. For walk percentage, I ended up with an r-squared of 0.63 and a standard deviation of 2.0% on the same difference.
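
The two quality measures quoted here can be computed from the actual and projected rates alone. The sketch below is one common way to do it, assuming r-squared is defined as 1 minus the ratio of residual to total sum of squares and that the standard deviation is the sample (n - 1) version. The post does not say which definitions were used, and `fit_quality` is a hypothetical name.

```python
import statistics


def fit_quality(actual, projected):
    """Return (r_squared, residual_sd) for paired actual/projected rates."""
    # Residual = projected minus actual, matching "Estimated - Actual".
    residuals = [p - a for a, p in zip(actual, projected)]

    mean_actual = statistics.fmean(actual)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)

    r_squared = 1.0 - ss_res / ss_tot
    # Assumption: sample standard deviation (divides by n - 1).
    residual_sd = statistics.stdev(residuals)
    return r_squared, residual_sd
```

A higher r-squared with a similar residual spread, as seen for strikeouts versus walks, means the four rates account for more of the variation between hitters.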


Looking through this dataset, I saw that some players had an actual walk rate much higher than projected, by 6% to 8%. These players were all great hitters (Fielder, Pujols, A. Gonzalez), and it dawned on me that intentional walks (IBB) were included in the walk rate and I needed to factor them in. I added a fifth variable to the walk calculation, IBB/PA, and re-ran the regression. The results were much better, with an r-squared of 0.79 and a standard deviation of 0.15%. The largest difference dropped from 8% to 4%. Here are the equations for estimating walk and strikeout rate:

SO% = ((-0.0407 * O-Swing%) + (-0.2417 * Z-Swing%) + (-0.2429 * O-Contact%) + (-0.8765 * Z-Contact%) + 1.2885) * 100%

BB% = ((-0.4134 * O-Swing%) + (-0.0328 * Z-Swing%) + (0.0216 * O-Contact%) + (-0.2595 * Z-Contact%) + (1.7203 * IBB per PA) + 0.4217) * 100%
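
The two equations translate directly into code. The sketch below assumes every input is entered as a fraction (0.25 for 25%), which is what the trailing "* 100%" suggests but the post does not state outright. The function names and the sample hitter are made up for illustration.

```python
def estimate_so_rate(o_swing, z_swing, o_contact, z_contact):
    """Estimated strikeout rate as a fraction (0.20 means 20%)."""
    return (-0.0407 * o_swing
            - 0.2417 * z_swing
            - 0.2429 * o_contact
            - 0.8765 * z_contact
            + 1.2885)


def estimate_bb_rate(o_swing, z_swing, o_contact, z_contact, ibb_per_pa):
    """Estimated walk rate as a fraction, including the IBB/PA term."""
    return (-0.4134 * o_swing
            - 0.0328 * z_swing
            + 0.0216 * o_contact
            - 0.2595 * z_contact
            + 1.7203 * ibb_per_pa
            + 0.4217)


# Hypothetical hitter (not real data): swings at 25% of pitches outside
# the zone and 65% inside it, makes contact on 60% and 88% of those
# swings, and is intentionally walked in 1% of plate appearances.
so = estimate_so_rate(0.25, 0.65, 0.60, 0.88)
bb = estimate_bb_rate(0.25, 0.65, 0.60, 0.88, 0.01)
print(f"SO% = {so * 100:.1f}%, BB% = {bb * 100:.1f}%")
# prints "SO% = 20.4%, BB% = 9.9%"
```

The largest coefficient in the strikeout equation sits on Z-Contact%, and the largest plate discipline coefficient in the walk equation sits on O-Swing%, so those are the inputs that move each estimate the most.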

Using these equations, here are the players whose 2009 rates deviated the most from the estimates and who could be due for a correction in 2010:

Name                2010 Team  2009 Walk Rate  2009 Estimated Walk Rate  Estimated – Actual
Ichiro Suzuki       Mariners   4.7%            8.4%                      3.7%
B.J. Upton          Rays       9.1%            11.9%                     2.8%
Franklin Gutierrez  Mariners   7.3%            9.9%                      2.6%
Jason Kubel         Twins      9.7%            12.0%                     2.3%

Ben Zobrist         Rays       15.2%           11.8%                     -3.4%
Nick Swisher        Yankees    16.0%           12.5%                     -3.5%
Kosuke Fukudome     Cubs       15.4%           11.6%                     -3.8%
Nick Johnson        Yankees    17.2%           13.3%                     -3.9%

Name              2010 Team  2009 Strikeout Rate  2009 Estimated Strikeout Rate  Estimated – Actual
Brian Roberts     Orioles    17.7%                12.2%                          -5.5%
David Wright      Mets       26.2%                20.7%                          -5.5%
Alfonso Soriano   Cubs       24.7%                20.3%                          -4.4%
Kevin Youkilis    Red Sox    25.5%                21.2%                          -4.3%

Yadier Molina     Cardinals  8.1%                 12.2%                          4.1%
Hunter Pence      Astros     18.6%                23.0%                          4.4%
Brandon Phillips  Reds       12.8%                17.2%                          4.4%
Yunel Escobar     Braves     11.7%                16.5%                          4.8%

I like the initial results, and I plan to add a few more years' worth of data to get a better equation. I can see this formula being used to check whether changes in a player's walk and strikeout rates are due to changes in plate discipline or are just noise in the data.
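
One possible way to put that idea into practice is to compare how much a hitter's actual rate moved from one year to the next with how much his estimated rate moved. The sketch below is only an assumed reading of the idea, not a method from the post, and `split_rate_change` is a hypothetical helper.

```python
def split_rate_change(actual_prev, actual_now, est_prev, est_now):
    """Split a year-over-year change in a rate into two parts.

    explained   -- change in the plate-discipline estimate
    unexplained -- part of the actual change the estimate did not match
    """
    total = actual_now - actual_prev
    explained = est_now - est_prev
    unexplained = total - explained
    return explained, unexplained
```

With made-up numbers, a hitter whose strikeout rate rose from 18.0% to 22.0% while his estimate rose from 19.0% to 22.0% would show 3.0 points explained by plate discipline and 1.0 point unexplained. An unexplained part smaller than the roughly 2.0% standard deviation reported for the 2009 fit would be hard to tell apart from noise.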