My last post asked whether a batter is more effective after he fouls off a pitch than when he swings and misses. I examined the question looking back at the plate appearance and assuming each pitch was independent of the previous pitches.
This doesn't appear to be a valid assumption. If David Ortiz rips a ball just foul over the Monster, a pitcher is going to be a little more careful than if he swung over a pitch in the dirt. It seems likely that the pitch selection is determined by the batter's reaction to the previous pitches*. To counter that, I wanted to follow the at-bat pitch by pitch and see how the matchup evolved.
* I apologize for using the Pozterisk but I've got a few tangents to wind through here. Doesn't the game-theoretic nature of these matchups, and baseball in general, limit the usefulness of Markov chains, which assume that how you got to a state isn't important? I suppose over the long run you can assume things even out, but I wonder how true that is.
Speaking of tangents, I realized while running this study that the pitch sequences available from Retrosheet look like they're way off. In each year this decade, over 20% of at-bats had fewer pitches in the pitch sequence than the final count requires (if you have a 3-2 count, there are at least 6 pitches in the at-bat: the five it takes to reach 3-2, plus the one that ends it). Has anyone else noticed this? Any idea what happened?
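For anyone who wants to try the same sanity check, here is a minimal sketch of the idea, not the code I actually ran. It assumes each at-bat has already been reduced to a final count and a Retrosheet-style pitch string, and that the characters in NON_PITCH (pickoff throws, runner-going markers and the like) are annotations rather than pitches. Both the function names and that character set are my assumptions for illustration.

```python
# Characters assumed to be annotations in the pitch string rather than
# actual pitches (pickoff throws, "runner going", "no pitch", etc.).
# This set is an assumption; adjust it to match your data.
NON_PITCH = set("+*.>123N")


def pitches_thrown(sequence):
    """Count the characters in a pitch string that represent real pitches."""
    return sum(1 for ch in sequence if ch not in NON_PITCH)


def min_pitches(balls, strikes):
    """Fewest pitches an at-bat can contain if it ended on this count.

    It takes balls + strikes pitches to reach the count, plus the one
    pitch that ends the at-bat.
    """
    return balls + strikes + 1


def is_short(balls, strikes, sequence):
    """True when the pitch string has fewer pitches than the count requires."""
    return pitches_thrown(sequence) < min_pitches(balls, strikes)


# Made-up sample rows: (balls, strikes, pitch string).
sample = [
    (3, 2, "BCBFBX"),  # six pitches on a 3-2 count: consistent
    (3, 2, "BCX"),     # only three pitches on a 3-2 count: short
    (0, 1, "C>X"),     # ">" is not a pitch, so two pitches: consistent
    (1, 2, ""),        # no pitch string at all: short
]

short_fraction = sum(is_short(b, s, seq) for b, s, seq in sample) / len(sample)
```

Run over a full season, short_fraction is the number to compare against the 20% figure above. Note that the check only catches sequences that are too short; a sequence can be long enough and still be wrong.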
Hopefully I didn't bore too many of you with that little mental diversion and you're sticking around to see the results. Rather than a tabular form, I thought I'd punch it up a little bit. Unfortunately, I can't figure out how to embed the results for 2007 directly in the page, so you'll have to click through.
As a tease, and to further incent you to actually visit the page with the results, let me post a few of the images you'll see on that page.
I bet those got your attention. Click on the results, browse around for a bit, and then come back and we'll talk.
To clear up some confusion, plate appearances indicate the number of plate appearances that went down that path, while total plate appearances is the sum of plate appearances further down the tree. Also, you should note that I stop the tree after a foul ball with 2 strikes. So even though there may be multiple pitches after a 1-2 foul ball, you don't get to see any of them, just the rollup of the data. If anyone's interested in the full breakdown let me know and I can send it out.
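If the two counts are still confusing, here is a rough sketch of how a tree like this could be built and rolled up. It is an illustration under my own assumptions, not the code behind the results page: pitches are single letters (B ball, C called strike, S swinging strike, F foul, X ball in play), ended_here counts the plate appearances whose recorded path stops at a node, and total() adds in everything further down the tree. The names Node, truncate and add are hypothetical.

```python
from dataclasses import dataclass, field

STRIKE_PITCHES = set("CSF")  # called strike, swinging strike, foul


@dataclass
class Node:
    ended_here: int = 0  # plate appearances whose path stops at this node
    children: dict = field(default_factory=dict)

    def total(self):
        """Plate appearances at this node plus everything further down."""
        return self.ended_here + sum(c.total() for c in self.children.values())


def truncate(sequence):
    """Cut the path off right after the first foul ball hit with 2 strikes."""
    strikes = 0
    path = []
    for pitch in sequence:
        path.append(pitch)
        if pitch == "F" and strikes == 2:
            break  # everything after this foul is rolled up into this node
        if pitch in STRIKE_PITCHES and strikes < 2:
            strikes += 1
    return path


def add(root, sequence):
    """Walk (and create) the path for one plate appearance."""
    node = root
    for pitch in truncate(sequence):
        node = node.children.setdefault(pitch, Node())
    node.ended_here += 1


# Made-up plate appearances, just to show the rollup.
root = Node()
for seq in ["CSFBX", "CSFX", "CX", "BX"]:
    add(root, seq)

after_called_strike = root.children["C"].total()
```

In this toy data, "CSFBX" and "CSFX" both land on the same C-S-F node, because the tree stops at the two-strike foul and whatever happened afterward is only visible as part of that node's count.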
I don't have a lot to add to the actual data. As they say, a picture is worth a thousand words, so I'll pretty much let it speak for itself. Swinging and missing at the first pitch is substantially worse than either having the pitch called a strike or fouling it off. That pattern seems to hold throughout, which backs up the results I found looking backwards at the at-bat.
Of course the outcome of these studies doesn't explain why a foul is better for the batter than swinging and missing, although we can guess. My feeling is that in general a foul indicates the batter is pretty much on the pitch and just missed by a little. To me, that suggests he's likely to make good contact later in the at-bat.
I think there are a lot of possibilities with this sort of analysis, including adding pitch type and location to the study to try to better figure out how the batter/pitcher mental matchup plays out. One of the main anti-sabermetric rants is that we overlook the personal and treat players like machines. Looking deeper into the game theory aspects is a way of addressing that sometimes very real concern and opening up a new realm of analysis.