Which tells us more: The last 7 at bats or 7 at bats against this pitcher?
Suppose that a batter is 5-for-7 over the course of his last seven plate appearances. Pretty good two-game stretch, eh? Now, suppose that another player has a career 5-for-7 mark against the day's starter, going back a few years. (You could substitute "the reliever who was just brought in" if you like.) Which is the more meaningful, the last seven PA's chronologically, or those seven PA's against the pitcher?
The proper answer to this is "Who cares? Seven plate appearances isn't enough of a sample to tell us much of anything." Let's leave that aside for a moment. Let's enter the magical world inhabited by magical gnomes, Joe Morgan, and more MLB managers than I care to mention. Let's pretend that you can actually draw serious inferences about future performance from seven plate app... I almost got through that with a straight face.
The bigger question is which is the better source of information: performance in the recent past against a different set of pitchers or performance against this specific pitcher over an array of years, given an equal number of plate appearances in each case. I suppose it's an empirical question and one that I haven't really seen a study of directly. I've seen a few things that lead me to believe that neither is a particularly good predictor, and so the "winner" between those two would be the winner of the "tallest grasshopper" award.
I don't care so much about which really is the better predictor right now. What amazes me is that managers, I presume in the absence of systematic study, have very clearly made up their minds. Consider how often you see managers make strategic personnel decisions (whether pinch hitting or moving someone around in a lineup or starting a guy who's normally on the bench) because "he has had success against Smith in the past." It's rarer to see a manager do something like move a player up in the lineup because he's had a couple of good days in a row. (It does happen... just not as much.) Managers seem to value pitcher vs. hitter matchup stats. Now, if the reason that our batter had such great success against Smith is that Smith is a right-handed fastball guy and our batter eats those guys for breakfast, then that's fine. Of course, the problem is that over 7 PA's, anyone can get lucky against anyone, and to interpret that as anything meaningful is just silly.
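To put a rough number on "anyone can get lucky": if we pretend each plate appearance is an independent trial with a fixed chance of a hit (a simplification, and the hit probabilities below are made up for illustration), the binomial distribution gives the chance of at least five hits in seven tries. A minimal sketch:

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): at least k hits in n independent PA."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed true hit probabilities per PA; illustrative values, not measurements.
for p in (0.250, 0.300, 0.350):
    print(f"p = {p:.3f}: P(at least 5 hits in 7 PA) = {prob_at_least(5, 7, p):.4f}")
```

For the assumed .300 hitter that comes out a bit under 3 percent. That sounds small, but with so many batter-pitcher pairings in a league, some 5-for-7 lines should turn up on luck alone.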
But if it's just a matter of not understanding small sample sizes, we should see a great deal more re-shuffling of lineups based on recent performance. Managers seem to know, whether explicitly or intuitively, that the last seven PA's don't tell you too much about a player and so there's no need to change your strategy based on that. Why is it that managers understand small sample sizes in one situation, but not in another? There has to be another force at work.
I think there's something special in the manager's mind when it's batter vs. pitcher, and I think it's something that's a systematic error in thought across a lot of what passes for baseball analysis in the public sphere, something I like to call the virtus fallacy. Virtus is a Latin word which is hard to translate properly. It is the immediate predecessor of the English word "virtue", but in Latin, there's an added dimension. Consider that the Latin word for man is vir. Virtus was a mixture of courage, power, and honor that a Roman male citizen strove for. It was a characteristic that was "demonstrated" by some noteworthy act, often athletic or military. It might even be best translated as "manliness." The fact that you did well showed that you had virtus and if you lost, well... you just weren't the virtus kind of guy.
Following this logic, if a batter goes 5-for-7 against a pitcher, it must be because the batter is more of a man than the pitcher. And while it might be silly to believe that a plate appearance from five years ago when both pitcher and batter were five years younger in body and skill level would have a bearing on what's about to happen here, people generally assume that virtue is unchanging. If you were more virtuous five years ago, you're probably still the better man today. So, I should pinch hit you in this key situation. If a batter is 5-for-7 over his last few at bats, he probably did it against four different pitchers, so there's no chance for him to "prove" that he is more virtuous than any of those pitchers. He might be (and probably is) a good player or maybe he just had a good couple of days. But he's not proven that he's better than a specific other man.
So, the next time you see a manager base a key decision clearly on batter vs. pitcher matchup data (and make a hilarious strategic blunder), it might not be evidence that he is statistically challenged. Well, it is... just not in the way that you think. He's shown in the past by what he doesn't do that he doesn't make decisions based on micro sample sizes. It's just that his threshold for accepting a "big enough to be relevant" sample size goes way down when the information is framed as a measure of who is somehow a "better man."
Comments
Great stuff PC
And this virtus fallacy may get to the question a bit.
I think managers just feel like pitchers and hitters don’t change as they move along when it comes to what they throw. A batter who hits a guy a few years back hits that same guy now because that guy throws probably the same stuff and this hitter hits the same way. When evaluated in a broad sense, this would be ridiculous, but there is definitely something in managers that makes them think it’s not when it comes to pitcher-hitter matchups.
Marlin Maniac, a Florida Marlins blog
Come attend Intro to Sabermetrics 101!
Check me out at Beyond the Box Score as well.
by SFiercex4 on Oct 21, 2009 3:54 PM EDT reply actions 0 recs
Great Stuff
Tony La Russa lives off of these small sample sizes, at least according to what was written in “3 Nights in August”.
I have also wondered how much of the small sample size bias is from the media (player is 5-for-7 so far in the series), or the manager knowing more than we do (A-Rod’s hemorrhoids are finally under control).
Jeff Zimmerman - Protecting the world from RBI's and Wins from my mom's guest house.
by Jeff Zimmerman (TucsonRoyal) on Oct 21, 2009 4:12 PM EDT reply actions 0 recs
Any Cardinals fan should read "3 Nights in August".
Tony has some good ideas, the problem is they are based on small sample sizes instead of true/projected numbers.
by Jeff Zimmerman (TucsonRoyal) on Oct 22, 2009 10:13 AM EDT up reply actions 0 recs
He's great at producing hypotheses
just not so great at actually testing them systematically.
Linda's in the cold ground, won't see her anymore
Somewhere out on the highway tonight, the drunken engines roar
It's just one of those things, one of those things
-- Al Stewart, "Accident on 3rd St."
In memory of Nick Adenhart and all victims of drunk driving
by PaulThomas on Oct 22, 2009 3:56 PM EDT up reply actions 0 recs
I did enjoy that book
However, that was before I got into Sabermetrics. I shudder to think about how I would feel about it now.
by vivaelpujols on Oct 22, 2009 4:25 PM EDT up reply actions 0 recs
interesting post
there is certainly a culture of bravado that is unlike any other sport. the whole eye-for-an-eye, tit for tat when batters are hit, the steroids culture are other examples where this notion of virtus seems to permeate.
"Accept the mystery."
by Andronicus on Oct 21, 2009 6:17 PM EDT reply actions 0 recs
I had this very argument not too long ago on LSB.
After the Rockies/Phillies NLDS series. Someone suggested that a smaller sample size against one pitcher was actually not a big deal because (I’m paraphrasing) the more specific set of data had lower variance.
by jwiscarson on Oct 21, 2009 9:27 PM EDT reply actions 0 recs
that was me
The discussion was civil and thought provoking, I enjoyed it.
After giving this some more thought, my question is this, say a particular batter has great success (in a sample of say 50 plate appearances) against a particular LHP, but struggles overall against LHP (hundreds of plate appearances). Which data is more reliable for predictive purposes, the smaller set of very specific data, or the much larger, but also much more generalized data set?
My thought was that segregating all the available data into LH/RH splits ignores all of the fine scale variables inherent in a particular individual matchup. The whole reason that we look at split stats is because we can take a subset of the entire data pool that is more predictive for a matchup, right? Well, if we can ignore the larger set of data because we feel that a batter’s success against a particular pitcher is more likely to be predicted by his success against the subset of pitchers that throw with a particular hand, rather than all pitchers, why is it ridiculous to pare the data down further into a subset that captures only the data related to a specific batter vs. pitcher matchup?
Isn’t it plausible that something about Joe Beimel’s delivery or assortment of pitches allows Ryan Howard to have a specific advantage against him compared to the entire population of LHP? Stated another way, wouldn’t the best data regarding Ryan Howard’s potential for success against Joe Beimel be a sample that only includes his results against pitchers that throw exactly like Joe Beimel?
What sample size would be required for a batter vs. pitcher matchup to be reliable?
"You can probably stick a fork in the Rangers' playoff chances for 2009." - AJM on 7/26 with the team 4.5 games out
by tricer on Oct 27, 2009 8:33 PM EDT up reply actions 0 recs
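One hedged way to frame the trade-off in the question above is to blend the two data sets rather than pick one: treat the large sample against all lefties as a baseline and let the specific matchup pull the estimate away from it in proportion to its size. The sketch below uses a simple shrinkage formula (the posterior mean of a beta-binomial model). That is a technique swapped in for illustration, not something anyone in the thread proposes, and every number in it, especially the weight `PRIOR_PA`, is hypothetical.

```python
def shrunk_estimate(hits, pa, baseline, prior_pa):
    """Blend an observed rate with a baseline that counts like `prior_pa` extra PA."""
    return (hits + baseline * prior_pa) / (pa + prior_pa)

# All numbers below are hypothetical, chosen only to illustrate the trade-off.
BASELINE_VS_LHP = 0.230  # hitter's rate over hundreds of PA against all lefties
PRIOR_PA = 300           # assumed weight on the baseline; not an established constant

for hits, pa in ((5, 7), (20, 50), (80, 200)):
    est = shrunk_estimate(hits, pa, BASELINE_VS_LHP, PRIOR_PA)
    print(f"{hits}-for-{pa} ({hits / pa:.3f} observed) -> blended estimate {est:.3f}")
```

Under these assumptions a 5-for-7 line barely moves the estimate, while the same hot rate over a couple of hundred PA moves it a lot. How heavily to weight the baseline is exactly the empirical question being asked, and the sketch does not answer it.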
ABs
I think the last seven at bats are much more important than the last couple of at bats against the pitcher. Especially if you haven’t faced the pitcher in a while. If a hitter is on a hot streak it doesn’t matter who is pitching….they are just seeing the ball and hitting the ball
by cnubsbl16 on Oct 22, 2009 10:21 AM EDT reply actions 0 recs
That assumes hot streaks exist and are more than just a product of random probability. So…are “hot streaks” real?
by Missing Barry on Oct 22, 2009 4:45 PM EDT up reply actions 0 recs
Yea
Yea hot streaks are real….ask any baseball player
by cnubsbl16 on Oct 22, 2009 7:10 PM EDT up reply actions 0 recs
The problem with this is...
generally speaking, baseball players aren’t statisticians. Baseball players are great athletes; this does not automatically make them great at understanding the math and physics behind the game.
by jwiscarson on Oct 22, 2009 7:28 PM EDT up reply actions 0 recs
Well, I think most of them are smart enough to realize when they are seeing the ball well and when they are slumping
by cnubsbl16 on Oct 22, 2009 7:35 PM EDT up reply actions 0 recs
And they also have no idea when they will stop seeing the ball well
by vivaelpujols on Oct 22, 2009 7:59 PM EDT up reply actions 0 recs
I never said they did…I just said they know when they are on a hot streak
by cnubsbl16 on Oct 22, 2009 8:02 PM EDT up reply actions 0 recs
Right, but just because a player has been on a hot streak
Doesn’t mean they will continue to be.
If a player goes 7-7, is he any more likely to get a hit in his next at bat than he would if he had gone 0-7? There is a lot of research that suggests that it really doesn’t make a difference
http://ussmariner.com/2007/08/20/projecting-future-performance/
by vivaelpujols on Oct 22, 2009 8:06 PM EDT up reply actions 0 recs
You guys are just thinking way too into this….If a guy is hot you play him….bottom line…..if a good hitter is 0 for his last 10 you don’t play him because he is due for a hit….baseball doesn’t work that way…there is no formula to predict hitting
by cnubsbl16 on Oct 22, 2009 8:24 PM EDT up reply actions 0 recs
Honestly, basketball players say the same thing. I don’t know if anyone has done this research on baseball, but in basketball, the evidence points strongly to the fact that “hot streaks” don’t exist at the NBA level. It’s simply random chance. Each shot is basically independent of shots before and after it. I suspect baseball is the same way.
by Missing Barry on Oct 22, 2009 9:53 PM EDT up reply actions 0 recs
I don’t know how you can say there is no such thing as a hot streak…I really don’t understand
by cnubsbl16 on Oct 22, 2009 11:25 PM EDT up reply actions 0 recs
There is such a thing as a hot streak
The notion that those are predictive is unsupportable.
by vivaelpujols on Oct 23, 2009 12:34 AM EDT up reply actions 0 recs
Take coin flipping.
You’re going to flip five heads in a row occasionally. Is that a hot streak? Sure. Have you done anything different to cause the hot streak? Nope. Does the fact that you’re on a hot streak mean you’re more likely to continue flipping heads than if you were coming off a HTHT streak? Nope.
People have taken the sequences of makes and misses in basketball and compared the distribution of those results to what we’d expect if outcomes were random, and players actually aren’t any more streaky than what we’d expect.
Streaks happen. They don’t mean a player’s doing anything differently or that the streak is going to continue.
Beyond the Boxscore Not a member? Sign up.
by Sky Kalkman on Oct 23, 2009 7:47 AM EDT up reply actions 0 recs
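The coin-flip point above is easy to check by simulation. The sketch below flips a fair coin many times, finds every run of five straight heads, and looks at what the next flip did. The seed, the number of flips, and the helper name `next_after_streak` are arbitrary choices for illustration, and a fair coin is of course only a stand-in for a hitter.

```python
import random

def next_after_streak(flips, streak_len):
    """Collect the outcome that follows every run of at least `streak_len` straight heads."""
    followers = []
    run = 0
    for i, flip in enumerate(flips[:-1]):
        run = run + 1 if flip else 0      # length of the current run of heads
        if run >= streak_len:
            followers.append(flips[i + 1])
    return followers

rng = random.Random(2009)  # fixed seed so the run is repeatable
flips = [rng.random() < 0.5 for _ in range(200_000)]  # True = heads

followers = next_after_streak(flips, 5)
print(f"overall heads rate:       {sum(flips) / len(flips):.3f}")
print(f"heads rate after 5 heads: {sum(followers) / len(followers):.3f}")
print(f"flips that followed a streak: {len(followers)}")
```

Both rates should land close to one half: the streaks are real, but they carry no information about the next flip. Whether at bats behave like this is the empirical question; the simulation only shows what "streaks without predictive value" looks like.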
Flipping a coin and hitting a baseball are in no way comparable. You can control how well you hit the ball, you can’t control the flip of a coin.
by cnubsbl16 on Oct 23, 2009 9:34 AM EDT up reply actions 0 recs
So why don't all players hit 1.000 every year?
by vivaelpujols on Oct 23, 2009 10:01 AM EDT up reply actions 0 recs
Yes, if they could actually control how well they hit, they would all be All-Stars
The real truth is that each player has a certain mean performance (say a .330 batting average for Pujols), and they will fluctuate randomly around that mean over the course of a season. Players can’t control when they get hot or cold, they just do.
by vivaelpujols on Oct 23, 2009 5:55 PM EDT up reply actions 0 recs
They're pretty comparable for our purposes.
You can’t control the precise location of every ball in play. This is the whole idea behind FIP — the difference between a groundout and a base hit is essentially due to a hitter starting his swing milliseconds apart. Combine the random variance of muscle control in the hitter with the same effect in a pitcher and wind in the ballpark, and you have what might as well be a coin flip.
by jwiscarson on Oct 23, 2009 11:15 AM EDT up reply actions 0 recs
Question
If I gave you the data of nine ‘coins’ that had a 352/1000 chance of being heads (a hit) flipped 639 times each and then put in Ichiro Suzuki’s actual season do you think you could tell the real baseball player from the coin flip?
by Graham on Oct 23, 2009 11:16 AM EDT up reply actions 0 recs
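The thought experiment in the question above can be run directly. This sketch generates nine "coin" seasons using the 352/1000 chance and the 639 trials quoted there; the seed and the function name `simulate_season` are arbitrary, and it only produces season totals, not game-by-game sequences.

```python
import random

def simulate_season(p_hit, at_bats, rng):
    """Number of hits in one simulated season of independent at bats."""
    return sum(rng.random() < p_hit for _ in range(at_bats))

rng = random.Random(51)       # arbitrary fixed seed
P_HIT, AT_BATS = 0.352, 639   # figures taken from the question above

for season in range(1, 10):
    hits = simulate_season(P_HIT, AT_BATS, rng)
    print(f"coin {season}: {hits}/{AT_BATS} = {hits / AT_BATS:.3f}")
```

With these inputs the standard deviation is about 12 hits, or roughly 19 points of batting average, so coins with an identical "true talent" will routinely post seasons that look like different hitters.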
The coin can't play defense, though.
by Sky Kalkman on Oct 23, 2009 11:24 AM EDT up reply actions 0 recs
Would it really be worse than
Adam Dunn?
by stevesommer05 on Oct 23, 2009 11:27 AM EDT up reply actions 0 recs
Serious question – so assuming “hot streaks” are nothing but random chance…can we assume PA’s behave essentially as Bernoulli variables? That would be very informative on what constitutes a small sample size, and what samples are actually big enough for us to learn something from…
by Missing Barry on Oct 23, 2009 2:31 PM EDT up reply actions 0 recs
Yes but...
Not every PA is in the same environment (opposing pitcher, ballpark, base-out state, etc.) so the mean will be changing each time. Also, although “hit or not hit” or “on base or not on base” are binary variables, the real outcome of a PA is not binary. That said, people do make use of the binomial distribution (the sum of independent Bernoulli trials) to model PAs. In creating wOBA, Tom Tango actually weighted the various results of a PA such that he could make use of the binomial distribution. I have no affiliation with him, but if this stuff interests you then you should read his book (aptly named “The Book”).
by mickeyg13 on Oct 26, 2009 11:11 AM EDT up reply actions 0 recs
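If we accept the binomial model as a rough approximation, with all the caveats just listed, the "how small is small" question has a simple form: the standard error of an observed rate over n independent trials is sqrt(p(1-p)/n). A minimal sketch, where the .270 true rate and the sample sizes are assumptions picked for illustration:

```python
from math import sqrt

def standard_error(p, n):
    """Standard deviation of an observed rate over n independent trials."""
    return sqrt(p * (1 - p) / n)

P_TRUE = 0.270  # assumed true hit rate, for illustration only

for n in (7, 50, 200, 600):
    se = standard_error(P_TRUE, n)
    print(f"n = {n:3d}: one standard error is +/- {se:.3f} around {P_TRUE:.3f}")
```

Because the error shrinks with the square root of n, going from 7 to 600 trials cuts it by a factor of a little over nine. This ignores the changing environment from PA to PA, so treat it as a floor on the uncertainty rather than a full accounting.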
Yeah I read some of that stuff, and I know where the assumptions are breaking down, I guess I’m just curious if it’s close enough to use as a rough approximation?
by Missing Barry on Oct 26, 2009 1:18 PM EDT up reply actions 0 recs
I think Tom Tango would like to have a word with you
by vivaelpujols on Oct 22, 2009 7:59 PM EDT up reply actions 0 recs
Actually, MGL would be more vocal.
by Jeff Zimmerman (TucsonRoyal) on Oct 22, 2009 9:22 PM EDT up reply actions 0 recs
I think it's less the virtus fallacy
And more a misapplication of conditional probability. That is, I think when a manager sees a guy who’s been 5-for-7 against a certain pitcher, it seems plausible that the probability of the guy getting a hit GIVEN that he’s facing that pitcher is higher, but if he’s just 5-for-7 his last seven at-bats, the idea that he has a higher probability of getting a hit given that he’s on a hot streak, or whatever, doesn’t seem as plausible, and is considered more likely to be random variation.
Clearly, both viewpoints are flawed, but I think that might be what’s going on. Another way of looking at it is, in the face of random variation, people always look for some sort of explanation, and the fact that a guy does well always against a specific pitcher is just too good for most people to pass up.
Brewers Baseball and other assorted nonsense (mostly the assorted nonsense) at my blog, What's a Tararrel?
by Lefti on Oct 22, 2009 11:15 AM EDT reply actions 0 recs
Another way of looking at it is, in the face of random variation, people always look for some sort of explanation
This has really started to bother me. I’m not sure why people feel the need to come up with an explanation for why every single little thing happened. Sometimes things just happen, for no other reason than random chance. There’s no explanation required. It seems to me trying to explain every little thing breeds more misinformation and bad knowledge than the possible good it might do…
On another note, I’d be inclined to say the 7 AB’s against the pitcher matter more. I’m assuming true “hot streaks” don’t exist in any meaningful way at the MLB level. While 7 AB’s against a pitcher don’t tell us enough to reach a conclusion, obviously, it at least raises the probability that the hitter somehow is a good matchup against the pitcher (righty-lefty, good fastball hitter vs. fastball pitcher or whatever else) a tiny bit…and a tiny bit is more than none, right?
by Missing Barry on Oct 22, 2009 4:50 PM EDT up reply actions 0 recs
Once in a while
Don’t expect it to be a regular thing. I’m “retired” after all.
by pizzacutter on Oct 25, 2009 11:38 PM EDT up reply actions 0 recs
5 words.
Mark Teixeira versus Bruce Chen.
by NoNameOnCard on Oct 22, 2009 9:49 PM EDT reply actions 0 recs
7 for 11.
1 2B, 6 HRs… .636/.667/2.364
by NoNameOnCard on Oct 22, 2009 9:50 PM EDT up reply actions 0 recs
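For anyone checking the arithmetic behind that slash line, here is a small sketch. The hits, double, and home runs come from the comment; the single walk is an assumption inferred from the quoted .667 on-base figure, and the OBP formula is simplified to ignore hit-by-pitches and sacrifice flies.

```python
def slash_line(ab, singles, doubles, triples, homers, walks=0):
    """Return (AVG, OBP, SLG) from basic counting stats."""
    hits = singles + doubles + triples + homers
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    avg = hits / ab
    obp = (hits + walks) / (ab + walks)  # simplified: ignores HBP and sac flies
    slg = total_bases / ab
    return avg, obp, slg

# 7-for-11 with 1 double and 6 home runs, per the comment above.
# walks=1 is an assumption inferred from the quoted OBP, not stated in the comment.
avg, obp, slg = slash_line(ab=11, singles=0, doubles=1, triples=0, homers=6, walks=1)
print(f"{avg:.3f}/{obp:.3f}/{slg:.3f}")
```

The slugging figure is 26 total bases over 11 at bats, which is how it ends up above 2.000.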
Oddly...
Of the 4 outs made by Teixeira against Chen, 3 were strikeouts.
by NoNameOnCard on Oct 22, 2009 9:51 PM EDT up reply actions 0 recs
comparing a coin flip to an at bat is a gross oversimplification
There is literally nothing that differentiates coin flip 1 compared to flip 2 or flip 999 or whatever. The data set of all coin flips is uniform. I do believe that there are substantial differences between a player’s at bat vs Johan Santana compared to an at bat vs Joe Beimel. The data set of a batter’s appearances versus all LHP is not uniform.
Favorite bumper sticker of all time, seen on a VW bus:
"Gas, Grass, or Ass. Nobody rides for free"
by tricer on Oct 28, 2009 12:02 AM EDT reply actions 0 recs
