Mark Bellhorn
salb918 is going to be a new writer on this site. I'll leave it up to him to reveal his name if he wishes. Blez is out of town and can't admin his account yet, so I'll post this story up for him.
There was an interesting post at Over the Monster a couple of weeks ago analyzing what would happen if Mark Bellhorn could cut down on his K's. Besides denying the Fenway faithful a cool breeze, Brian theorized that if Bellhorn could cut his strikeout rate from 36% to 25% (on an at-bat basis), he would put extra balls in play, some of which would fall in for hits based on a .300 BABIP. As a first approximation, Bellhorn would raise his 2005 line from 221/335/365 to a more respectable 250/365/400 - not bad for a middle infielder.
A comment by matty fred guessed that Bellhorn's famously discriminating eye causes him to swing only at those pitches that he knows are in his "hitting zone." This suggestion, if true, would mean that patient players ought to, in general, have higher BABIPs if they wait for the pitches they can drive with authority. This seems especially true for Bellhorn, given his unusually high BABIP the last two years (somewhere in the vicinity of .350, with the major league average being between .290 and .300).
Ah, but is it true?
Let's find out: I generated a list of all players from 1974 to the present with more than 500 PAs, thanks to the incredible searchable database at Baseball Musings. I then removed all the pitchers (not hard - just look for the craptastic batting lines). I did not account for park, league, or era effects. I used the 500 PA threshold so that I could get meaningful sample sizes from players without discriminating against the marginal players, who, after all, make up a great deal of baseball's historical statistics.
The hypothesis is that more discriminating hitters would have a higher BABIP. BABIP is easy enough to measure:
(H - HR)/(PA - SO - BB - CI - SF - SH - HR)
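For anyone who wants to run the numbers themselves, here is a minimal sketch of that formula in Python. The function name, argument names, and the sample season line are all made up for illustration; the arithmetic follows the formula exactly as written above.

```python
def babip(h, hr, pa, so, bb, ci=0, sf=0, sh=0):
    """BABIP using the formula from the post:
    (H - HR) / (PA - SO - BB - CI - SF - SH - HR)."""
    balls_in_play = pa - so - bb - ci - sf - sh - hr
    if balls_in_play <= 0:
        raise ValueError("no balls in play for this stat line")
    return (h - hr) / balls_in_play


# Hypothetical season line (invented numbers, not a real player).
print(round(babip(h=150, hr=20, pa=600, so=100, bb=60, sf=5, sh=5), 3))  # prints 0.317
```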
Determining the "patience" of a hitter is another matter. Some patient hitters walk often, but others tend to strike out owing to their tendency to work deep into counts. My rough (very rough) Patience Index is simply the percentage of PAs that result in a base on balls or strikeout. This covers guys like Rob Deer and Adam Dunn while still giving fair shakes to a patient-but-tough-to-strike-out Gene Tenace. One day, when I have more time and bigger brains, it might be interesting to generate a "patience quotient." I don't think that's really necessary in this situation.
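The Patience Index described above is simple enough to sketch in a few lines of Python. The function name and the sample numbers are hypothetical; the only thing taken from the post is the definition, walks plus strikeouts divided by plate appearances.

```python
def patience_index(bb, so, pa):
    """Share of plate appearances that end in a walk or a strikeout."""
    if pa <= 0:
        raise ValueError("plate appearances must be positive")
    return (bb + so) / pa


# Hypothetical hitter (invented numbers): 60 BB and 100 SO in 600 PA.
print(round(patience_index(bb=60, so=100, pa=600), 3))  # prints 0.267
```

Note that a free swinger who strikes out a lot without walking would also score high here, which is part of why the index is only a rough measure.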
The result? Patience, at least as I measured it, is not correlated with a high BABIP. Just because guys wait for their (or any) pitch, it doesn't mean they are turning them into hits any more often. In fact, the players with the highest BABIPs in the DH era include hackers like Ichiro! and Derek Jeter as well as patient hitters like Manny Ramirez and Edgar Martinez. Notice that these players made names for themselves by hitting a lot of line drives - a trend tracked by the good people at Hardball Times.
Although patient hitters may not be getting hits at a higher rate, they sure hit the ball hard. Check this out:
The percentage of hits that fall for extra bases and isolated power are both correlated with patience. This isn't a one-to-one correlation, of course, which suggests that other factors are involved. Still, there's enough of an upward trend here to suggest a statistically significant correlation. We all know that correlation does not equal causation, but the fact that the decision to wait for a pitch is made before the ball is put into play strongly suggests that there may be something to being patient: it is the origin of the extra-base hit.
Mark Bellhorn is the player he is, and strikeouts are part of the package. Even if he were to cut his strikeouts, the data here suggest that it would not necessarily translate into greater production. In fact, his unusual power (for a middle infielder) may be a function of his willingness to wait for a pitch to drive. Philosophically, the roots of his approach are not hard to trace. He came from an organization (Oakland) that encouraged young hitters to be patient, even if it meant striking out. Because he could not distinguish himself with his glove or batting average, he distinguished himself with his power and patience. If you can live with the stone hands and the strikeouts, he would make the perfect bottom-of-the-order hitter for an otherwise strong team in the AL. Hey, look: Theo Epstein already figured that out.
Comments
two questions and a comment
It's been a long time since I took a statistics class and at this point in my life, I'm not so good with them anymore. What is the R^2 number and how is it calculated? I assume that the higher the R^2, the more the correlation between the two numbers?
And the second question...when I try to use the data I'll copy and paste and everything will go into spot A1 on the spread sheet. How do I make it not do that? I actually spent a few days straight entering about 600 different seasons into a spreadsheet by hand.
by brian38 @ Beyond the Box Score on Jul 17, 2005 10:05 AM EDT reply actions
I think I can help with one
by Marc Normandin on Jul 17, 2005 10:10 AM EDT up reply actions
::smacks forehead::
It's not so much that I'm a trooper...it's mostly that I just have nothing better to do!
by brian38 @ Beyond the Box Score on Jul 17, 2005 12:04 PM EDT up reply actions
excel
by brian38 @ Beyond the Box Score on Jul 17, 2005 10:55 PM EDT up reply actions
Another way
by Dan Scotto @ Beyond the Box Score on Jul 17, 2005 11:16 PM EDT up reply actions
The R^2 number
In science, you look for R^2 = 0.99. In these types of studies, where there are too many variables to control, you look for any moderately large R^2. Since R^2 of fig 1 is close to 0, I said there was no correlation between "patience" and BABIP. What I could have done was report a p-value, which tells you if there is a statistical significance to the correlation; I would guess that with so many data points, the correlation is significant.
Here is the best way to calculate R^2: In Excel, right-click on your data points in the plot, go to "add trendline," and use the dialog box. In reality, what Excel is calculating is the covariance of the data sets, which if you are a geek like me, can be read about here. It then takes the covariance and "normalizes" it, in a manner of speaking, so that you have an output R^2 between 0 and 1.
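If you'd rather not go through Excel, here is a rough sketch of the same idea in plain Python. For a straight-line trendline, R^2 is the square of the Pearson correlation coefficient, which is the covariance divided by the product of the two standard deviations. The function name and the toy data are made up for illustration and are not the data from the article.

```python
def r_squared(xs, ys):
    """R^2 of a straight-line fit, computed as the squared Pearson correlation."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations; the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("one of the variables has no spread")
    return cov * cov / (var_x * var_y)


# Toy data (invented): y is exactly 2x, so the fit is perfect.
print(r_squared([1, 2, 3, 4], [2, 4, 6, 8]))
```

An R^2 near 1 means the points hug the line, and an R^2 near 0 means the line explains almost none of the spread.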
Sometimes I really wish I hadn't slept through so much of this class...
r^2
I agree with you on the class thing though. The beauty of community college was that I was the smartest person, by far, in every one of my classes. I'm not trying to say how great I am, I'm just saying that I went to school with a bunch of idiots. I was able to breeze through every class. The problem with that is that if I bothered to show up, I would just go to sleep. I didn't actually learn anything because I didn't actually have to learn anything in order to pass the class.
by brian38 @ Beyond the Box Score on Jul 17, 2005 12:24 PM EDT up reply actions
So, I guessed partially correctly, right?
Excellent article!
I have this really cool old LP w/booklet of Stan Musial teaching you how to hit. One of his big credos (though probably not original to him) is to "wait for your pitch in your zone and hit it." Kind of cool to see some empirical evidence lending some support to this approach as an effective hitting strategy.
Way to get the conversation rolling, salb918!
ps-Speaking of Over the Monster, how 'bout Wakefield getting a complete game loss? There's something you don't see every day.
by matty fred @ Beyond the Box Score on Jul 18, 2005 12:10 AM EDT reply actions
Really...
Hey
by Marc Normandin on Jul 18, 2005 7:07 AM EDT up reply actions
I'll take that as a compliment
From Firebrand
Um...
by Richard Wade on Jul 18, 2005 7:19 PM EDT up reply actions
