Statistical Head Scratchers: The Hit Tool

The other day I stumbled upon a great piece by Baseball Prospectus' Ben Lindbergh entitled "Tooling Around," where Ben takes a look at today's five-tool players using a combination of old-school and proprietary statistics.

There, for the first time, I noticed that PECOTA isolated power from speed by counting doubles and triples equally in its isolated slugging formula. Suffice it to say, I was in love with Ben's piece. Additionally, his use of FRAA and BRR should be commended as intriguing solutions to complex questions. But then I got to his use of batting average for the "hit tool." Intuitively, using batting average to judge a player's hit tool makes sense, and it is what the average baseball fan would reach for. And at first, I too didn't think anything of it.

But hitting for a high batting average mostly depends on four variables: speed, power, consistent quality contact, and of course the opposing defense. Two of those variables that directly affect batting average are other tools (speed and power), while another major variable, defense, is entirely out of the hitter's control.

Additionally, let us not forget that Pizza Cutter showed us that batting average doesn't correlate well from year to year, nor does it stabilize within a season's worth of plate appearances. Stabilization likely takes around a thousand plate appearances. If a result isn't repeatable, then it likely isn't indicative of one's tools.
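For anyone who wants to run the repeatability check themselves, here is a minimal sketch of a year-to-year correlation. It assumes you have paired values of the same stat for the same hitters in back-to-back seasons. The function name `pearson_r` and the sample numbers are my own placeholders, not Pizza Cutter's method or data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / sqrt(var_x * var_y)


# Made-up placeholder values, NOT real player seasons: each position is one
# hitter's stat in year one and the same hitter's stat in year two.
year_one = [0.310, 0.285, 0.270, 0.255, 0.240]
year_two = [0.280, 0.295, 0.250, 0.275, 0.245]

print(round(pearson_r(year_one, year_two), 3))
```

A value near 1 would suggest the stat repeats from one season to the next, while a value near 0 would suggest it mostly doesn't.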

So, over the past week I set out to isolate the hit tool statistically, just as PECOTA did with ISO and power.

Well, JD, what about consistent quality contact? Did you just make that term up? Yes. Yes I did. But, I'll get back to that in a moment.

So, to backtrack for a moment, I asked experts how they defined the hit tool:

Rene Saggiadi (European Talent Evaluator): It's simply the ability to square balls up.
Jason Parks (Baseball Prospectus, out-of-context quote): a "smooth swing and excellent barrel awareness that should allow [one] to hit over .300"
Jim Callis (Baseball America, Interview): Someone's pure hitting ability
Jeff Reese (Bullpen Banter): The hit tool is evaluating the aspects that are conducive to high batting averages.
Additionally, Kevin Goldstein, Ben Badler, and Jim Callis noted that MLB regulars with 80 hit tools included Albert Pujols, Ichiro, and Joe Mauer.

Does this tell us anything?

Clearly, defining the hit tool isn't as simple as equating it to hitting for a high average. In fact, there is a lot of ambiguity in these answers. However, each has an element worth highlighting. To me, a few terms stand out: the pure ... ability to square balls up ... with excellent barrel awareness. Or,

Consistent. Quality. Contact.

Purity is important. Though the term can be interpreted to mean several things, I read it as an onus to remove the other variables. Rene Saggiadi suggested that, to remove speed, we normalize one's infield hits. It is a start, but we're still faced with defense, which is a big issue, and the lesser issue of power.

The key to removing defense is looking at statistics that are settled before the outcome of the ball in play. For this, I suggest looking at contact rate. Normally, contact rate is calculated with the formula (AB - K) / AB, but I add in sacrifice flies because I loathe them. To be clear, this is different from FanGraphs' contact percentage, which is based on PITCHf/x data.
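Here is a rough sketch of the contact rate arithmetic. The first function is the standard formula quoted above. The second is only my guess at one reasonable way to "add in" sacrifice flies, treating each one as an extra ball put in play in both the numerator and the denominator; the function names and the sample stat line are hypothetical.

```python
def contact_rate(ab: int, k: int) -> float:
    """Standard contact rate: share of at-bats that did not end in a strikeout."""
    if ab <= 0:
        raise ValueError("ab must be positive")
    return (ab - k) / ab


def contact_rate_with_sf(ab: int, k: int, sf: int) -> float:
    """Contact rate with sacrifice flies counted as contact.

    Assumption: a sacrifice fly is a ball put in play that is not charged as
    an at-bat, so it is added to both the contact count and the opportunities.
    """
    opportunities = ab + sf
    if opportunities <= 0:
        raise ValueError("ab + sf must be positive")
    return (ab - k + sf) / opportunities


# Hypothetical stat line, not a real player: 600 AB, 90 K, 6 SF.
print(round(contact_rate(600, 90), 3))             # 0.85
print(round(contact_rate_with_sf(600, 90, 6), 3))  # 0.851
```

Under this assumption, sacrifice flies barely move the number for a typical hitter, so the adjustment is more about principle than about changing the rankings.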

Three more great things about contact rate are that it has no correlation with OPS (meaning it's independent of the power tool), it stabilizes quickly, and it correlates highly from year to year, as a tool (or a refined tool, also known as a skill) should. If I've learned anything from law school, it is that the answer to a question will always depend on how you've framed it. Here, I've framed the hit tool to mean "consistent quality contact." And unfortunately, because contact rate is missing the quality component, it appears to be fatally flawed under this framework.

At least, if we looked to it alone.

Hopefully I don't get an angry e-mail from Colin Wyers, but why not use line drive percentage to fulfill our quality element while contact rate still fulfills our contact requirement? Theoretically, that may be a good idea, but because of Colin's crusade we know that batted ball data is questionable at best. Additionally, the flawed data we do have tells us that line drive percentage doesn't correlate highly from year to year. And if you agree with my prior premise that a high year-to-year correlation is a prime indicator of a skill, you cannot also agree that this data is helpful in determining one's skills.

While it may not be impossible, it is clearly difficult not only to define the hit tool, but also to isolate it statistically. At least in theory, I think a combination of line drive percentage and contact rate best exemplifies the underpinnings of the concept our experts articulated above.

JD Sussman is a full-time law student and co-founder of Bullpen Banter. He can be reached via Twitter.