For the well-hit ball theory to have any meaning, it needs to be more closely tied to results. At this point, the metric is far too subjective and vague to be truly useful. To change that, it is important to find a way to give WHB value in the context of all in-game events. In other words, if a double, home run, or walk is worth a certain number of runs, then what is a WHB worth? Giving WHB an expected run value would make it easier to tie into the context of a game.
This is where things get tricky. To start, the entire basis of this theory is the underlying assumption that a well-hit ball should go for a hit. If it doesn't, then the hitter has likely had the misfortune of hitting the ball at a well-positioned defense. Therefore, WHB could be treated as hits, because the expectation is that WHB go for hits. If WHB are hits, then it is possible to determine the value of a general hit by taking into account the value of each individual hit type.
The aim of linear weights is to properly scale results (please, if you haven't done so, read these links). Values are attributed to singles, doubles, triples, etc. These are the run values that are expected to be produced given an average run environment. Here is a snippet explaining how linear weights can be used to find the value of a WHB:
Tom Tango's blog helped provide the basis for this part of the theory. In the population of 115 players, the individual types of hits that occurred on WHB were totaled. Those totals were then multiplied by the expected number of runs on each individual play (using the chart below), which produced the total number of runs coming from each event. These were then added to find the overall total of value produced on WHB. Lastly, this total was divided by the number of WHB to find the expected value of a WHB, which came out to 1.07 runs. Here is a table comparing this value to other linear weights at the time this research was performed (for the 2011 season):
A WHB turns out to be just slightly more valuable than a double. With the value of a WHB in hand, it could then be applied to WHOBP. This produced a metric structurally similar to wOBA. With linear weights applied, this became the new formula for calculating a player's WHOBP, known as Weighted Offensive Output (wOO+):
((1.07 * WHB) + (.589 * BB) + (-.013 * K)) / (AB + BB)
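As a rough illustration, the averaging step behind the 1.07 figure and the formula above can be sketched in Python. The function names, the hit counts, and the per-hit run values in the usage lines are hypothetical placeholders (the original linear-weights chart is not reproduced here); only the 1.07, .589, and -.013 coefficients come from the text. The sketch also assumes the formula yields an unscaled value that is later compared to the league average.

```python
# Coefficients taken from the text (2011 season).
WHB_RUNS = 1.07
BB_RUNS = 0.589
K_RUNS = -0.013


def whb_run_value(hit_counts, run_values, whb_total):
    """Average run value of a WHB.

    Multiplies the count of each hit type by its run value, sums the
    results, and divides by the number of WHB, as described in the text.
    """
    if whb_total <= 0:
        raise ValueError("whb_total must be positive")
    total_runs = sum(count * run_values[hit] for hit, count in hit_counts.items())
    return total_runs / whb_total


def woo(whb, bb, k, ab):
    """Unscaled wOO: weighted WHB, walks, and strikeouts per (AB + BB)."""
    opportunities = ab + bb
    if opportunities <= 0:
        raise ValueError("AB + BB must be positive")
    return (WHB_RUNS * whb + BB_RUNS * bb + K_RUNS * k) / opportunities


# Hypothetical inputs, for illustration only (NOT the 2011 chart values).
example_counts = {"1B": 60, "2B": 25, "3B": 3, "HR": 12}
example_weights = {"1B": 0.75, "2B": 1.0, "3B": 1.25, "HR": 1.5}
print(round(whb_run_value(example_counts, example_weights, 100), 4))
print(round(woo(whb=100, bb=50, k=100, ab=450), 4))
```

Keeping the coefficients as named constants makes it easy to swap in a different season's run values without touching the formula itself.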
Before going any further with wOO+, it needed to be tested for consistency. A regression was run, and the results are in the following table:
This consistency is nice, but the difference between wOO+ and OBP+ was much larger for certain players than for others. The common denominator was speed. Fast players have a unique ability to get on base in situations where slower players cannot. To account for speed, players were classified much like hurricanes and tornadoes: the faster the (wind) speed, the more severe the event. Players were classified based on a combination of timed events, SB numbers, and GB/FB ratio. The slowest players (like Adam Dunn) were placed as "Category 1" players. The fastest players (like Ben Revere) were classified as "Category 5" players. Players were not categorized on raw speed alone, because raw speed does not always impact a player's game on the field. For example, Ichiro Suzuki and Andrew McCutchen use their speed to different degrees. Ichiro slaps the ball on the ground, whereas McCutchen features more power in his game. This analysis led to this table (wOO+ is scaled to average):
What this revealed was that the fastest players saw a major difference between their wOO+ and their actual results, partly because wOO+ was not accounting for speed. The next step became adding the average difference between wOO+ and OBP+ to players based on the categories they were placed in. This metric is referred to as the "Speed Factor" (SF). When added to wOO+, it helps create the Standardized Metric of Aggregate Speed and Hitting (SMASH). The calculation for SMASH is as follows:
((wOO/lgwOO)*100) + SF
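The two steps described above (averaging the gap by speed category, then adding it to the scaled wOO) can be sketched as follows. This is only a minimal illustration: the function names and the sample players are hypothetical, and the sign convention (OBP+ minus wOO+, so that fast players receive a positive adjustment) is an assumption, since the text only says "average difference."

```python
from collections import defaultdict


def speed_factors(players):
    """Mean (OBP+ minus wOO+) for each speed category.

    players: iterable of (category, woo_plus, obp_plus) tuples.
    The sign convention is an assumption, not stated in the text.
    """
    gaps = defaultdict(list)
    for category, woo_plus, obp_plus in players:
        gaps[category].append(obp_plus - woo_plus)
    return {cat: sum(vals) / len(vals) for cat, vals in gaps.items()}


def smash(woo, lg_woo, sf):
    """SMASH = ((wOO / lgwOO) * 100) + SF, as given in the text."""
    if lg_woo <= 0:
        raise ValueError("lg_woo must be positive")
    return (woo / lg_woo) * 100 + sf


# Hypothetical players, for illustration only.
sample = [(5, 95, 105), (5, 100, 106), (1, 110, 104)]
sf_by_category = speed_factors(sample)
print(sf_by_category)
print(round(smash(woo=0.27, lg_woo=0.25, sf=sf_by_category[5]), 1))
```

Because SF is a flat per-category adjustment, two players in the same category with identical wOO will always receive the same SMASH under this sketch.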
Now, speed is not the only reason for the difference between wOO+ and OBP+. However, it was the last important offensive skill that was not being accounted for. Even at this point, SMASH was not enough to explain the results that were being found. Luck was still not accounted for in any way.
Going back to the principle that WHB should go for hits, it's possible to develop a measure of luck. If it is assumed that every WHB that goes for an out is out of a player's control (a product of defensive placement and inconvenient timing), then a player is rather unlucky if more of his WHB go for outs than the league average rate. This produced the "Luck Factor" (LF). The calculation for LF is as follows (WHO = Well-Hit Outs):
Finding the relative number of times a player has his WHB go for outs compared to the league average provides a way to scale the results. Theoretically, this should be a good way to measure luck. However, it is still possible that certain types of offensive players may be more or less prone to their WHB going for outs. Teams use defensive shifting to various degrees, and some players playing certain schedules may see more shifts that impact their results than others. To test, 2010 LF was compared to 2011 LF.
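A hedged sketch of both ideas follows. The exact LF formula is not reproduced in this text, so `luck_factor` below is only one reading that fits the description (a player's well-hit out rate relative to the league rate, scaled so that 100 is average and values above 100 mean "unlucky"). The year-to-year check uses a plain Pearson correlation, which is an assumed choice of test; the function names and sample numbers are hypothetical.

```python
from math import sqrt


def luck_factor(who, whb, lg_who, lg_whb):
    """Assumed form: (player WHO/WHB) / (league WHO/WHB) * 100."""
    if whb <= 0 or lg_whb <= 0 or lg_who <= 0:
        raise ValueError("WHB counts and league WHO must be positive")
    return (who / whb) / (lg_who / lg_whb) * 100


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical numbers, for illustration only.
print(round(luck_factor(who=30, whb=100, lg_who=250, lg_whb=1000), 1))
lf_2010 = [110, 95, 102, 88, 120]
lf_2011 = [97, 104, 99, 111, 93]
print(round(pearson_r(lf_2010, lf_2011), 3))
```

Under this reading, a correlation near zero between the two seasons would suggest that LF does not persist for individual players, while a strongly positive value would point to a repeatable tendency rather than luck.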
No evidence was found to support that certain players are more prone to a high LF than others. LF can now be used along with SMASH to help explain why certain players are not producing certain expected results.
To shed more light on the WHB theory, here are a few specific examples.
Example One: Adam Dunn
Adam Dunn (2011): .159 BA/ .292 OBP/ .277 SLG/ .240 BABIP/ .268 wOBA/ 9.6% HR/FB
Adam Dunn (career): .238 BA/ .368 OBP/ .497 SLG/ .286 BABIP/ .371 wOBA/ 22.0% HR/FB
Two things that stand out here are Dunn's drop in BABIP and his significant drop in HR/FB ratio. Several clues can help explain why this happened for Dunn, including his strikeout rate jumping to nearly 36%. WHB theory helps to further explain what happened with Dunn:
Adam Dunn (WHB data 2011): 87 SMASH/ 137 LF (an LF above 100 signifies being "unlucky")
Dunn's numbers suggest that he was bad, but not as awful as his 50 wRC+ or 54 OPS+ would indicate. His high LF also points to a lot of misfortune (mostly from having a large number of WHB fly balls go for outs). When his WHB fly balls started turning into outs instead of extra-base hits, both his BABIP and HR/FB rate dropped. Using Dunn's SMASH and his career trends in BB% and K%, his 2012 numbers were predicted. The predicted OBP was .405. His actual OBP at the end of the season was .333, with his strikeout rate failing to drop back to the norm. The projection provided by SMASH was incredibly favorable to Dunn, but it's possible that he was again unlucky (though with a 29.3% HR/FB ratio, it's more likely that this misfortune came from hitting WHB into a defensive shift).
Example Two: Andrew McCutchen
Andrew McCutchen is a good example of why speed is important. Despite being one of the fastest players in the league, McCutchen was only classified as a Category 4 player for speed. Here are his SMASH and LF numbers from 2010 and 2011:
2010: 120 SMASH/110 LF
2011: 125 SMASH/119 LF
Despite getting better as a player, McCutchen was significantly unlucky in both years. With this in mind, a projection for McCutchen's performance was made (for credibility's sake, here is a snippet from the actual research paper). Categories are name, age, BA, ISO, OBP, SLG, and OPS:
McCutchen's actual 2012 performance: .327 BA/ .400 OBP/ .553 SLG/ .943 OPS
McCutchen's .375 BABIP and 19.4% HR/FB ratio were higher than WHB could ever anticipate or suggest, but it would certainly appear that McCutchen was able to avoid defenses and see his WHB go for more hits. His repeated ability to hit WHB and use speed within his game helped him reach the breakout campaign that his skills were suggesting he would have.
Example Three: Mike Napoli
Napoli slides into the third example because he turned out to be an example of what WHB could reveal in terms of players being more lucky than others. Here are Napoli's 2011 numbers and his SMASH numbers:
Napoli (2011): .320 BA/ .414 OBP/ .631 SLG/ .344 BABIP/ .445 wOBA/ 25.5% HR/FB
Napoli (2011 WHB): 102 SMASH/ 94 LF
Napoli is an interesting specimen. He was luckier than average, but it was nothing egregious (6 points is barely anything, and maybe totals a couple of WHB that don't go for outs in a season). It simply turned out more of his WHB fly balls went for home runs than before. However, his SMASH total should theoretically not produce a 173 OPS+/177 wRC+. Here is a snippet of his prediction line:
Napoli (2012): .227 BA/ .343 OBP/ .469 SLG/ .812 OPS
In Napoli's defense, his BABIP in 2012 was only .273. Based on this information, it is assumed that his good luck ran out and that his skills remained unchanged. With access to the proprietary data, one would be able to test whether or not this was true. However, using Napoli's recorded true skills instead of his results, it was possible to make a fairly accurate prediction for OBP.
WHB theory certainly has its limitations. It doesn't factor in game situation, and it assumes that every at-bat is approached with the goal of hitting the ball hard, so sacrifices, bunts, and similar events create differences between the results and the predictions (which can be adjusted for). SMASH does show, however, how a player's contact ability, power, patience, discipline, and speed can influence his results on the field.
Using LF, it is also possible to gauge whether defensive shifts and luck are influencing results. The next step for WHB is to use raw batted ball speed and angle to classify them, which would give them more objectivity. Certain balls in play should be expected to go for hits, and they are a tangible result that reflects individual skill very well. In the last installment, I will go over more examples, discuss different potential applications for WHB, and provide video of what a WHB looks like.
Credit to Inside Edge, Baseball Reference, Fangraphs, Bill James, Tom Tango, Marcel, Rob Neyer, and countless other individuals for the data and support that went into this research and this writing.