Changing the Way We Look at Offense, Part Four: Balls In Play

The last step before attempting to create a metric using WHB is to understand the impact of the Ball-In-Play data that we already have and use. WHB would serve as an improvement over this data, theoretically, if it has a stronger correlation to success. Sneak peek: this is the last installment before the creation of my metric "SMASH."

WHB analysis helps reveal more about why Matt Cain has been so good.
Jason O. Watson

WHB data has been shown to be linked to positive offensive success and negative pitcher outcomes. At the core of WHB data is the fact that none of trajectory, velocity, or on-bat contact can produce a WHB on its own. In fact, it's often the case that two of the three aren't enough. To explain this further, this installment is going to be all about Ball-In-Play (BIP) data. When I refer to BIP data, I am talking strictly about the type of ball put into play as recorded on sites like Baseball Reference and FanGraphs.

The distribution of GB, FB, and LD is often something that sabermetricians look at when analyzing offensive players and pitchers. This study has already been challenging key analytical concepts, so I wanted to keep going. I will test the very fundamentals of BIP data and see how they relate to WHB and overall success for both offensive players and pitchers.

Section One: FB% for Offensive Players

The first question I had about WHB and BIP data was whether hitting fly balls is a skill that leads to more WHB. Since WHB could already have a relationship with power and overall offensive success, I wanted to see how much fly balls played into that. A table of relationships is below:


The results here are very interesting. FB% has a relationship with Isolated Power (ISO) and HR Rate, but not with WHOBP, OPS, or Hit Rate. What I believe this suggests is that fly balls are a positive outcome when they are hit well enough to go for extra base hits.
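For readers who want to reproduce this kind of relationship check, here is a minimal sketch. It assumes the "R" values in this series are Pearson correlation coefficients, which the text does not state outright. The function name `pearson_r` and all of the numbers in the lists are made up for illustration; they are not the data behind the table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative numbers, NOT real player data.
fb_pct = [0.30, 0.35, 0.40, 0.45, 0.50]
iso = [0.110, 0.150, 0.140, 0.200, 0.210]
hit_rate = [0.270, 0.250, 0.280, 0.255, 0.265]

r_iso = pearson_r(fb_pct, iso)
r_hit = pearson_r(fb_pct, hit_rate)
print(f"FB% vs ISO:      r = {r_iso:.2f}")
print(f"FB% vs Hit Rate: r = {r_hit:.2f}")
```

In this toy sample, FB% tracks ISO closely and barely tracks Hit Rate, which is the shape of result described above. With real data, each list would hold one value per qualifying hitter.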

With these relationships, I used raw data and not park-adjusted data because I wanted the ballpark biases to come into play. If fly balls are not going for extra base hits, they are typically easy to field. Overall offensive success does not seem to be dependent on the ability of a hitter to hit a large percentage of fly balls. This came as a surprise to me, because I had anticipated that players with a large FB% would see an increase in overall numbers since their power frequency had increased.

This doesn't suggest that FB% is a negative thing when it is high, either. Fly balls can be lost in the sun, they can lead to extra base hits, they can fall into play if the defender doesn't get a good break, and they can lead to sacrifice flies (or base runner advancement) if the situation allows. All this is saying is that overall success for an offensive player is not necessarily dependent on his ability to produce FB.

Section Two: LD% for Offensive Players

Something I often read when browsing baseball posts is that people love line-drive percentage. Some believe that LD% leads to a high BABIP and that hitters who produce a high LD% are more likely to have better offensive outcomes. Yet LD% didn't fare well when compared to the same five core offensive metrics that I used in the above table. Here is the LD% table:


The results here come as a big surprise to me. If you're wondering where this data comes from, the BIP data is from FanGraphs and the other metrics were created or pulled on September 21st, 2011. As you can probably see, LD% didn't have a strong showing, although it did have a notable relationship with Hit Rate (.34 R). On the whole, LD% does not appear to be a huge part of offensive success.

Part of this is due to the fact that not all LD are created equal. There is a difference between a sharp liner down the line and a soft liner to second base. This isn't to say that LD% isn't important, because it still very well could be. After all, LD% helps build Hit Rate, which means getting on base. What this does suggest, however, is that LD% and BIP data as a whole may not be entirely vital in predicting offensive success. As my WHB theory suggests, it is the type of ground ball, line drive, or fly ball that is hit that matters.

Section Three: GB% for Offensive Players

Ground balls are a really big key in understanding a lot about offensive players. More so than with fly balls and line drives, speed is a factor in how effective ground balls will be. Ground balls lead to quite a few hits, but they also make it nearly impossible for home runs to be recorded and often produce double plays. I'm not kidding when I say GB% is hugely important, and here is a table showing why:


The results here are much like those for fly balls, but I believe they are even more powerful. Let's start with GB% vs. ISO. This is a negative relationship, with an R value of .54 on a negatively sloping line. A high GB% is a good indicator that a hitter isn't going to post a good ISO, which means his power numbers are probably struggling. This is due to the difficulty of getting extra base hits on ground balls. Without elite speed or a ball hit right down the line, the chances of a ground ball going for a single, error, or out are very high.

This can also be seen in HR Rate, where the general trend is that a high GB% means hitting home runs less frequently. What is most intriguing and important, however, is the relationship between GB% and WHOBP. A .38 R was found when relating the two, which suggests that hitting ground balls has a negative impact on a hitter's overall ability to produce.
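One way to put these R values in perspective is to square them. For a simple linear fit, the square of the correlation coefficient is the share of variance in one metric that is accounted for by the other. The sketch below assumes the reported R values are Pearson correlations and attaches negative signs to both based on the description in the text; `variance_explained` is a hypothetical helper name.

```python
def variance_explained(r):
    """Share of variance explained by a simple linear fit with correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r

# Magnitudes come from the article; the negative signs are an assumption
# based on the text describing both relationships as negative.
reported = {
    "GB% vs ISO": -0.54,
    "GB% vs WHOBP": -0.38,
}

for label, r in reported.items():
    print(f"{label}: r = {r:+.2f}, r^2 = {variance_explained(r):.2f}")
```

Under those assumptions, GB% accounts for a bit under a third of the variation in ISO and roughly a seventh of the variation in WHOBP, so the relationships are real but leave most of the variation to other factors.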

Since home runs simply do not occur on ground balls, this isn't surprising, considering no event produces more value on average than a home run. As for the fifth relationship on the table, there simply isn't a relationship between the ability to get hits and the frequency with which a player hits ground balls. It is my belief that this is because speed is a huge indicator of whether or not players can reach base on balls hit in the infield (which grounders often are). The faster the player, the more likely he is to reach base in these situations.

Section Four: BIP Types for Pitchers

With pitchers, BIP analysis becomes a bit different. Pitchers come with labels that often speak to their ability to pitch to a certain BIP type. What this creates is an interesting dynamic where BIP types don't tell much about general tendencies for starting pitchers. Here are the tables with all of the relationships:

[Tables: FB%, LD%, and GB% relationships for pitchers]

None of the tests here produced anything. There simply doesn't seem to be a relationship between BIP types and the ability to prevent runs. Pitching styles simply differ too much from pitcher to pitcher. There are bad fly ball pitchers and good fly ball pitchers, just like there are good ground ball pitchers and bad ground ball pitchers. Therefore, no relationships are found.

What would be worthwhile here is to discuss specific pitchers. In this case, let's take a look at a fairly confusing and controversial pitcher (statistically): Matt Cain. Cain has the label of being a fly ball pitcher, which is entirely deserved (since 2009, among pitchers with at least 800 IP, only Jered Weaver has given up more fly balls). However, Cain doesn't just get discussed because he's a fly ball pitcher. He gets discussed because he's a fly ball pitcher pitching in one of the best parks for his pitcher type. Here are some notables about AT&T Park:

Dimensions (in feet):

Left field line: 339; left field: 364; left-center field: 404; center field: 399; right-center field: 421; right field: 365 (24-foot-high wall); right field line: 309 (also a 24-foot-high wall)

Park Notes:

Dead last in ESPN HR park factor year-to-year (tab through the years).

Very pitcher-friendly by Baseball-Reference park factors.

All of this is true, yet Matt Cain's splits are not drastic: there is a difference of about half a run in his ERA. There's a .061 difference in OPS between his home and away performances (mainly driven by a .097 HR/IP rate on the road versus .076 at home), and a big part of this is because of his home ballpark. When he goes on the road, Cain makes multiple trips a year to Coors Field and Chase Field (where he has given up 17 homers combined in 27 starts, or .101 HR/IP).
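The rate arithmetic here can be checked with a short sketch. It uses only the figures quoted above and treats innings as plain decimals rather than baseball's .1/.2 thirds notation, which is an assumption. The helper names `hr_per_ip` and `implied_innings` are hypothetical.

```python
def hr_per_ip(home_runs, innings_pitched):
    """Home runs allowed per inning pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings pitched must be positive")
    return home_runs / innings_pitched

def implied_innings(home_runs, rate):
    """Back out innings pitched from a home run total and an HR/IP rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return home_runs / rate

# Figures quoted in the article
road_rate = 0.097
home_rate = 0.076

# How much higher the road HR/IP rate is, relative to the home rate
relative_increase = (road_rate - home_rate) / home_rate

# 17 homers at .101 HR/IP implies this many innings at Coors and Chase
coors_chase_ip = implied_innings(17, 0.101)
innings_per_start = coors_chase_ip / 27

print(f"Road HR/IP is {relative_increase:.1%} higher than home")
print(f"Implied innings at Coors/Chase: {coors_chase_ip:.1f}")
print(f"Implied innings per start: {innings_per_start:.2f}")
print(f"Round-trip check: {hr_per_ip(17, coors_chase_ip):.3f} HR/IP")
```

The point of the comparison is that the Coors and Chase rate (.101) sits only slightly above his overall road rate (.097), so those two parks alone do not explain the whole home/road gap.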

The reason the differences are not that drastic (and that Cain has a 3.63 career ERA on the road) is that he avoids WHB on his fly balls. He sees a lot of WHB go for outs because of his home ballpark, but he also sees that happen on the road. His pitching style plays well to his home park, and it's entirely possible that he's pitching that way intentionally (though further work would have to be done to reach that conclusion).

Essentially, with Cain, park factors and stats like xFIP really don't tell enough about his pitching results. When you look at WHB splits, Cain gives up quite a few WHB that are singles and doubles. After all, while it may be hard to give up home runs in AT&T Park, the outfield is very large, so the chances that balls go for doubles, triples, and singles are increased by the fact that there is much more ground for outfielders to cover.

While these are obviously negative outcomes, they are counteracted by the increased frequency of his WHB that go for outs. Essentially, any WHB that is a fly ball is more likely to get sucked up by the vacuum that is AT&T's outfield, but any LD that is a WHB is more likely to find a gap or hole in the outfield, because there's more ground that needs to be covered.

This is one of the best parts of WHB analysis. Most of Cain's WHB end up as singles and doubles rather than triples and home runs, which is actually a product of his pitching style. This is what allows him to have a strong road ERA, because a majority of his FB are not WHB, which means easy outs.

When he goes on the road, his WHB that go for homers increase, but the number that go for singles and doubles decreases, which alleviates the additional damage done by the home runs. Rather than punishing him for his ballpark and his tendency to give up fly balls, it's actually almost the opposite based on what I've found. The size of the outfield in AT&T actually forces outfielders to play further back (an assumption on my part).

He pitches well in all parks because he gets a lot of weakly hit fly balls, which makes his overall road results better than those of most pitchers (for example, Clayton Kershaw has a career ERA almost a full run better in Dodger Stadium than on the road). If you were to look simply at BIP splits, they might suggest that Cain is doing a bad thing by giving up so many FB, but I'll say it again: the quality of the FB/GB/LD matters much, much more than the frequency with which they are given up.


On the whole, BIP data doesn't reveal much about overall success. In certain cases, like ground balls for offensive players, seeing BIP types can help in determining how a player will perform. In most cases, like with pitchers, the data doesn't reveal much due to the differences in pitching styles. What this provides is a further suggestion that the WHB theory is strong for analysis. As seen in the Matt Cain example, the quality of the FB/GB/LD matters more than the FB/GB/LD itself. Feeling more confident in this, I decided to take WHB to the next level, which meant severely adjusting WHOBP. In my next post, I will get down into the nitty-gritty of what WHB actually do for offense, which means developing a new metric: SMASH.


Credit to Inside Edge, FanGraphs, and Baseball Reference for the data.