Toward a Pitch-Based Metric, Part I: Balls and Strikes

The sabermetrics of pitching is built on layers. Originally, pitcher wins were thought to capture the essence of a hurler's performance. As time passed, earned run average came into favor as a more effective descriptor of the quality of a pitcher's efforts. With the advent of the defense-independent pitching era ushered in by Voros McCracken over a decade ago, we have seen all sorts of new ERA-scale metrics bandied about, as well as win-based systems such as the various WARs. Leverage-based metrics also have their proponents. And that's not to mention the popularity of citing the components of all of those metrics: strikeout rate, walk rate, home run rate, homer-to-flyball ratio, groundball rate, BABIP, strand rate, and more are used frequently in pitching analysis.

If you are at all familiar with my past writing, I would hope it goes without saying that I champion all of the aforementioned advancements, and do not hesitate to turn to them to assist me in gleaning valuable information about a pitcher. However, there is something unsatisfying about nearly all of those admittedly useful numbers--they lack a connection to pitching itself.

Imagine for a moment that you have been hired as the pitching coach of the Cleveland Indians (Yeah, suspend your disbelief). They want you to fix Ubaldo Jimenez, who is having an awful season. But what sort of advice can you give Ubaldo using the metrics discussed in the first paragraph? Here are some samples:

"You know, your FIP is 5.28 and your xFIP is 5.24, so you've actually been a bit lucky."

"You really need to cut down on the walks--you're walking 5.36 batters per nine."

"Your groundball rate is too low, at 39.1%. That's leading to too many homers, 1.31 per nine. Get more grounders."

Obviously, none of these words of wisdom would provide Ubaldo Jimenez with much assistance in how to turn his season around. What they are missing, of course, is the how. It's one thing to say "Player X's walk rate is too high." It's quite another to say why it's too high. The majority of the oft-cited metrics miss that, because they usually involve the results of at-bats rather than the process underlying them.

I'm not necessarily going to dig all the way to the bottom today, but I will take things a level deeper.

What is a level deeper? Statistics that are influenced by each pitch, not just at-bats. This includes stats like velocity and pitch distribution, as well as the numbers that appear under FanGraphs' "Plate Discipline" tab, such as swinging strike percentage, first-pitch strike percentage, and chase rate.

In addition to having more of a connection to the actual task of pitching, these numbers also have the advantage of coming in larger samples. A starting pitcher might throw only 150 innings in a season, or face 700 batters, but he will throw around 2500 pitches, so pitch-derived numbers are less prone to being "fluky" in a season-long sample.
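A rough way to see the sample-size advantage is the binomial standard error of a rate, sqrt(p(1-p)/n). The sketch below assumes every event is an independent trial with a fixed probability, which real pitches and plate appearances are not, and the 10% rate is an illustrative number rather than anything from the data.

```python
import math

def rate_standard_error(p: float, n: int) -> float:
    """Binomial standard error of an observed rate p over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)

# Illustrative (assumed) 10% rate, measured per batter faced vs. per pitch.
se_per_batter = rate_standard_error(0.10, 700)   # ~700 batters faced in a season
se_per_pitch = rate_standard_error(0.10, 2500)   # ~2500 pitches in a season

print(f"per-batter SE: {se_per_batter:.4f}")
print(f"per-pitch SE:  {se_per_pitch:.4f}")
print(f"noise ratio:   {se_per_batter / se_per_pitch:.2f}")
```

Under those assumptions the noise shrinks by a factor of sqrt(2500/700), or roughly 1.9, when the same rate is measured per pitch instead of per batter.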

What do I intend to do with these statistics in this piece, you ask? I am going to examine some correlations between them and SIERA, which many regard as the best ERA-style metric out there. Which pitch-based factors correlate with this overall performance metric, and can we build a way of estimating overall performance from pitch-based factors alone? The full discussion is a bit too extensive for one piece, so I'm splitting it up. Today, I'm going to look at factors relating to balls and strikes.

To begin to examine this issue, I went over to FanGraphs and made myself a spreadsheet of all starting pitching seasons of 100+ innings from 2002-2011, excluding knuckleballers. That's 1300 seasons, a sample large enough that any sizable correlation that shows up is very unlikely to be a product of chance.
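Each comparison that follows is a one-variable linear fit of SIERA against a single rate, summarized by a slope and an r2. The original work was done in a spreadsheet; the sketch below is just one way to get the same two numbers in plain Python. The function name `slope_and_r2` is hypothetical, and the five data points are made up for illustration, not drawn from the 1300-season sample.

```python
def slope_and_r2(x, y):
    """Least-squares slope, intercept, and r2 for a one-variable linear fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2

# Made-up illustrative values, NOT taken from the FanGraphs sample.
strike_rate = [0.60, 0.62, 0.63, 0.65, 0.67]
siera = [4.9, 4.4, 4.6, 4.0, 3.8]

slope, intercept, r2 = slope_and_r2(strike_rate, siera)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r2={r2:.4f}")
```

The r2 is the share of the variation in SIERA that the fitted line accounts for, and the slope is the change in SIERA per unit change in the rate.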

Let's start with a really basic one: strike rate, the percentage of pitches thrown that are strikes. How does this correlate with SIERA?


The r2 here is .2954, which essentially means that strike rate explains about 29.5% of the variation in SIERA across these seasons. If you're familiar with the construction of most defense-independent metrics, that shouldn't surprise you. FIP is based on strikeouts, walks, and homers. xFIP and SIERA are based on strikeouts, walks, and ground balls. So walks are about a third of the inputs to DIPS metrics, and the ability to throw strikes accounts for much of walk prevention, as shown here:


So that makes sense. But a .2954 r2 is much too low to just stop there. There is, of course, much more to pitching than simply throwing strikes, or else Kevin Slowey would have won several Cy Youngs by now. The highest strike rate in the past ten years was put up by Carlos Silva in 2005, at 71.4%, but he had a middling 4.25 SIERA that season.

What was Silva missing that year? The ability to miss bats--while he poured the ball in the zone, he induced swings and misses on only 4.5% of his pitches, or about one in every sixteen strikes. That rate was tied for ninth-worst in the past decade.

Just like strike rate correlates very well with walk rate, swinging strike rate is a great predictor of strikeout ability:


It's not totally airtight--Vance Worley struck out 21.22% of batters last year while getting just a 5.4% SwStr%--but it's easy to see how difficult it is to post an excellent K% without the ability to miss bats on at least a semi-regular basis. None of the 64 seasons with a K% of 25% or higher, for example, had a below-average SwStr%.

Swinging strike rate has an advantage over strike rate, as well--there's no such thing as a bad swinging strike, whereas there is certainly such a thing as a bad strike. A home run is a strike, after all. Therefore, it should again make plenty of sense that SwStr% has a higher correlation with SIERA than strike rate:


Now we're up to explaining 43.83% of the variation in SIERA, as the r2 attests. Note also the slope here of -21. That means that an increase of one percentage point in swinging strike rate should lead to a decrease of .21 in SIERA. With strike rate, not only was the r2 significantly lower, the slope was also shallower, at -13.63.
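To make the two slopes concrete, here is a small sketch that converts each one into an expected SIERA change for a given change in percentage points. It assumes the rates behind the slopes are expressed as fractions (0.11 for 11%), which is what the -21 slope and the .21-per-point reading imply; the helper name `siera_change` is hypothetical.

```python
# Slopes quoted in the text. Rates are assumed to be fractions (0.11 for 11%),
# so each slope is the SIERA change per 1.00 of rate.
SWSTR_SLOPE = -21.0
STRIKE_SLOPE = -13.63

def siera_change(slope, rate_change_in_points):
    """SIERA change implied by a slope for a change given in percentage points."""
    return slope * (rate_change_in_points / 100.0)

for name, slope in (("SwStr%", SWSTR_SLOPE), ("Strike%", STRIKE_SLOPE)):
    one_point = siera_change(slope, 1)
    five_points = siera_change(slope, 5)
    print(f"{name}: 1 point -> {one_point:+.2f}, 5 points -> {five_points:+.2f}")
```

Per percentage point, that works out to -0.21 for swinging strikes against roughly -0.14 for strike rate. These are readings of the fitted lines only, not predictions for any individual pitcher.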

So, this strong relationship certainly justifies why SwStr% is thrown around so much--a pitcher with, say, an 11% SwStr% can be expected to post a SIERA over a full run lower than one with a 6% SwStr%, and the correlation between the two is quite high. But again, bat-missing ability isn't all there is to pitching.

There's another type of good strike, though--called strikes. Like swinging strikes, there is no such thing as a bad called strike. Yet one almost never sees called strike rate cited in anything. I was curious to see if called strike rate had any correlation with SIERA.


A .0412 r2 borders on randomness, though the slope of -8 at least points in the expected direction, so these two variables have a very slight correlation. Obviously, the effect of called strikes on SIERA is far lower than that of swinging strikes.

Honestly, that surprised me. I certainly wasn't expecting CallStr% to have an r2 as high as SwStr%, but I was expecting more of a trend than this, with an r2 somewhere around .15 to .2.

The next thing I tried was adding CallStr% and SwStr% together, to get what I call "Good Strike Percentage." I wasn't sure what this would do to the correlation with SIERA--would the presence of called strikes, which have little effect on SIERA, dilute the swinging strikes' effect? Would the r2 be near the sum of the r2s of called strikes and swinging strikes? I got this:


Oh. Well, now we have something.

While called strikes apparently have little effect on SIERA on their own, it seems that factoring them in along with swinging strikes significantly increases the r2. Given that I expected CallStr% to have an r2 around .15 in the first place, seeing it add about .15 to the SwStr% r2 makes a lot of sense.

Here's the thing, though--we know that a swinging strike is more valuable than a called strike, as the correlations have attested. Simply adding them together, then, wouldn't be optimal, the same way that OPS overvalues slugging compared to OBP.
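The piece doesn't say how a better weighting was searched for. One plausible approach is a simple grid search: try a range of weights on SwStr%, compute the r2 of (weight x SwStr% + CallStr%) against SIERA for each, and keep the best. In the sketch below the function names are hypothetical and the data are synthetic, deliberately constructed so that a weight of 1.5 is exactly right, just to show that the search recovers it.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

def best_swstr_weight(swstr, callstr, siera, weights):
    """Return the weight w that maximizes r2 of (w*SwStr% + CallStr%) vs. SIERA."""
    def score(w):
        composite = [w * s + c for s, c in zip(swstr, callstr)]
        return r_squared(composite, siera)
    return max(weights, key=score)

# Synthetic data (NOT real seasons): SIERA is built from 1.5*SwStr% + CallStr%.
swstr = [5.5, 8.0, 10.5, 7.0, 9.0, 12.0]
callstr = [14.0, 17.5, 16.0, 19.0, 15.0, 18.0]
siera = [6.0 - 0.1 * (1.5 * s + c) for s, c in zip(swstr, callstr)]

# Candidate weights from 0.5 to 3.0 in steps of 0.1.
weights = [round(0.5 + 0.1 * i, 1) for i in range(26)]
best_weight = best_swstr_weight(swstr, callstr, siera, weights)
print(best_weight)
```

On real data the r2 curve would be much flatter around its peak than in this constructed case, so nearby weights would perform almost as well as the best one.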

It turns out that one can generate the best correlation with SIERA by treating a swinging strike as about 1.5 times as valuable as a called strike. Here's the graph of (1.5SwStr%+CallStr%) and SIERA:


Finally, the other readily-available and readily-cited metric relating to balls and strikes is first-pitch strike percentage. Here's the graph of FS% and SIERA:


Again, this is expected--first-pitch strikes are certainly a good thing, but they're far from the most important component of pitching.

Can we incorporate first-pitch strikes along with (1.5SwStr%+CallStr%) and push the r2 even higher than .6152? As it turns out, we can:


This is a graph of (1.5SwStr%+CallStr%+(FS%/6)) and SIERA. We're now up to explaining nearly two-thirds of defense-independent pitching, which is a fairly impressive feat, I think, given that we're merely operating with ball and strike data.

That wraps up the main body of this piece. Next time, I'm going to take a look at some other pitch-based metrics and see how they work alongside SIERA. But before closing, I wanted to take a look at the dot on the far left of that graph, at (31.53, 4.11). That's the lowest x-axis number in the entire sample of 1300 seasons, yet the SIERA that accompanies it is around average. No other season with an x-axis number below 35% had a lower SIERA. How did this happen?

That number belongs to Chien-Ming Wang's 2005 season, his rookie campaign. He had a meager 5.5% SwStr%, just a 14.18% CallStr%, and even a low 54.7% FS%.
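As a quick check, the three rates above can be plugged straight into the composite from the final graph. The function name `ball_strike_score` is hypothetical; the formula and the inputs are the ones given in the text, with all rates in percent.

```python
def ball_strike_score(swstr_pct, callstr_pct, fstrike_pct):
    """Composite from the text: 1.5*SwStr% + CallStr% + FS%/6 (inputs in percent)."""
    return 1.5 * swstr_pct + callstr_pct + fstrike_pct / 6.0

# Chien-Ming Wang, 2005, using the rounded rates quoted in the text.
wang_2005 = ball_strike_score(5.5, 14.18, 54.7)
print(f"{wang_2005:.2f}")
```

This comes out to about 31.55, a hair off the 31.53 plotted on the graph; the small gap presumably comes from the rounding of the quoted inputs.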

What made it work for Wang was that, unlike most pitchers, he really wanted guys to put the ball in play. And that's not in the much-parodied Minnesota Twins "But Nick Blackburn pitches to contact!!!!" sense--it's in a very true sabermetric sense. Wang had a 63.9% groundball rate that year, and a meager 14.1% line drive rate. While contact is always a worse outcome than a called strike or a swinging strike because something bad can happen, contact off of Wang that year was about as good an outcome as contact can be. His .265 BABIP was well-deserved given the low line-drive rate--if you think back to the old "xBABIP = LD% + .12" formula, which gives .141 + .12 = .261, it fits almost perfectly.

And because Wang almost never missed bats, was rarely wild enough to walk guys, and was predictable enough that batters weren't taking a whole lot of pitches in the zone (74% Z-Swing; average that year was 68%), most plate appearances against him ended in contact. And contact was a good outcome for him--not good enough to make him dominant, but enough to get him to league-average in spite of poor performance in the areas studied in this piece.

That goes to show you that there's always more to pitching than can be defined by any set of components. It'll be interesting to watch how Wang's 2005 and other borderline-outlier seasons behave as I add more variables over the course of this series.