Stop using K/BB!

Strikeout-to-walk ratio is one of the more commonly used sabermetric (and traditional) measures for judging pitcher and hitter performance. In my honest opinion, it is a statistic to stay away from. And I'll tell you why.

Kirby Lee-US PRESSWIRE

When evaluating pitchers, some people use ERA, some use wins, others use defense-independent statistics like FIP. Scouts use stuff (velocity and movement), ultra-nerds use PITCHf/x -- but for the most part, I stick to simply looking at strikeouts and walks.

I've written a great deal about the value of just using strikeouts and walks before (you can read more here), but today I'd like to educate our readers on a way Ks and walks can be used correctly.

I subscribe to the philosophy that most pitching metrics should move away from using innings pitched in the denominator, and instead use plate appearances (or batters faced). For instance, my metric, predictive FIP (pFIP), uses plate appearances instead of innings pitched.

In October, on this very site, James Gentile wrote a great piece on why K% (K/PA) and BB% (BB/PA) were more useful statistics than their per-inning counterparts (K/9 and BB/9).

Here's a solid quote from James, as to why this philosophy makes a lot of sense:

What happens when we use the 'per 9' metrics is that we lose accuracy, because our measurements have become subject to the tyrannical forces of BAbip. As a pitcher allows more Hits per Ball in Play, he becomes less efficient. He ends up facing more batters and getting fewer outs, which consequently means fewer innings. But if he's still striking out batters at the same rate (say 20%) all the while, his K/9 is going to look a lot shinier with those fewer Innings.

I agree with James on pretty much every single point in that piece. I think K% and BB% are the way to go, but I think we should take James's ideas a step further.

There's a commonly cited statistic that attempts to quantify a pitcher's ability to control the strike zone or his ability to strike out batters in comparison to the number of batters he walks.

That statistic is strikeout-to-walk ratio, more commonly referred to by its abbreviation, K/BB.

The metric is elegant, in that all it considers are strikeouts and walks (which is a good thing in my book), and that it simply takes the number of batters a pitcher strikes out and divides that number by the number of batters he walks.

K/BB is featured prominently at sites like FanGraphs and Baseball-Reference.

I used to use K/BB religiously in my evaluation of pitchers, as well as my writing. However, I no longer use the metric.

I'll tell you why.

My primary issue with the statistic is that walks are in the denominator. Strikeouts are at least as important as walks, but with walks in the denominator, a pitcher who is very good at not walking batters, yet not a great strikeout pitcher, will still end up with a high K/BB. It's simple math: the smaller the denominator, the larger the ratio.

I prefer to stick with plate appearances as the denominator. Strikeouts minus walks, then divided by plate appearances -- (K-BB)/PA -- makes for a much more valuable statistic, in my opinion.

For those who feel like that statistic is too much work when K/BB is already calculated right there for use, think of (K-BB)/PA in a different sense.

(K-BB)/PA is the same thing as K% minus BB% (both statistics that are readily available). It's simply the percentage of batters that a pitcher strikes out minus the percentage of batters that same pitcher walks.
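The identity is easy to check numerically. Here is a minimal Python sketch; the function names are illustrative choices rather than anything from a stats library, and the sample line (200 strikeouts, 50 walks, 749 batters faced) is borrowed from Player A in the first hypothetical below.

```python
def k_per_bb(k, bb):
    """Strikeout-to-walk ratio: strikeouts divided by walks."""
    return k / bb


def k_pct(k, pa):
    """Share of plate appearances that end in a strikeout."""
    return k / pa


def bb_pct(bb, pa):
    """Share of plate appearances that end in a walk."""
    return bb / pa


def k_minus_bb_pct(k, bb, pa):
    """(K - BB) / PA, computed directly from the counting stats."""
    return (k - bb) / pa


# Sample line taken from Player A in the first hypothetical.
k, bb, pa = 200, 50, 749

direct = k_minus_bb_pct(k, bb, pa)
from_rates = k_pct(k, pa) - bb_pct(bb, pa)

print(f"K/BB = {k_per_bb(k, bb):.2f}")
print(f"(K-BB)/PA = {direct:.1%}, K% - BB% = {from_rates:.1%}")
```

Both routes give the same number (up to floating-point rounding), so if K% and BB% are already on the page, a subtraction is all that's needed.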

I'll use three hypothetical comparisons to demonstrate the value of using K% - BB% instead of K/BB.

For reference, here are the 2012 starting pitcher averages for each of the metrics that I'll focus on:

  • K/BB: 2.51
  • K%: 18.7%
  • BB%: 7.4%
  • K%-BB%: 11.3%

Hypothetical #1:

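| Player | IP | BF | K | BB | K/BB | K% | BB% | K%-BB% |
|---|---|---|---|---|---|---|---|---|
| Player A | 180 | 749 | 200 | 50 | 4.00 | 26.7% | 6.7% | 20.0% |
| Player B | 180 | 749 | 100 | 25 | 4.00 | 13.4% | 3.3% | 10.0% |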

Both players threw the same number of innings and had the same above-average K/BB. Yet, Player A has a well above average K%-BB%, while Player B's is below-average.

If we assume that each pitcher gave up the same number of home runs -- for instance, 21 -- then Player A's FIP would be 3.22, seventy points better than Player B's 3.92.
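For readers who want to check those numbers, here is a rough sketch of the FIP arithmetic. It uses the standard form of FIP, (13*HR + 3*BB - 2*K)/IP plus a league constant, with hit batters left out because the hypotheticals don't include them. The constant of 3.095 is an assumption, chosen because it reproduces the FIPs quoted in this article; the function name is illustrative.

```python
# Assumed league constant; it reproduces the FIP values quoted in the article.
FIP_CONSTANT = 3.095


def fip(hr, bb, k, ip, constant=FIP_CONSTANT):
    """Fielding Independent Pitching, ignoring hit batters (assumed to be zero)."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


# Both hypothetical pitchers: 180 IP and an assumed 21 home runs allowed.
player_a = fip(hr=21, bb=50, k=200, ip=180)
player_b = fip(hr=21, bb=25, k=100, ip=180)

print(f"Player A FIP: {player_a:.2f}")
print(f"Player B FIP: {player_b:.2f}")
```

Same innings, same home runs, same K/BB, yet the extra strikeouts are worth roughly seven-tenths of a run per nine innings.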

I think it's pretty clear that Player A was the more effective/valuable pitcher. Yet, based on their innings pitched and K/BBs, the pitchers look even. This discrepancy occurs because of Player B's very low walk-rate that inflates his K/BB.

Hypothetical #2:

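| Player | IP | BF | K | BB | K/BB | K% | BB% | K%-BB% |
|---|---|---|---|---|---|---|---|---|
| Player X | 160 | 656 | 150 | 60 | 2.50 | 22.9% | 9.1% | 13.7% |
| Player Y | 160 | 656 | 100 | 29 | 3.50 | 15.2% | 4.4% | 10.9% |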

Again, both pitchers have thrown the same number of innings. However, in this comparison Player Y has an above-average K/BB, while Player X's K/BB is about average.

When K/BB is used, one would conclude that Player Y was the more effective pitcher, at least in terms of strikeouts and walks.

Yet when we consider K%-BB%, we see that Player X was actually the above-average pitcher in terms of those statistics. This discrepancy, like with the first comparison, comes from the fact that Player Y has a very low number of walks.

As I did in the first comparison, I assumed that each pitcher gave up 21 home runs, and then calculated each pitcher's FIP. Under this assumption, Player X would have a slightly better FIP (4.05 vs. Player Y's 4.09).

Hypothetical #3:

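| Player | IP | BF | K | BB | K/BB | K% | BB% | K%-BB% |
|---|---|---|---|---|---|---|---|---|
| Player Q | 175 | 718 | 175 | 92 | 1.90 | 24.4% | 12.8% | 11.6% |
| Player R | 175 | 718 | 75 | 30 | 2.50 | 10.5% | 4.2% | 6.3% |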

To avoid being redundant, I'll be brief with this comparison. Player R had an average K/BB, while Player Q's was below average. However, Player Q had a much better K%-BB%, and given the 21-home-run assumption that we've been working with, Player Q's FIP (4.23) would also best Player R's (4.31).

In order to continue expressing my point, I'd like to step away from hypothetical situations, and look at what actually happened in 2012:

The 2012 Season:

I'll start with a blind comparison of two 2012 starters:

| Player | IP | K/BB | K% | BB% | K%-BB% |
|---|---|---|---|---|---|
| Player 1 | 191.1 | 2.48 | 27.1% | 10.9% | 16.2% |
| Player 2 | 191 | 4.88 | 20.6% | 4.2% | 16.4% |

The pitchers threw roughly the same number of innings, but Player 2 had an extremely good K/BB, while Player 1's was just average. Their K%-BB% figures are almost equivalent, though.

Thus, despite what one might have assumed from K/BB alone, there's a chance that Player 1 was the more effective pitcher.

Who was Player 1? Texas Rangers starter Yu Darvish.

Player 2? Phillies / Dodgers starter Joe Blanton.

I'll leave the comparison at that.

I've also listed the top-10 qualified starters in K/BB, side-by-side with the top-10 starters in terms of K%-BB%:

| Rank | Pitcher | K/BB | Pitcher | K%-BB% |
|---|---|---|---|---|
| 1 | Cliff Lee | 7.39 | Max Scherzer | 21.8% |
| 2 | Joe Blanton | 4.88 | Cliff Lee | 21.1% |
| 3 | CC Sabathia | 4.48 | R.A. Dickey | 19.0% |
| 4 | R.A. Dickey | 4.26 | Cole Hamels | 18.9% |
| 5 | Cole Hamels | 4.15 | Justin Verlander | 18.7% |
| 6 | Felix Hernandez | 3.98 | CC Sabathia | 18.4% |
| 7 | Justin Verlander | 3.98 | Clayton Kershaw | 18.4% |
| 8 | Jake Peavy | 3.96 | Chris Sale | 18.3% |
| 9 | Madison Bumgarner | 3.90 | Felix Hernandez | 17.8% |
| 10 | Wade Miley | 3.89 | James Shields | 17.5% |

The two leaderboards are nearly identical, except for four pitchers on each.

K/BB includes Blanton, Peavy, Bumgarner and Miley, while K%-BB% includes Scherzer, Kershaw, Sale and Shields. I'm only looking at the top-10 in each stat for a single season, but I'll let the readers decide which list they prefer.

Predictive Value

I ran a few linear regressions to test the predictive value of K%-BB% and K/BB.

First, I gathered a sample of starting pitchers from 2004-12 who had at least 120 IP in the predictor year and at least 100 IP in the outcome year. Then I ran two linear regressions, one with K%-BB% as the predictor and one with K/BB, each with RA9 in the outcome year as the dependent variable.

Below, I listed the r-squared for each test:

| Predictor | r^2 |
|---|---|
| K%-BB% | 0.1792 |
| K/BB | 0.0987 |

The r-squared tells us that K%-BB% explained 17.92 percent of the variation in future RA9, while K/BB explained less than 10 percent of the variation.
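For anyone who wants to run this kind of test on their own data, here is a minimal sketch of how r-squared falls out of a one-predictor linear regression. The function name and the toy numbers are purely illustrative (they are not pitcher data), and this is not the exact code behind the table above. It relies on the fact that, with a single predictor, r-squared is simply the squared correlation between predictor and outcome.

```python
def r_squared(x, y):
    """r-squared of a simple (one-predictor) linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Squared Pearson correlation equals r-squared for one predictor.
    return sxy ** 2 / (sxx * syy)


# Toy numbers only, chosen so the arithmetic is easy to follow by hand.
toy_predictor = [1, 2, 3, 4]
toy_outcome = [1, 3, 2, 4]

print(round(r_squared(toy_predictor, toy_outcome), 2))  # 0.64
```

In practice, x would be each pitcher's K%-BB% (or K/BB) in the predictor year and y would be his RA9 in the outcome year.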

I tested a similar sample, but for the years 1996-2004, to see if K%-BB% would continue to beat K/BB, in terms of predictive value, and found these results:

| Predictor | r^2 |
|---|---|
| K%-BB% | 0.1954 |
| K/BB | 0.1496 |

K%-BB% did not have as large an advantage in this sample, but it still beat K/BB by a clear margin.

K%-BB% offers a great improvement, in terms of predicting RA9, over K/BB.

This, in my opinion, lends support to the conclusion that subtracting walks from strikeouts and using plate appearances in the denominator is a better reflection of true talent level than dividing strikeouts by walks.

Conclusion

Incorporating hit-by-pitches and intentional walks will also improve this simple measure. Tom Tango has shown this with his strikeouts-and-walks-based ERA estimator (kwERA).

This ERA estimator, which subtracts a modified version of walks from strikeouts and then divides that number by plate appearances, stacks up against any of the widely accepted ERA estimators.

Even if IBBs and HBPs are ignored, I would urge readers to attempt to step away from using K/BB, and move towards always using plate appearances in the denominator.

This will give you a better idea of actual effectiveness and skill when evaluating pitchers.

All statistics come from our friends at FanGraphs.

You can follow Glenn on Twitter @Glenn_DuPaul