Introduction
Catcher defense (or just defense in general) is the next frontier for sabermetrics and many sites have their own metrics to analyze it. I enjoy former Beyond the Box Score contributor Matt Klaassen's catcher defense ratings, which take caught stealing, wild pitches, and errors into account. Catcher defense is a fairly complex problem with variables ranging from framing to pitch-blocking to receiving, but another bullet on the catcher's job description is to throw out opposing baserunners trying to steal a base. We can look at caught stealing numbers, which is all fine and dandy, but what if we have a catcher who is really, really good at it? Won't he have fewer opportunities to throw out runners because they're afraid of him catching them? Let's call this hypothetical catcher Schmadier Schmolina. Word gets around the league that he has an L115A3 AWM sniper rifle instead of a right arm and runners just stop trying to take the next base. What effect does this have on Schmolina's value and how do we measure it?
Method
Baseball-Reference has mounds of information, more than any one person could analyze in a lifetime. I found one interesting speck of data: Player Advanced Fielding -- Catcher Baserunning. This gives the stolen base opportunities ("Plate appearances through which a runner was on first or second with the next base open"), stolen bases, and caught stealing for every catcher in each year. I compiled all of this data from 1945 through May 21, 2013 and...boom! A catcher stolen base database.
As we all know, catchers are not the only part of the running game. Team philosophies (both offense and defense), runner speed, pitcher handedness and ability, count, and score are some of the other variables that affect whether or not a runner takes off for second (or third or home). These numbers will not reflect that, but I would love to create a complete algorithm that puts all of those variables into place. The effect of these missing variables is best seen with Rod Barajas's magically intimidating 2010 season. Barajas was never known as a particularly strong-armed catcher, with a career average arm accuracy of 36 out of 100 on the Fans Scouting Report (FSR). According to the Intimidation results, he actually shut down the running game a bit better than that during his career, but one year stands out: 2010, when his Intimidation was more than twice the league average. This is not solely because runners were afraid of his arm; it probably has something to do with a New York Mets pitching staff that included lefties Johan Santana, Jonathon Niese, and Hisanori Takahashi. I also believe pitching coach Dan Warthen made this a focus of the team, as the Mets got markedly better at suppressing stolen base attempts once he joined the team in 2008.
That example gave away my method: take stolen base attempts against a catcher, divide them by total opportunities, and compare that rate to the league average for the year, where 100 is league average and a higher number means more runners stayed put. I have named this statistic "Catcher Intimidation."
At this point, it would be wonderful to do a With-or-Without-You method to help get rid of some of those earlier variables. This may be possible with my data set and I will try to make it work in the future.
I took an idea from Klaassen and used stolen bases plus catcher caught stealing as the total number of stolen base attempts. This removes pickoffs, which are almost completely a result of the pitcher's talent. Then again, maybe a runner takes a longer lead with a worse catcher behind the plate, making him more likely to get picked off. Nevertheless, no pickoffs.
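The calculation described above can be sketched in a few lines of Python. This is a minimal sketch, not the code behind the tables: the function and argument names are made up for illustration, and the direction of the ratio (league rate divided by the catcher's rate) is an assumption inferred from the description that 100 is league average and higher numbers mean more runners stayed put.

```python
def attempt_rate(sb, cs, sbo):
    """Stolen base attempts per opportunity.

    Attempts are stolen bases plus catcher caught stealing, so
    pickoffs are left out. All argument names are hypothetical.
    """
    if sbo <= 0:
        raise ValueError("sbo must be positive")
    return (sb + cs) / sbo


def intimidation(sb, cs, sbo, lg_rate):
    """Index the catcher's attempt rate against the league rate.

    Assumed form: 100 * league rate / catcher rate, so 100 is league
    average and a higher number means fewer runners took off.
    """
    rate = attempt_rate(sb, cs, sbo)
    if rate == 0:
        # No attempts at all: the index is unbounded.
        return float("inf")
    return 100.0 * lg_rate / rate
```

With invented numbers, a catcher who sees 30 stolen bases and 20 caught stealing in 1,000 opportunities, in a league where runners go on 7% of opportunities, works out to an index of 140 under this assumed form.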
The method is simple and the results...are simply riveting.
Results
Using a minimum of 300 stolen base opportunities (SBOs) and ranking by career Intimidation, at the top with a score of 193, we find St. Louis Cardinals backstop...Mike Mahoney. Let's try that again, but put the minimum at 1,000 and remove obvious back-up types. I found that catchers like Omir Santos, Joel Skinner, and Sandy Martinez heavily influenced this list. They succeeded in crossing the 1,000 SBO threshold, but rarely, if ever, had to face the prospect of playing every day. Perhaps they really did shut down the running game and this is the reason they were allowed to stay in the majors for a while.
Catcher | Avg Year | SBO | Intimidation |
Yadier Molina | 2009 | 14228 | 168 |
Johnny Bench | 1975 | 22499 | 142 |
Ivan Rodriguez | 2002 | 33735 | 139 |
Salvador Perez | 2012 | 2034 | 137 |
Ron Karkovice | 1992 | 11431 | 136 |
Joe Mauer | 2009 | 11474 | 134 |
Matt Wieters | 2011 | 7170 | 132 |
Charles Johnson | 1999 | 16272 | 129 |
John Buck | 2009 | 13213 | 128 |
Dan Wilson | 1999 | 16951 | 128 |
Ed Bailey | 1960 | 13917 | 125 |
Mike Macfarlane | 1993 | 13299 | 124 |
Elston Howard | 1962 | 14328 | 124 |
Dave Rader | 1976 | 9530 | 124 |
Bob Boone | 1981 | 28789 | 123 |
Brad Ausmus | 2001 | 25106 | 121 |
Wilin Rosario | 2012 | 2175 | 121 |
Roy Campanella | 1953 | 15932 | 121 |
Miguel Montero | 2010 | 8138 | 120 |
Jim Sundberg | 1982 | 25070 | 120 |
Buddy Rosar | 1948 | 4151 | 120 |
Mike Lieberthal | 2001 | 15620 | 120 |
Thurman Munson | 1974 | 17015 | 120 |
Kenji Johjima | 2008 | 6101 | 120 |
Butch Wynegar | 1982 | 16945 | 119 |
There are a lot of active catchers on that list, perhaps because they have yet to reach their decline. Molina jumps way up to the top, putting plenty of distance between himself and Bench and Rodriguez. Perez, Mauer, Wieters, Buck, Rosario, and Montero all still have time to prove themselves worthy of this ranking.
Here are the least intimidating catchers, those whom runners are not afraid to run on, again with a 1,000 SBO minimum and hopefully removing most backup catchers:
Catcher | Avg Year | SBO | Intimidation |
Paul Lo Duca | 2003 | 11969 | 73 |
Craig Biggio | 1993 | 5513 | 73 |
Jason Varitek | 2005 | 18916 | 75 |
Mike Piazza | 1999 | 20740 | 77 |
Jorge Posada | 2003 | 20696 | 79 |
Bruce Benedict | 1984 | 12171 | 80 |
Victor Martinez | 2007 | 11227 | 80 |
Alan Ashby | 1981 | 16249 | 81 |
Dave Duncan | 1971 | 10999 | 81 |
Ray Lamanno | 1947 | 1940 | 83 |
Darrin Fletcher | 1996 | 14071 | 84 |
Nick Hundley | 2011 | 4946 | 86 |
John Stearns | 1979 | 9115 | 87 |
Terry Kennedy | 1985 | 17273 | 88 |
Brandon Inge | 2004 | 5220 | 88 |
Buster Posey | 2011 | 3428 | 88 |
Bengie Molina | 2004 | 17021 | 89 |
Jody Davis | 1986 | 13546 | 89 |
Jason Kendall | 2003 | 28673 | 89 |
Ed Taubensee | 1996 | 10591 | 91 |
Geovany Soto | 2010 | 7826 | 91 |
Mike Napoli | 2009 | 6759 | 91 |
Brian McCann | 2009 | 12479 | 92 |
Gary Carter | 1983 | 25940 | 92 |
Todd Hundley | 1997 | 13487 | 92 |
Wes Westrum | 1953 | 8781 | 92 |
A lot of these players spent time at other positions, mostly because their bats were good but their gloves weren't. Nick Hundley ranks well on the Fans Scouting Report with a 61 for arm strength and a 51 for accuracy, but ranks historically low in catcher intimidation, though this may be part of the Padres' philosophy. It is difficult to separate the two, since Hundley has been the most-used catcher over the years he has been with the team. However, in 2010, when Yorvit Torrealba wore the tools of ignorance more often than Hundley, the Padres did perform better as a team at preventing stolen base attempts.
The Fans Scouting Report is a good measure to use in this instance because it relies on individuals forming a reputation for a certain player and ranking him relative to his peers. Let's compare release, arm strength, arm accuracy, and overall rating to catcher intimidation factor and see what we find.
FSR is only available back to 2009, so the sample size is pretty small. Once we remove catchers with fewer than 130 SBOs, there are only 317 points of data to use. After running the multiple regression, I found that the only significant variable was catcher arm accuracy, but the correlation coefficient was only 0.06. Not much there.
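For readers who want to try a comparison like this themselves, here is a rough sketch. It swaps in a plain Pearson correlation between one FSR rating and Intimidation rather than the multiple regression described above, so it is a simplification and will not reproduce the reported coefficient. The record layout (`sbo`, `arm_accuracy`, `intimidation`) and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def accuracy_vs_intimidation(seasons, min_sbo=130):
    """Correlate arm accuracy with Intimidation for qualifying seasons.

    `seasons` is a list of dicts with hypothetical keys 'sbo',
    'arm_accuracy', and 'intimidation'. Seasons with fewer than
    `min_sbo` opportunities are dropped first.
    """
    kept = [s for s in seasons if s["sbo"] >= min_sbo]
    xs = [s["arm_accuracy"] for s in kept]
    ys = [s["intimidation"] for s in kept]
    return pearson_r(xs, ys)
```

The 130 SBO cutoff is the one quoted above; everything else about the data layout is assumed.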
Perhaps a counting statistic will help here. Intimidation is a rate statistic, like OPS+, meaning a catcher with 5 stolen base opportunities and 0 stolen base attempts will have an infinite Intimidation, solely because he just wasn't on the field that much. Setting minimums helps to a point, but how can we weight players with more playing time? By calculating the total number of stolen base attempts missing against a catcher. This is found as Stolen Base Attempts minus the product of Stolen Base Opportunities and the league average Stolen Base Attempt Rate (SBA - SBO*lgSBAR). A negative number means attempts went missing; a positive number means runners ran more than expected.
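The SBA - SBO*lgSBAR formula translates directly into code. The snippet below is only a sketch with hypothetical names, and the numbers in the note after it are invented to show how the counting version rewards playing time.

```python
def extra_attempts(sba, sbo, lg_rate):
    """Stolen base attempts above (+) or below (-) league expectation.

    sba:     stolen base attempts against the catcher (SB + catcher CS)
    sbo:     stolen base opportunities with the catcher behind the plate
    lg_rate: league average attempts per opportunity (lgSBAR)
    """
    if sbo < 0 or sba < 0:
        raise ValueError("sba and sbo must be non-negative")
    expected = sbo * lg_rate
    return sba - expected
```

Two catchers with the same 5% attempt rate in a 7% league would have the same Intimidation, but one with 1,000 opportunities is credited with 20 missing attempts while one with 10,000 opportunities gets 200.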
This is where we make the leap from an abstract idea like "intimidation" to a concrete effect -- missing stolen bases. Losing things is something that everyone can connect with. As George Carlin once said, "I don't like to lose anything, because--where is it?"
Let's look at the catchers with the most stolen bases in the big pile of missing stuff:
Catcher | Avg Year | SBO | Extra SBA | Intimidation |
Ivan Rodriguez | 2002 | 33735 | -697 | 139 |
Johnny Bench | 1975 | 22499 | -362 | 142 |
Bob Boone | 1981 | 28789 | -352 | 123 |
Yadier Molina | 2009 | 14228 | -316 | 168 |
Benito Santiago | 1996 | 25144 | -278 | 115 |
Jim Sundberg | 1982 | 25070 | -266 | 120 |
Rick Dempsey | 1980 | 19181 | -243 | 125 |
Carlton Fisk | 1981 | 29111 | -226 | 116 |
Rick Cerone | 1984 | 16294 | -198 | 114 |
Butch Wynegar | 1982 | 16945 | -198 | 119 |
Mike Heath | 1985 | 13881 | -191 | 117 |
Brad Ausmus | 2001 | 25106 | -190 | 121 |
Dan Wilson | 1999 | 16951 | -188 | 128 |
Lance Parrish | 1986 | 23897 | -188 | 116 |
Mike Macfarlane | 1993 | 13299 | -177 | 124 |
Ron Karkovice | 1992 | 11431 | -174 | 136 |
Terry Steinbach | 1993 | 18460 | -169 | 115 |
Charles Johnson | 1999 | 16272 | -162 | 129 |
Ernie Whitt | 1984 | 14870 | -156 | 110 |
John Roseboro | 1964 | 18556 | -153 | 112 |
John Buck | 2009 | 13213 | -150 | 128 |
Thurman Munson | 1974 | 17015 | -150 | 120 |
Buck Martinez | 1978 | 11611 | -147 | 125 |
Joe Mauer | 2009 | 11474 | -145 | 134 |
Jason LaRue | 2005 | 11264 | -138 | 120 |
Rodriguez has nearly twice as many missing attempts as anyone else, but he also easily has the most opportunities. There are a lot of familiar faces here, but fewer active catchers. Bench, Roseboro, Munson, and Martinez are the oldest catchers on here and none come before that time period. This is because fewer games were played in the 1940s than in today's league, meaning catchers had fewer opportunities to rack up missing stolen bases.
And the catchers with the most extra stolen base attempts:
Catcher | Avg Year | SBO | Extra SBA | Intimidation |
Mike Piazza | 1999 | 20740 | 458 | 77 |
Alan Ashby | 1981 | 16249 | 274 | 81 |
Mike Fitzgerald | 1988 | 8563 | 261 | 73 |
Gary Carter | 1983 | 25940 | 258 | 92 |
Jason Varitek | 2005 | 18916 | 205 | 75 |
Paul Lo Duca | 2003 | 11969 | 185 | 73 |
Bruce Benedict | 1984 | 12171 | 183 | 80 |
Ozzie Virgil | 1978 | 8967 | 175 | 78 |
Darrin Fletcher | 1996 | 14071 | 169 | 84 |
Jody Davis | 1986 | 13546 | 158 | 89 |
Jorge Posada | 2003 | 20696 | 155 | 79 |
Victor Martinez | 2007 | 11227 | 152 | 80 |
Scott Hatteberg | 1999 | 4717 | 150 | 68 |
Doug Mirabelli | 2002 | 5819 | 142 | 70 |
Ted Simmons | 1978 | 23788 | 138 | 102 |
Biff Pocoroba | 1979 | 5080 | 128 | 72 |
Craig Biggio | 1993 | 5513 | 128 | 73 |
Todd Hundley | 1997 | 13487 | 126 | 92 |
Tim Blackwell | 1978 | 4826 | 118 | 76 |
Milt May | 1978 | 13608 | 116 | 94 |
Jeff Reed | 1992 | 12301 | 113 | 97 |
Bob Brenly | 1985 | 8208 | 110 | 81 |
Mackey Sasser | 1991 | 2602 | 109 | 63 |
Terry Kennedy | 1985 | 17273 | 106 | 88 |
Mike LaValliere | 1990 | 9717 | 97 | 89 |
Piazza sticks out on the bad side almost as much as Rodriguez does on the good side. He is the Darth Vader to Rodriguez's Luke Skywalker (except Pudge is only three years younger than Piazza, but shh). However, if I were running a team and my catcher could play passable defense while also hitting over 400 home runs with a .390 career wOBA, I would keep putting him out there too, even if it meant the opposing team would run more often (which might not even be a bad thing!). If you only knew the power of the dark side.
So what happens when we try to correlate extra stolen base attempts with Fans Scouting Report data? Again, the only significant variable is arm accuracy, but this time the results are better, with a correlation coefficient of 0.19.
The extremes are well taken care of (low-ranked catchers have extra stolen base attempts or none at all due to no playing time, and high-ranked catchers have far fewer stolen base attempts), but rankings anywhere between 30 and 80 are open to variability. For instance, in 2012, Buster Posey had an arm accuracy of 81, but still had 30 extra stolen base attempts, just like Jorge Posada did in 2009 with an accuracy of 45.
Also in 2009, Jason Kendall had an arm accuracy rating of only 33, but had 34 fewer stolen base attempts than expected. Yadier Molina had 33 fewer attempts than expected in 2010 with a 100 rating.
There is a lot more discussion to be had from this data set like career arcs, league averages, and team philosophies, but I'll cut myself off here and will pick it up again in a future post.
Summary
To get back to the original question, how much value should we give or take away from a catcher because of the reputation of his arm? I don't have an exact answer for that and am open to suggestions, but it can be up to a 60-attempt swing per season and it does appear to be a repeatable talent, especially for extremely good or bad catchers. It's important to note that an intimidating reputation may not be a positive thing, as stated by Tom Tango:
"Unfortunately, since most teams attempt to steal too often, they’d likely do better by never stealing at all. Hence, a catcher that is so good that he never gets a runner to steal is not being helpful."
Final question: when finding the amount of value added or subtracted from the catcher's reputation, what caught stealing rate would one use? If a runner doesn't run, it's probably because he is more likely to get caught, so shouldn't we assume a higher caught stealing rate for missing stolen base attempts? On the flip side, if runners steal more often, it's probably because they're more likely to be successful, so the caught stealing rate should be lower on extra stolen base attempts.
In lieu of a perfect caught stealing rate assumption, I may run a series of assumptions and analyze the differences between the results.
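One way such a series of assumptions could be set up is sketched below. Nothing here comes from the data set: the run values for a stolen base (+0.2) and a caught stealing (-0.45) are placeholder defaults chosen purely for illustration, and the function names are hypothetical. Swap in whatever run values and caught stealing rates you prefer.

```python
def runs_from_attempts(attempts, cs_rate, sb_run=0.2, cs_run=-0.45):
    """Expected runs for the offense from a number of steal attempts.

    cs_rate is the assumed caught stealing rate on those attempts.
    sb_run and cs_run are PLACEHOLDER run values, not measured ones.
    """
    if not 0.0 <= cs_rate <= 1.0:
        raise ValueError("cs_rate must be between 0 and 1")
    return attempts * ((1.0 - cs_rate) * sb_run + cs_rate * cs_run)


def break_even_cs_rate(sb_run=0.2, cs_run=-0.45):
    """Caught stealing rate at which an attempt is worth zero runs."""
    return sb_run / (sb_run - cs_run)


def sweep(attempts, cs_rates, sb_run=0.2, cs_run=-0.45):
    """Evaluate the same attempt total under several assumed CS rates."""
    return [(r, runs_from_attempts(attempts, r, sb_run, cs_run))
            for r in cs_rates]
```

Under these placeholder values, 30 missing attempts at an assumed 25% caught stealing rate cost the offense about 1.1 runs, so the catcher helped. At an assumed 40% rate those same 30 attempts would have lost the offense about 1.8 runs, so scaring runners off actually hurt the defense, which is Tango's point above.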