Stat-Colored Glasses
A Nibble Here, A Nibble There
We've already looked at the impact different factors have on whether a ball is called a strike or not. MGL suggested that I try and control for the fact that certain pitchers are much more likely to hover around the edge of the plate rather than down the middle or "just a bit outside"</Bob Uecker>.
Generally, any time MGL makes a suggestion, it's a good idea to at least consider it. So I went ahead and re-ran all the same splits except the one based on how a pitcher started a game, using the same approach as in my previous articles. The change is that I only looked at pitches within two ball widths (just under 6 inches) of the edge of the strike zone (either inside or outside the zone).
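For anyone who wants to try a similar filter, here is a minimal sketch of the "within two ball widths of the edge" test. It is not the code used for the study. It assumes pitch locations measured in inches from the center of the plate, a ball roughly 2.9 inches wide, a 17-inch plate, and a per-batter zone top and bottom; the function names are made up for illustration.

```python
import math

BALL_WIDTH_IN = 2.9                 # assumed ball diameter in inches
EDGE_BAND_IN = 2 * BALL_WIDTH_IN    # "two ball widths", just under 6 inches
HALF_PLATE_IN = 17 / 2              # home plate is 17 inches wide


def distance_to_zone_edge(x, z, zone_bottom, zone_top):
    """Distance in inches from a pitch at (x, z) to the nearest zone edge."""
    dx = abs(x) - HALF_PLATE_IN               # positive when off the plate
    dz = max(zone_bottom - z, z - zone_top)   # positive when too low or too high
    if dx <= 0 and dz <= 0:
        # Inside the zone: distance to the closest of the four sides.
        return min(-dx, -dz)
    # Outside the zone: straight-line distance to the rectangle.
    return math.hypot(max(dx, 0.0), max(dz, 0.0))


def is_borderline(x, z, zone_bottom, zone_top):
    """True when the pitch is within two ball widths of the zone edge."""
    return distance_to_zone_edge(x, z, zone_bottom, zone_top) <= EDGE_BAND_IN
```

With a zone running from 18 to 42 inches off the ground, a pitch 10 inches off center at belt height counts as borderline, while one right down the middle does not.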
A Strike Is a Strike, Right?
After my last post about catcher framing, many people (at least two or three) suggested that the pitching staff had a fair amount to do with the results we were seeing. The poor performance of both the Rangers' catchers (Gerald Laird and Jarrod Saltalamacchia) seemed to lend credence to that stance. I started wondering what factors might influence an umpire to call a pitch a certain way. So I retreated into my stats cave (no, it's not my mother's basement) for a while to see what I could find out. I'll warn you, this is a very long post, so if you want to skip to the summary at the end, feel free, but you'll miss the graphs.
Framing the Debate
Or perhaps, framing, the debate.
The defensive influence of the catcher has long been debated, with sabermetric circles generally valuing it less than baseball insiders. It's possible to identify seven separate components as places where a catcher can contribute defensively: stolen bases, blocking pitches, blocking the plate, fielding bunts (and pseudo-bunts), game calling, pitch framing and pitcher preparation. Many people have made attempts at quantifying different pieces of the overall catcher contribution. Ability to prevent stolen bases and fielding ability are included in a number of measures, such as UZR and Win Shares. Keith Woolner from Baseball Prospectus has examined the topic of game calling a few times and has placed an upper limit of the effect at .60 runs of ERA. I've attempted to measure how well catchers block pitches in the dirt. Blocking the plate is very difficult to measure since it occurs relatively rarely and success can be quite subjective (the runner might be out, but the catcher did nothing to impede him from reaching the plate). Determining a catcher's skill at blocking the plate is probably best left to visual analysis. Pitcher preparation, which I'm using to mean any interaction the catcher has with his staff, from going over game plans to calming the pitcher during visits to the mound, is another area that would be nearly impossible to objectively measure, but almost certainly has some effect on the game. One area that hasn't been explicitly covered (I suspect many would roll it into game calling) is pitch framing.
I've attempted a study to measure the effects of framing pitches. I say attempted because, to be honest, the results seem wrong to me, but I can't figure out where the issue lies. I present the rest of this article in the hopes that someone can take the information and make sense of it. Some may say I shouldn't publish this until I'm more comfortable with the outcome, but I'm a big fan of collaboration and knowledge sharing, so I'm going ahead even though I don't know the answer. Ok, enough honesty, let's move on.
All Playoff Chances Are Still Alive
So your team lost on opening day. Don't worry; it really is just another game. To prove it, here's a look at the last few World Series teams and how they did on opening day.
| Year | Champion | Opening Day | Runner-up | Opening Day |
| 2007 | Red Sox | L (1-7) | Rockies | L (6-8) |
| 2006 | Cardinals | W (13-5) | Tigers | W (3-1) |
| 2005 | White Sox | W (1-0) | Astros | L (3-7) |
| 2004 | Red Sox | L (2-7) | Cardinals | L (6-8) |
| 2003 | Marlins | L (5-8) | Yankees | W (8-4) |
World Series teams have gone 4-6 over the past five opening days. If nothing else, it shows that it's not how you start but how you finish the year, and that a team that lost today could well be lifting the World Series trophy come October.
Joe Maddon is Equal to Mike Scioscia
That's what a recent ranking by the Wall Street Journal suggests, at least. Using categories such as Close Games, Wins Above Expectation, and Player Performance, Maddon finished 14th, ahead of Jim Leyland, Joe Torre, Terry Francona, Eric Wedge, and Clint Hurdle.
The problem is that the system is flawed, and for the same reason that ranking general managers or even scouting teams is flawed: uncontrollable variables. Wins Above Expectation, for instance, compares a team's actual win total to its Pythag Wins total, which is based on runs scored and allowed. Player Performance is based on, you guessed it, the performances of players under different managers - so Joe Maddon gains huge points for Carlos Pena.
Does anyone believe that Ron Gardenhire (the "top" manager based on the metrics) can truly get more out of players than Maddon, or that his God-awful lineups featuring Nick Punto somehow compelled the Twins to more victories than they essentially "deserved"?
I appreciate the effort, but these rankings are too flawed for me to place any credibility in them. The best example I can give is Francona's rank. His (loaded) team won the World Series, but because Julio Lugo's performance was unlucky and poor under Francona, that's somehow an indication of Francona's managerial skills? What about Joe Torre's wins above expectation? Because his teams, which featured the highest payroll in the league - presumably meaning the most talent - overachieved (in theory) by 19 games over the past two seasons, Torre is somehow credited for that?
Again, nice effort, but the right metric still hasn't been applied.
Catcher's Block Percentage: With or Without You
First off, I'd like to thank R.J. for inviting me to join Beyond the Box Score. I hope to be able to contribute some interesting information and insight to a great site. Some topics I plan to examine in the next few months include game calling, pitch framing and clutch performance, as well as commenting on any studies that strike a chord.
For today's post, I'd like to continue my look at catcher's block percentage that I began on my personal blog. For those who haven't read the rest of the series, let me summarize the concept. Basically, I'm using the MLB provided Gameday data from 2005 through 2007 to calculate how each catcher performed in blocking pitches. Wild pitches and passed balls are considered "Misses," while balls in the dirt with runners on base are considered "Opportunities".
To this point, I've only been able to compare catchers to the aggregate league performance. There are a few problems with that approach. The first is data quality. Determining whether a ball was in the dirt is up to the Gameday scorer, and there's no guarantee of consistency. This fact becomes obvious when looking at the differences in opportunities from year to year (Table 1). Unfortunately, there's not much I can do about the data quality beyond acknowledge it may cause some issues.
| Year | Opportunities | Avg. Block % |
| 2005 | 9271 | .84 |
| 2006 | 7375 | .77 |
| 2007 | 11523 | .86 |
The second concern is the impact of the pitching staff. Any given catcher may see easier or harder pitches to handle based on the tendencies of his pitchers. To illustrate this, imagine a staff full of knuckleballers. Whichever catcher is unlucky enough to log innings for that team is going to have a much lower block percentage than average. Of course knuckleballers aren't the only types of pitchers who may cause problems for catchers, they're just the most obvious. To try and isolate the impact of the pitching staff, I attempted a With Or Without You (WOWY) study as outlined by TangoTiger here.
The idea behind this WOWY study is to look at the block percentage for a pitcher throwing to the catcher in question as compared to all other catchers. It's probably best demonstrated with an example. Let's consider Jason Varitek. From 2005 through 2007, Varitek has caught 25 different pitchers. Table 2 is a listing of Misses and Opportunities for the first few pitchers both with and without Varitek catching. From there, we can calculate Varitek's expected Misses and, therefore, how many runs he's saved by blocking pitches.
| Pitcher | With Misses | With Opportunities | Without Misses | Without Opportunities | Expected Misses | Blocks Above Expectation |
| Bronson Arroyo | 6 | 13 | 15 | 55 | 3.5 | -2.5 |
| Josh Beckett | 9 | 76 | 12 | 68 | 13.4 | 4.4 |
| Chad Bradford | 2 | 0 | 0 | 13 | 0 | -2 |
| Matt Clement | 14 | 88 | 0 | 0 | 0 | -14 |
| Lenny DiNardo | 1 | 9 | 3 | 25 | 1.1 | .1 |
| Brendan Donnelly | 1 | 7 | 12 | 24 | 3.5 | 2.5 |
| And so on... | ||||||
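The arithmetic behind Table 2 can be sketched as follows. This is a reconstruction from the numbers in the table, not the original code: expected misses are the pitcher's miss rate with other catchers applied to the opportunities he had with the catcher in question, and blocks above expectation are expected misses minus actual misses. The function names are made up for illustration.

```python
def expected_misses(with_opps, without_misses, without_opps):
    """Misses we'd expect if the catcher matched the pitcher's other catchers."""
    if without_opps == 0:
        # No "without" sample at all; Table 2 shows 0 expected misses here.
        return 0.0
    return without_misses / without_opps * with_opps


def blocks_above_expectation(with_misses, with_opps, without_misses, without_opps):
    """Positive means the catcher allowed fewer misses than expected."""
    return expected_misses(with_opps, without_misses, without_opps) - with_misses


# Josh Beckett row: 9 misses in 76 opportunities with Varitek, 12 in 68 without.
beckett = blocks_above_expectation(9, 76, 12, 68)
print(round(beckett, 1))
```

The zero-opportunity branch is exactly what produces the odd-looking Matt Clement row, where every miss is charged against the catcher.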
Looking at the WOWY numbers for Chad Bradford and Matt Clement, you can see the big issue here. Unlike in Tango's study referenced above, the sample sizes for the pitcher/catcher matchups are, in a lot of cases, too small to be helpful. Varitek allowed 14 misses in 88 opportunities with Clement, but since no other catcher caught Clement in this time frame, it's impossible to determine how well Varitek compares. In fact, the system credits him with 14 fewer blocks than he should have had. This is obviously not the right answer, considering how Clement traditionally is among the league leaders in wild pitches - a pattern which has held across many teams and catchers.
So what do we do to counter this? The right answer is probably to wait for more data. But that's a bit of a disappointment, so I decided to take a different approach. I threw out the observations with fewer than 20 opportunities on either the with or without side - basically assuming that a catcher performed as expected in those cases. In Varitek's case, that means eliminating all of the entries from the above table except for Josh Beckett, and retaining only 3 of his 25 matchups overall. Why 20? It was just an arbitrary number that seemed to balance two competing sets of sample size concerns - the individual matchups, and the total number of matchups for a given catcher. Someone with more statistical aptitude than I could probably help me with a mathematically proper cutoff, but for now I went ahead with 20.
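As a quick illustration of that cutoff, here is a small sketch that applies it to the six Varitek rows shown in Table 2. The tuple layout and variable names are assumptions made for this example; only the numbers come from the table.

```python
MIN_OPPS = 20  # the arbitrary cutoff described above

# (pitcher, with_misses, with_opps, without_misses, without_opps) from Table 2
matchups = [
    ("Bronson Arroyo", 6, 13, 15, 55),
    ("Josh Beckett", 9, 76, 12, 68),
    ("Chad Bradford", 2, 0, 0, 13),
    ("Matt Clement", 14, 88, 0, 0),
    ("Lenny DiNardo", 1, 9, 3, 25),
    ("Brendan Donnelly", 1, 7, 12, 24),
]

# Keep a matchup only when both the "with" and "without" samples are big enough.
kept = [
    name
    for name, _, with_opps, _, without_opps in matchups
    if with_opps >= MIN_OPPS and without_opps >= MIN_OPPS
]
print(kept)  # ['Josh Beckett']
```

Changing `MIN_OPPS` is an easy way to see how sensitive the surviving sample is to the choice of cutoff.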
For those matchups left in the sample, I summed the Blocks Above Expectation for each catcher. I then converted that into runs by multiplying by .27 runs per miss, which is the linear weights value for a miss. Finally, I scaled the opportunities to (roughly) 120 games or 238 opportunities (based on the number of opportunities per inning across all three seasons) to allow for easier comparison. Table 3 shows the results for those catchers with 100 or more WOWY opportunities from the 2005-2007 seasons. Keep in mind that the numbers aren't anywhere near as precise as the decimals make them appear to be - that's just how the math turned out.
| Catcher | WOWY Opps | Blocks | Runs | Runs/120g |
| Mike Lieberthal | 200 | 22.5 | 6.0 | 7.2 |
| Gregg Zaun | 385 | 40.6 | 11.0 | 6.8 |
| Dioner Navarro | 193 | 19.7 | 5.3 | 6.6 |
| Mike Matheny | 188 | 18.2 | 4.9 | 6.2 |
| Mike Redmond | 157 | 14.4 | 3.9 | 5.9 |
| Ronny Paulino | 347 | 27.3 | 7.3 | 5.1 |
| Jason LaRue | 223 | 16.7 | 4.5 | 4.8 |
| Gerald Laird | 259 | 17.0 | 4.6 | 4.2 |
| Brad Ausmus | 379 | 23.2 | 6.3 | 3.9 |
| Jason Varitek | 145 | 6.5 | 1.7 | 2.9 |
| Jeff Mathis | 137 | 5.6 | 1.5 | 2.6 |
| Johnny Estrada | 338 | 13.3 | 3.6 | 2.5 |
| John Buck | 358 | 11.7 | 3.2 | 2.1 |
| Brian McCann | 275 | 8.9 | 2.4 | 2.1 |
| Damian Miller | 433 | 8.4 | 2.3 | 1.2 |
| Paul Lo Duca | 220 | 4.1 | 1.1 | 1.2 |
| Victor Martinez | 118 | 1.7 | 0.5 | 0.9 |
| Jason Kendall | 590 | 6.6 | 1.8 | 0.7 |
| Dave Ross | 159 | 1.7 | 0.5 | 0.7 |
| Toby Hall | 260 | 2.5 | 0.7 | 0.6 |
| Mike Napoli | 313 | 2.5 | 0.7 | 0.5 |
| Sal Fasano | 120 | 0.9 | 0.3 | 0.5 |
| Kenji Johjima | 277 | 0.5 | 0.1 | 0.1 |
| Yadier Molina | 510 | 1.0 | 0.3 | 0.1 |
| Bengie Molina | 331 | -2.2 | -0.6 | -0.4 |
| Brian Schneider | 201 | -1.5 | -0.4 | -0.5 |
| Chris Snyder | 271 | -2.1 | -0.6 | -0.5 |
| Ramon Hernandez | 296 | -2.5 | -0.7 | -0.5 |
| Miguel Olivo | 230 | -2.0 | -0.5 | -0.5 |
| Humberto Cota | 102 | -1.2 | -0.3 | -0.8 |
| Michael Barrett | 273 | -6.1 | -1.6 | -1.4 |
| Matt Treanor | 124 | -3.2 | -0.9 | -1.6 |
| Kurt Suzuki | 102 | -2.9 | -0.8 | -1.8 |
| Jorge Posada | 302 | -10.5 | -2.8 | -2.2 |
| A.J. Pierzynski | 331 | -13.8 | -3.7 | -2.7 |
| Russell Martin | 234 | -11.3 | -3.1 | -3.1 |
| Joe Mauer | 282 | -25.0 | -6.7 | -5.7 |
| Jose Molina | 230 | -22.4 | -6.1 | -6.3 |
| Ivan Rodriguez | 318 | -34.5 | -9.3 | -7.0 |
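The conversion from blocks to runs per 120 games in Table 3 can be sketched like this. The constants are the ones quoted in the text (.27 runs per miss, 238 opportunities per 120 games); the function names are made up for illustration. Since the table was presumably built from unrounded block totals, recomputing from the rounded Blocks column can differ by a tenth here and there.

```python
RUNS_PER_MISS = 0.27       # linear weights value of a miss, per the text
OPPS_PER_120_GAMES = 238   # opportunities in roughly 120 games, per the text


def runs_saved(blocks_above_expectation):
    """Convert blocks above expectation into runs."""
    return blocks_above_expectation * RUNS_PER_MISS


def runs_per_120_games(blocks_above_expectation, wowy_opps):
    """Scale runs saved to a common 238-opportunity workload."""
    return runs_saved(blocks_above_expectation) * OPPS_PER_120_GAMES / wowy_opps


# Gregg Zaun row from Table 3: 40.6 blocks in 385 WOWY opportunities.
print(round(runs_saved(40.6), 1), round(runs_per_120_games(40.6, 385), 1))
```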
What does this all tell us? If you're expecting a straight answer out of me here, you're not going to get one. I'll be honest and say I just don't know what it all means. The data quality and sample size issues make me question a lot of it. For those catchers who appear in Table 3, the mean is .9 runs above average and the standard deviation is 3.5 runs. It looks to be a pretty normal distribution, which was also true of the less rigorous analyses I attempted before. I think in many cases the results match reputations, which might lend some credence, but nothing here jumps out at me indicating that there's a definite skill. I think the net is that this information might be useful for determining a catcher's past value, but I wouldn't suggest incorporating it into any projections yet. Hopefully as we get more data from more seasons we'll be able to make more progress in unraveling the value of pitch blocking.
Turnbow's ERA: Victim of "Clutch"?
In my last post we looked at Bases Per Inning Pitched (BPIP) and found that BPIP correlates more closely with ERA than WHIP does. Today I'd like to dive into the predicted ERAs that BPIP produces and specifically examine the case of Derrick Turnbow.
Going back to 2004 and setting a floor of 50 innings, I collected nearly 1,300 individual seasons, ran their predicted ERAs, and set up a "Net" reading to see who was lucky and unlucky. To my surprise, Milwaukee Brewers reliever Derrick Turnbow's 2007 and 2006 seasons ranked as two of the three unluckiest in terms of actual ERA against predicted ERA.
Last year Turnbow had a BPIP of 0.94 - giving up less than a base per inning - and a predicted ERA of 2.07. His real ERA wound up at 4.63, a difference of 2.56. With such a relatively low BPIP, I looked into how his WHIP, 1.32 overall, changed with men on, and sure enough found an answer. It seems that Turnbow was a shutdown reliever until someone got on base; then his WHIP rose to 1.80.
One of the most common sabermetric principles thrown around is that clutch hitting is simply a myth. As far as I'm aware, nobody has ever studied the idea of "clutch pitching", but in theory pitchers can amp up their efforts in certain situations more than hitters can. Without leaning too far in that direction, Turnbow's 2006 numbers showed a 1.69 WHIP overall and a 2.19 WHIP once a runner got on. It's important to note that the runner counts whether Turnbow put him on base or inherited him.
So I took it a step further and looked at 2005 - his 2004 season doesn't qualify since he only pitched six and a third innings. With an overall WHIP of 1.08, I guessed that his WHIP with runners on would be higher, and it was, at 1.48. In each of the last three seasons Turnbow has seen his WHIP shoot up with runners on, by 0.40 to 0.50. Without having another example to provide, it goes back to the Juan Salas/Jay Witasick topic.
Obviously more research needs to be done, but as I go through the information I'll pass along tidbits like this.
Better Than WHIP?
To preface this, I must give credit to Hank Sager for suggesting the idea; I'm simply the data and writer guy, and he's the brain of the invention. Anyhow, he recently posed the idea of creating a statistic similar to WHIP, only using Total Bases allowed / innings pitched, giving you the number of bases allowed per inning, and seeing whether it correlates better to ERA than WHIP does.

Above is the BPI/WHIP/ERA of every pitcher last season with at least 25 innings pitched. I have a few observations I'd like to note:
- Note how WHIP doesn't include HBP.
- BPI implements a certain power aspect: a pitcher who is only giving up singles is less likely to give up a run than a pitcher who gives up two doubles. WHIP doesn't factor in that information.
- In some cases the BPI doesn't seem to match the ERA. The theory we came up with is that some pitchers are either simply unlucky or, as Hank himself would say, melt down when runners get on. Here's an example from our initial research:
Jay Witasick appears extremely unlucky while Juan Salas seems relatively lucky. To try and guess at what exactly happened, here's a look at some important stats.
| Stat | Salas | Witasick |
| BABIP | .259 | .299 |
| BABIPwRon | .219 | .250 |
| WHIPwRon* | 0.83 | 1.21 |
| BPIwRon | 1.13 | 1.53 |
*Note that the innings for WHIP and BPI with runners on are estimates based on the number of games listed with that scenario; essentially, each G = 1 IP. By this theory Witasick seemingly remains the same while Salas kicks it up a notch.
I'm going to begin using BPI and monitor it during the year, but what do you guys think?
Hat tip to Cork Gaines for helping out with the graphs and R2 numbers for different situations. Here's a look at the basic correlations between BPI and WHIP with ERA:

Okay, rather than upload the other four charts (although I am grateful for them), I'll just list the R2s here for each innings plateau:
| INN | BPI R2 | WHIP R2 |
| 50 | 0.727 | 0.702 |
| 100 | 0.698 | 0.638 |
As Gaines pointed out, it seems like BPI helps predict a starter's ERA better than a reliever's, which is something I'll release as a document soon, but I'd first like to gather the past few years' data and see what type of consistency exists, if any.
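For anyone who wants to run the same comparison, here is a rough sketch of the two stats and the R2 calculation. The BPI definition (total bases allowed per inning) comes from this post, and WHIP is the usual walks plus hits per inning. The function names are made up and the pitcher lines are invented purely for illustration; they are not real players and not the data behind the numbers above.

```python
def bpi(total_bases, innings):
    """Bases per inning: total bases allowed divided by innings pitched."""
    return total_bases / innings


def whip(walks, hits, innings):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Invented season lines: (total bases, walks, hits, innings, ERA).
made_up_seasons = [
    (250, 60, 180, 200.0, 3.40),
    (310, 70, 210, 190.0, 4.60),
    (120, 30, 70, 80.0, 4.10),
    (90, 25, 55, 75.0, 2.90),
    (280, 50, 200, 180.0, 4.90),
]

eras = [s[4] for s in made_up_seasons]
bpis = [bpi(s[0], s[3]) for s in made_up_seasons]
whips = [whip(s[1], s[2], s[3]) for s in made_up_seasons]

print("BPI R2:", r_squared(bpis, eras))
print("WHIP R2:", r_squared(whips, eras))
```

Filtering the season list by an innings minimum before computing R2 would reproduce the "innings plateau" breakdown.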
Note: I've been informed that Ron Shandler has something very similar in name at least. Apologies to him, I had no idea.
Callix Crabbe is Making Me Proud
Remember how I loved the Crabbe pick back when I did the Rule 5 write-up? It turns out he might just be making the Padres after all, per the official site. A versatile switch-hitter with a line drive stroke? I don't know too much about his defense, but certainly the Padres aren't the only team who could've used such a player at a modest rate.
Lame Pun: If Carl Ever Had to Crawl Forward
Hey, how about another pure conjecture post, aye? Let's say Carl Crawford loses all of his speed. Okay, not all, but he goes from being one of the fastest men in the game to slightly above average. What type of player would he become? The skill set of taking walks and hitting homeruns while doing little else is commonly mislabeled as that of a "Moneyball player", but there's also a term thrown around for this type of player: "old player skills". The theory is that players walk and hit homers more towards the middle and end of their careers as things such as speed deteriorate.
Of Crawford's 1,469 total bases, roughly 53% have come via doubles, triples, or stolen bases, and 22% of his total hits are doubles or triples. Compare that to a power hitter, Jonny Gomes let's say, who has 31% of his total bases and 24% of his hits invested in "speed" plays. The difference, of course, is the emphasized category of hit - Crawford had as many triples this season as Gomes has in his career.
We often talk about Batting Average on Balls in Play and Line Drive percentage, using the latter as a predictor of the former. Carl's speed obviously affects the defensive efficiency when it comes to dealing with him; take a look below at his "luck" and infield hit percentage to get an idea of how prone Carl is to getting a hit that doesn't even creep beyond the infield dirt.

Remember, Carl also hit two inside-the-park homeruns, and let's presume that he's stretched some singles into doubles as well. Essentially, we're taking the added bases away.
Total bases: 1,469
Subtracting projected added bases: 1,269
Delta: 200 bases/50 "runs"
So, assuming that Carl's BABIP would regress towards the mean without his legs, taking away an estimated number of bases would transform his line into something like this: .253/.292/.380. Now, that line is quite unfair to Carl, since we're assuming he'd maintain the same hit tendencies without his most vital weapon. So, to give some credit to the theory of a baseball player's evolution, let's assume a few of Carl's triples and doubles become homeruns, a few of his singles become doubles, etc., while maintaining that his speed is still average to slightly above.
If Carl were to add 90 bases through a combination of doubles and homeruns, his slugging would move to .405. Assuming that his eye would improve in inverse proportion to pitchers' aggressiveness, lifting his batting average and, more importantly, his OBP into at least the .320 range, you have a left fielder producing a .725 OPS, which would have placed him below the league average for left fielders last season.
Obviously this is far from an exact science, but if nothing else I hope we've opened up the thought process on just how much of Carl's game is speed based.