Nationals 2008 Base-Stealing WPA
My analysis of win expectancy added and lost on the basepaths in 2008, which finds that in context the Nationals hurt themselves even more than their raw 65.3% stolen base success rate would suggest.
I haven't seen anyone do this kind of analysis before, so if this is a kinda new idea, folks can feel free to copy with their own teams.
6 months ago
FireJimBowden
Comments
I'm doing something similar
I’m writing software to break down each run from 2008 and allocate it to hitters, baserunners, fielders and pitchers. I’m using MLB Gameday play-by-play data and expected runs (ER) for each situation. I’ve completed offense (hitting/running), and I’m now working on defense (fielding/pitching). I’ll post the final results when I’m done (defense is proving more difficult than expected), but contact me and I’ll send you more details.
One of my goals was to account for all aspects of baserunning, including stolen bases, caught stealing, outs made on the basepaths (other than force outs / fielder’s choice), and taking (or not taking) the extra base.
The way I calculated runs, everything is relative to average (expected) runs scored per inning. For offense, all runs are counted as +/- in hitting, baserunning, or fielding errors (I can’t really give the hitter or runner credit when a fielder commits an error). Because errors are responsible for 42-73 ER, depending on the team, and because over all teams the sum of all hitting/running/error runs must equal 0.0 (average), most hitting and running totals are negative. In fact, only one team (Minnesota) was positive in both hitting and running (+4.9 and +6.1 ER, respectively).
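A quick back-of-the-envelope sketch of why the zero-sum constraint pushes hitting and running totals negative. It assumes 30 teams and simply brackets the league total using the 42-73 ER range quoted above; the function name is made up for illustration and this is not the actual software.

```python
NUM_TEAMS = 30  # MLB teams in 2008
ERROR_RUNS_LOW, ERROR_RUNS_HIGH = 42.0, 73.0  # per-team range quoted above


def hitting_plus_running_total(error_runs_total: float) -> float:
    """League-wide hitting + running runs implied by the zero-sum constraint.

    Hypothetical helper: hitting + running + error runs must sum to 0.0,
    so whatever errors contribute has to be offset elsewhere.
    """
    return 0.0 - error_runs_total


# Bracket the league total by pretending every team sat at one end of the range.
least_negative = hitting_plus_running_total(NUM_TEAMS * ERROR_RUNS_LOW)
most_negative = hitting_plus_running_total(NUM_TEAMS * ERROR_RUNS_HIGH)

print("per-team average, least negative:", least_negative / NUM_TEAMS)
print("per-team average, most negative:", most_negative / NUM_TEAMS)
```

So the average team has to land somewhere between -42 and -73 ER in hitting plus running combined, which is why a team that is positive in both stands out.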
Since you’re most interested in Washington: I had them dead last in hitting, at -166.9 ER, just behind San Francisco (-158.4) and San Diego (-158.2). In baserunning, I had Washington at -11.0 ER, 24th in the majors, just behind Boston. As you state, the Nats are hurt by their 43 CS / 65% success rate. They are also hurt by being relatively station-to-station on the basepaths, taking a total of 369 “extra bases” (23rd in the majors). By comparison, Minnesota was by far the most aggressive on the basepaths, with 533 “extra bases” (Texas was second with 472). Washington was helped by not making many outs on the basepaths (OOBP, which excludes caught stealing), being thrown out 44 times, 3rd fewest in the majors.
FYI: I defined an “extra base” as a situation where a baserunner advances on a batted ball when league-wide there is a less than 50% chance of the baserunner advancing, given the hit type (LD, FB, GB) and location (the fielder for now, soon to be the zone). For example, over the entire 2008 season, on a GB fielded by the left fielder, the runner advanced 224 times, stayed at second 998 times and was thrown out 30 times. Each of those 224 advances is considered an “extra base”.
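A minimal sketch of the “extra base” test, using the left-fielder GB counts from the example. It assumes the league-wide chance of advancing is advances divided by all three outcomes (advanced, held, thrown out); the comment does not spell out the denominator, so treat that and the function names as assumptions.

```python
def advance_rate(advanced: int, held: int, thrown_out: int) -> float:
    """Share of opportunities on which the runner successfully advanced.

    Assumption: runners who were thrown out count as opportunities
    but not as advances.
    """
    opportunities = advanced + held + thrown_out
    if opportunities == 0:
        raise ValueError("no opportunities recorded for this situation")
    return advanced / opportunities


def is_extra_base_situation(advanced: int, held: int, thrown_out: int,
                            threshold: float = 0.5) -> bool:
    """True when advancing is the less-likely outcome league-wide."""
    return advance_rate(advanced, held, thrown_out) < threshold


# 2008 counts quoted above: GB fielded by the left fielder.
rate = advance_rate(advanced=224, held=998, thrown_out=30)
print(f"advance rate: {rate:.1%}")
print("counts as extra base:", is_extra_base_situation(224, 998, 30))
```

With 224 advances in 1,252 opportunities the rate is well under 50%, so every one of those advances is credited as an extra base.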
by Adam Peterson on
Jan 4, 2009 8:57 AM EST
Wow, I can't wait to see the end result of this.
---
Juuuust a bit outside!!
http://www.rightfieldbleachers.com
by jhmoore on
Jan 5, 2009 1:07 AM EST
I definitely disagree with not crediting offense when an error is made.
Bad fielders allow hitters to reach base all the time: how is lack of range different from bad hands? Hitters should get SOME credit for putting the ball in play. Sure, it might be pretty lucky when errors are made, but speed and batted ball speed have been shown to definitely induce more errors.
Overall, I like the idea. Here’s hoping you write up a full explanation of the methodology.
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on
Jan 5, 2009 1:00 PM EST
Funny you should mention that
I plan to write up a more complete explanation of the methodology and open it up to discussion prior to completing the software / analysis. I see a Fan Post in my future…
Regarding allocation of errors, I struggled with this when assessing offense. The whole basis of my software is that it breaks each play into a series of “transitions” from one base situation (and associated ER) to another. Currently, as I note above, I allocate a GB fielding error by breaking it into two transitions. First, the batted ball is recorded as an out (the play the fielder should have made) and allocated to the hitter. Second, the error is recorded as a separate transition and assigned to the fielder. After the batted ball is processed, I focus on baserunning, as explained above.
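A rough sketch of the two-transition idea for a GB fielding error. Everything here is an assumption for illustration: the `Transition` class, the rule that a transition is worth the change in ER plus any runs scored, the choice to charge the hypothetical out to the hitter (in line with not crediting the hitter on an error), and the ER values, which are placeholders rather than 2008 numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One step from a base/out situation to the next (hypothetical structure)."""
    credited_to: str       # e.g. "hitter", "runner", "fielder"
    er_before: float       # expected runs before the step
    er_after: float        # expected runs after the step
    runs_scored: int = 0   # runs that crossed the plate during the step

    @property
    def value(self) -> float:
        # Assumed valuation: change in expected runs plus runs scored,
        # from the batting team's point of view.
        return (self.er_after - self.er_before) + self.runs_scored


def split_gb_error(er_start: float, er_if_out: float, er_actual: float,
                   runs_scored: int = 0) -> list[Transition]:
    """Break a GB fielding error into the out that should have been made
    and the error that actually happened."""
    should_have_been_out = Transition("hitter", er_start, er_if_out)
    error = Transition("fielder", er_if_out, er_actual, runs_scored)
    return [should_have_been_out, error]


# Placeholder expected-run values, NOT actual 2008 figures.
ER_EMPTY_0_OUT = 0.50   # bases empty, nobody out
ER_EMPTY_1_OUT = 0.27   # bases empty, one out
ER_FIRST_0_OUT = 0.88   # runner on first, nobody out

steps = split_gb_error(ER_EMPTY_0_OUT, ER_EMPTY_1_OUT, ER_FIRST_0_OUT)
for step in steps:
    print(step.credited_to, round(step.value, 2))
```

The two pieces always add back up to the full change for the play, so nothing is lost by splitting: the hitter eats the out and the entire swing caused by the misplay lands on the fielder.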
Defensively, I’m doing things a bit differently. Each play transition is allocated either to the pitcher or to the fielder(s) (multiple in the case of a double play). For a batted ball, I first allocate ER to the pitcher based on type (LD, GB, FB, pop fly, bunt) and the defensive zone (standard 22 zones) into which the ball was hit. Again, I use league-wide analysis to determine “expected” outcomes (really a probabilistic weighting of outcomes given the type-zone pair). Second, I allocate the fielder the remaining ER difference. In practice, this has the effect of rewarding pitchers for giving up GB/PF and penalizing them for LD/FB. I understand that the pitcher has minimal control over which zone a ball is hit into, but since I’ve defined a zero-sum offense-defense allocation, if a ball is grounded up the middle it makes more sense for the pitcher to be penalized than the 2B/SS.
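Here is a small sketch of that pitcher/fielder split, assuming the “expected” outcome for a type-zone pair is a probability-weighted average of run values. The probabilities, run values and function names below are invented for illustration only; values are from the batting team’s point of view, so negative numbers are good for the defense.

```python
def expected_value(outcome_probs: dict[str, float],
                   outcome_values: dict[str, float]) -> float:
    """Probability-weighted run value for one batted-ball type/zone pair."""
    return sum(prob * outcome_values[outcome]
               for outcome, prob in outcome_probs.items())


def split_pitcher_fielder(actual_value: float,
                          outcome_probs: dict[str, float],
                          outcome_values: dict[str, float]) -> tuple[float, float]:
    """Pitcher gets the league-expected value of the batted ball;
    the fielder gets whatever is left over from the actual result."""
    pitcher_share = expected_value(outcome_probs, outcome_values)
    fielder_share = actual_value - pitcher_share
    return pitcher_share, fielder_share


# Invented numbers for a ground ball into one zone (not 2008 data).
GB_ZONE_PROBS = {"out": 0.75, "single": 0.25}
RUN_VALUES = {"out": -0.25, "single": 0.40}

# The fielder converts the out.
pitcher, fielder = split_pitcher_fielder(RUN_VALUES["out"], GB_ZONE_PROBS, RUN_VALUES)
print("pitcher:", round(pitcher, 4), "fielder:", round(fielder, 4))
```

Because the pitcher’s share depends only on batted-ball type and zone, a pitcher who induces grounders into high-out zones is rewarded whether or not the play is made, and the fielder is judged only against what the league usually does with that ball.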
Sky, back to your point. As I’ve worked through defensive plays, I have repeatedly come back to one point. For a batted ball (before baserunning and fielding are considered), the ER allocated to the pitcher and hitter should add up to zero. In effect, rather than my current hitting allocation based on result (thus dependent on fielding…), I can base it the way I do for pitchers now. A hitter is rewarded for a LD and penalized for a popup, regardless of what the fielder(s) do. I believe this will address your concern that the hitter should get credit for putting the ball in play. Call it “fielding independent hitting” that takes zone into consideration…thoughts?
by Adam Peterson on
Jan 5, 2009 4:16 PM EST
This combination of pitching/fielding/hitting credit has always stumped me.
I totally agree with your split between pitching and fielding. Ideally, you’d use UZR applied to individual plays to split credit, not just the large zones. All good.
But for hitting, does it really make sense to include the fielder’s piece (i.e. hitting = pitching + fielding)? That credits/blames a hitter for how a specific fielder performs instead of crediting the hitter for how an average fielder would perform. But then you lose symmetry, at least on the event and game level. On the season level, all three pieces average out by definition. And then there’s the issue of positioning fielders. Giambi’s at-bats are way different, and something that a system like UZR would miss on, right? You just can’t assume average-ness on anything, even for pitchers.
That’s some rambling, hopefully containing some food for thought.
by Sky Kalkman on
Jan 5, 2009 5:05 PM EST
That's the point
Right now, what I’m effectively doing is hitting = pitching + fielding with errors treated separately. What I should do is hitting = pitching. The problem is that I need to have the total ER allocated to one team’s offense + ER allocated to the other team’s defense = 0. That introduces a “fielding” component to the offensive side’s runs scored. Then again, this should generally balance out across the league depending on the defensive quality of the teams faced. Would be another interesting piece to analyze.
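To make the bookkeeping concrete, here is a sketch of one batted ball under the “hitting = pitching” scheme, assuming the same expected-versus-actual split on both sides. The ledger functions and the numbers are hypothetical; the point is only that the offense and defense ledgers cancel when a fielding residual is carried on the offensive side.

```python
def offense_ledger(actual: float, expected: float) -> dict[str, float]:
    """Batting team's side of one batted ball (hypothetical layout).

    The hitter is credited only for the league-expected value of the
    type/zone; the rest is a fielding residual that still shows up in
    the offense's runs.
    """
    return {"hitting": expected, "fielding_residual": actual - expected}


def defense_ledger(actual: float, expected: float) -> dict[str, float]:
    """Fielding team's side: the mirror image of the offense ledger."""
    return {"pitching": -expected, "fielding": -(actual - expected)}


# Invented values: a grounder expected to be worth -0.0875 runs to the
# offense that goes through for a single worth +0.40.
expected_runs, actual_runs = -0.0875, 0.40

offense = offense_ledger(actual_runs, expected_runs)
defense = defense_ledger(actual_runs, expected_runs)

print("offense:", offense)
print("defense:", defense)
print("play total:", round(sum(offense.values()) + sum(defense.values()), 10))
```

Hitting and pitching cancel exactly, and so do the two fielding pieces, so the play still nets to zero. The fielding residual on the offensive side is the part that should wash out over a season depending on the quality of the defenses a team faces.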
I’d love to use UZR rather than the standard defensive zones, but without the MGL model I have to do what I can. Also, because Gameday hit charts record where the ball is fielded (rather than where it lands), I can’t tell whether an individual hit was fieldable at all, but again I’m looking at totals across the season. It will be possible to create a unique box score for a game, including hitting, running and fielding for each position player…we’ll see how it turns out.
Positioning fielders is another problem. I don’t have access to this data yet, so I use the standard zones unless someone has a better idea.
by Adam Peterson on
Jan 5, 2009 8:29 PM EST