Saber-Friendly Blogging 101: Pitching RAR

(Short version: download the 2008 RAR data for starting pitchers.)

Saber-Friendly Blogging 101 is my attempt to give team-specific bloggers article ideas and the data necessary to write their own saber-friendly articles -- the articles I want to read, but can't find enough of.  In the first installment, I took a look at BABIP, and what it can tell you about which pitchers were possibly lucky or unlucky in 2008.  Michael Taylor of Tribe Report did a nice job running with the concept.  But we can go a step further than just looking at BABIP -- actually a few steps further.  By taking all the things we know are under a pitcher's control (and only those things), we can estimate what a pitcher's ERA should have been, all else being equal*.

One basic statistic that estimates true-skill ERA is FIP (Fielding Independent Pitching).  It was created by Tom Tango and is a simple arithmetic formula built from strikeouts, walks, hit batters, and home runs: (HR*13 + (BB-IBB+HBP)*3 - K*2) / IP + 3.20.  It works quite well and is available at both The Hardball Times and Fangraphs.  The Hardball Times has another similar statistic called xFIP, which uses a modified home run total instead of actual home runs.  As the semi-accurate cliche goes, pitchers allow fly balls, but hitters turn those fly balls into home runs.  Therefore xFIP uses the league-average home run-per-flyball rate combined with each pitcher's fly ball total to estimate how many home runs a pitcher "deserved" to give up.
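To make the two formulas concrete, here is a minimal Python sketch. The function names, the sample stat line, and the 10% league HR/FB rate are all made up for illustration; only the formula and the 3.20 constant come from the text above. Innings should be entered as a decimal (200.333, not the box-score 200.1).

```python
def fip(hr, bb, ibb, hbp, k, ip, constant=3.20):
    """FIP as written above: (13*HR + 3*(BB - IBB + HBP) - 2*K) / IP + 3.20."""
    return (13 * hr + 3 * (bb - ibb + hbp) - 2 * k) / ip + constant


def xfip(fb, lg_hr_per_fb, bb, ibb, hbp, k, ip, constant=3.20):
    """xFIP: the same formula, but with 'deserved' home runs.

    Deserved home runs = the pitcher's fly balls * league-average HR per fly ball.
    """
    expected_hr = fb * lg_hr_per_fb
    return fip(expected_hr, bb, ibb, hbp, k, ip, constant)


# Hypothetical pitcher: 20 HR, 50 BB (5 intentional), 5 HBP, 180 K, 200 IP.
print(round(fip(hr=20, bb=50, ibb=5, hbp=5, k=180, ip=200.0), 2))

# Same pitcher with 220 fly balls and an assumed 10% league HR/FB rate.
print(round(xfip(fb=220, lg_hr_per_fb=0.10, bb=50, ibb=5, hbp=5, k=180, ip=200.0), 2))
```

In this made-up case the pitcher allowed 20 home runs against 22 "deserved" ones, so his xFIP comes out a little higher than his FIP.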

But the most advanced pitching statistic available just popped up this summer over at StatCorner, although there has yet to be a study to show that it's actually better than FIP or xFIP or even ERA.  (Many people assume it is, though.)  It's called tRA and uses eight categories of outcomes that are strongly under pitcher control: Ks, BBs, HBPs, HRs, GB%, LD%, OF FB%, and IF FB%.  In one sentence, tRA credits pitchers for their ability to induce those eight events, without caring about the actual outcomes of the balls hit into play.  And everything's park-adjusted.  For a longer explanation, read this.  For a no-numbers explanation, try this.

Ok, so let's assume we have this special number, tRA, that best represents a pitcher's true demonstrated skill.  (I actually add two adjustments -- one to account for NL pitchers not facing a DH and another to put it on the ERA scale -- and call it tERA.)  What can we do with it?  Well, we can value the production of pitchers, of course.  If tERA is our measure of quality, we next need to measure quantity.  Innings pitched is the obvious solution, although I prefer StatCorner's expected innings pitched (xIP).  Why?  Because if a pitcher is unlucky and extra balls are falling in for hits, he's getting docked outs and being credited with fewer innings than he deserves.

To measure a pitcher's total production over replacement-level, we compare his tERA to the replacement-level tERA of 5.75, divide by nine to put the savings on a per-inning basis, and multiply by the number of expected innings he pitched.  For example, Cliff Lee had a 2.64 tERA and 222 xIP in 2008.  His RAR is (5.75 - 2.64) / 9 * 222 = 77.  That production compares favorably to every position player except Albert Pujols, by the way.
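The same calculation as a small Python sketch, in case you'd rather script it than do it by hand. The function and constant names are mine; the 5.75 replacement level and the Cliff Lee inputs are the ones quoted above.

```python
REPLACEMENT_TERA = 5.75  # replacement-level tERA for starters, from the text


def rar(tera, xip, replacement=REPLACEMENT_TERA):
    """Runs above replacement: runs saved per inning times expected innings."""
    runs_saved_per_inning = (replacement - tera) / 9
    return runs_saved_per_inning * xip


# Cliff Lee, 2008: 2.64 tERA over 222 xIP.
print(round(rar(2.64, 222)))  # 77
```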

What's that?  You want all the relevant tERA, xIP, and RAR information for your favorite team's starters?  Well, here you go.  The data tab separates out contributions to different teams (thus, CC Sabathia will be listed twice) and the player pivot table allows you to select just the pitchers on any one team.  The team pivot table shows the total value provided by each team's rotation.

Ideas for a team-specific article:

  • Explain why tERA is a better measure of pitcher value than ERA (it removes fielding, ballpark effects, luck, etc.)  Also explain its limitations (see below).
  • Present the xIP, tERA and RAR info for all starters on the team.
  • Present the same data for the projected 2009 rotation.  You can pro-rate the RAR numbers to different innings totals based on 2009 projections.  Or compute them yourself given whatever ERA and IP projections you want and the RAR formula.
  • Take a look at how your team's rotation stacked up against the other teams in MLB or in their own league in 2008.
  • Discuss any potential free agent signings or trade targets in terms of their 2008 value.  Compare their 2008 tERAs to their actual ERAs to see if they're coming off seasons that were underrated or overrated.
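For the 2009 projection idea, here is a rough Python sketch of the two approaches: scaling a 2008 RAR to a different innings total, or building RAR from scratch out of a projected ERA and IP. The function names, the 200-inning workload, and the 3.50 ERA / 180 IP projection are hypothetical; the Brandon Webb line (68 RAR in 231.3 xIP) is from the Diamondbacks table in this article.

```python
def prorate_rar(rar_2008, xip_2008, projected_ip):
    """Scale a 2008 RAR to a new innings total, holding the per-inning rate fixed."""
    return rar_2008 * projected_ip / xip_2008


def projected_rar(projected_era, projected_ip, replacement=5.75):
    """Apply the RAR formula directly to a projected ERA and innings total."""
    return (replacement - projected_era) / 9 * projected_ip


# Brandon Webb's 2008 line pro-rated to a hypothetical 200 innings.
print(round(prorate_rar(68, 231.3, 200), 1))

# A made-up projection: 3.50 ERA over 180 innings.
print(round(projected_rar(3.50, 180), 1))
```

Pro-rating assumes the pitcher repeats his 2008 rate exactly, so the second approach is the better fit whenever you have a projection you trust.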

For fun, here's the majors' best rotation in 2008, the Arizona Diamondbacks'.  Remember, their park is one of the more hitter-friendly parks in the majors, and their fielders were below average by thirty runs according to UZR. 

Name              xIP     tERA   RAR
Brandon Webb      231.3   3.10    68
Dan Haren         219.3   3.22    62
Randy Johnson     193.3   3.35    51
Doug Davis        151.0   4.36    23
Micah Owings      103.3   4.86    10
Max Scherzer       38.7   3.76     9
Yusmeiro Petit     40.7   4.27     7
Edgar Gonzalez     28.3   5.26     2
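If you want rotation-level numbers for your own team, something like the Python sketch below will total them up. The data is copied from the table above; the variable names and the idea of an innings-weighted rotation tERA are my own additions. Recomputing RAR from the rounded tERA values can differ from the published column by a run, so the sketch sums the published figures.

```python
# (name, xIP, tERA, published RAR), copied from the table above
rotation = [
    ("Brandon Webb", 231.3, 3.10, 68),
    ("Dan Haren", 219.3, 3.22, 62),
    ("Randy Johnson", 193.3, 3.35, 51),
    ("Doug Davis", 151.0, 4.36, 23),
    ("Micah Owings", 103.3, 4.86, 10),
    ("Max Scherzer", 38.7, 3.76, 9),
    ("Yusmeiro Petit", 40.7, 4.27, 7),
    ("Edgar Gonzalez", 28.3, 5.26, 2),
]

total_xip = sum(xip for _, xip, _, _ in rotation)
total_rar = sum(rar for _, _, _, rar in rotation)

# Weight each pitcher's tERA by his expected innings.
weighted_tera = sum(xip * tera for _, xip, tera, _ in rotation) / total_xip

print(f"{total_xip:.1f} xIP, {weighted_tera:.2f} tERA, {total_rar} RAR")
```

Summing the published column gives 232 RAR for the Arizona starters.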

* "All else being equal" is a decent, but imperfect, assumption.  For example, some pitchers allow groundballs that are easier to field than other pitchers'.  And some pitchers better adapt to situations and can change their approach when needed.  The effects of these other things are generally small, but they can become significant at the extremes.  The next stage of research will probably be aimed at picking apart these issues.


Comments


awesome stuff

you’re pumping out stuff faster than I can assimilate it, must less right articles about it!

Can I ask again: for relievers (retrospectively), for past performance can one just multiply the RAR by the pLI from Fangraphs? Sorry to repeat the question.

Also, is replacement level ERA and FIP lgAVG*1.28 for starters and 1.07 for relievers? How did you get the 5.75 for tRA?

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by Matt Klaassen on Dec 9, 2008 11:36 AM EST

"write" articles

I can’t type

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by Matt Klaassen on Dec 9, 2008 11:36 AM EST

5.75 and 4.75 are estimates I've had in my head for a while, probably from MGL or Tango.

1.07 and 1.28 times league-ERA are probably better, although 4.50 × 1.28 is 5.76 and 4.50 × 1.07 is 4.81.
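A quick Python sketch of that multiplier approach, for anyone who wants to plug in a different league ERA. The function and dictionary names are hypothetical; the 1.28 and 1.07 multipliers and the 4.50 league ERA are the figures quoted in this comment.

```python
# Multipliers on league ERA, as quoted in this comment.
REPLACEMENT_MULTIPLIER = {"starter": 1.28, "reliever": 1.07}


def replacement_level(league_era, role):
    """Replacement-level ERA for a role as a multiple of league ERA."""
    return league_era * REPLACEMENT_MULTIPLIER[role]


for role in ("starter", "reliever"):
    print(role, round(replacement_level(4.50, role), 3))
```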

Yes, for historical value, I just multiply RAR by the reliever’s leverage. That won’t address every definition of value out there, but is good for many of them. And when you do that, you’re not saying that a pitcher prevented that many runs, you’re saying that the runs he prevented were as effective as that many unleveraged runs. Converting from runs to wins removes any of that confusion.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 2:30 PM EST

Thanks

If we assume a “2” pLI for relievers, then F-Rod, using his best stat (ERA), is still not quite a 2WAR player. Even if he doesn’t decline at all, 3/37 is too much.

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by Matt Klaassen on Dec 9, 2008 4:14 PM EST

No, it's more than that...

Assume (and feel free to adjust as necessary)
2.75 ERA
2.0 LI
72 IP

RAR = (4.75 – 2.75) / 9 × 72 × 2.0 = 32 runs, or about 3.2 WAR at roughly 10 runs per win

And after looking more closely for the past few days, 1.8 to 1.9 seems to be typical leverage for a top 15 closer.
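Here is the same reliever arithmetic as a Python sketch. The function name and the 10-runs-per-win conversion are assumptions on my part (a common rule of thumb, not something stated in this thread); the 4.75 replacement level, 2.75 ERA, 72 IP, and 2.0 leverage are the inputs listed above.

```python
RUNS_PER_WIN = 10.0  # assumption: rule-of-thumb conversion, not from the thread


def leveraged_rar(era, ip, leverage, replacement=4.75):
    """Reliever runs above replacement, scaled by average leverage."""
    return (replacement - era) / 9 * ip * leverage


runs = leveraged_rar(era=2.75, ip=72, leverage=2.0)
wins = runs / RUNS_PER_WIN
print(round(runs, 1), round(wins, 1))

# The same reliever at the 1.8 leverage mentioned above for a typical top closer.
print(round(leveraged_rar(2.75, 72, 1.8) / RUNS_PER_WIN, 1))
```

With these inputs the calculation comes to 32 leveraged runs, which is where the 3.2 WAR figure comes from.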

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 5:17 PM EST

More Ideas

Why only do MLB starters? Do the whole organization by level, and you can do relievers by level too. Then go back and do all the organization's starters using tRA+ to show how they compared to other starters in their league.

http://www.raysprospects.com/

by DAM on Dec 9, 2008 12:26 PM EST

Technical critiques

Sky, this may be more of an issue for Graham, but I’ll post it here. HR/PA may be a quasi-reliable stat (it usually has a reliability in the mid .30s, which is good-but-not-great.) However, when you look, flyball rate is pretty reliable (.70 or so) at a reasonable BFP minimum, but HR/FB is very unreliable. My interpretation of that is that the pitcher gives up the flyball, but the batter hits it out of the park.

A good study on whether tRA is better than FIP or any of the others would be cool, but I would still want to pull HR rate out of that equation.

http://mvn.com/mlb-stats

by pizzacutter on Dec 9, 2008 4:04 PM EST

I wouldn't.

Those home runs happened, and ignoring them means you miss out on what the pitcher did.

We regress HR/BIA heavily for tRA*.

by Graham MacAree on Dec 9, 2008 4:45 PM EST

Regressing is good... but

I could make the case that you have to regress so much that everyone ends up being mostly league avg. The HR did happen, but so did those groundball singles Pasta Diving Jeter.

http://mvn.com/mlb-stats

by pizzacutter on Dec 9, 2008 6:21 PM EST

I go back and forth on this.

You obviously want to regress HR/FB rate for projections. For historical value, it’s murky. Sure, the home runs “actually happened”, but so did a hit on a ground ball. On the groundball, we’re ignoring the result, though, for a few reasons, mainly fielding interactions and “luck” of batted ball location. The first is missing from HR/FB, but not the second.

In other words, there are MANY points on the spectrum between results-oriented and true-talent. Something like this:

ERA | compERA | PZR | tRA | FIP | “xtRA” | xFIP | tRA*

None are perfect, all have different uses, and I’m not really sure which one is best for a RAR stat. Personally, I think it’s somewhere in the PZR to xtRA range. (xtRA is my made-up term for tRA with the HR piece regressed, like xFIP.)

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 5:14 PM EST

Depends

tRA doesn’t try to be luck independent, it tries to be defence and park independent. So yeah, fielders are what it cares about.

If you want to regress HR/BIA you might as well regress everything else too, which as you know gets you tRA*.

by Graham MacAree on Dec 9, 2008 6:24 PM EST

True.

Let’s try another approach. The “perfect” HR park factor would know exactly how far each of a pitcher’s flyballs (and liners, I guess) traveled and would know at how many parks they would be a home run, and credit the pitcher with partial home runs on every deep flyball. We obviously don’t have that information, so we try a ratio park factor for homeruns. But might a combination of regressing HR/BIA and using a ratio park factor get us closer to the “perfect” adjusted HR total?

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 6:30 PM EST

Well, yes

But you could say the same thing of every other pitching stat, couldn’t you? HR/BIA is the least stable of the tRA inputs, but still.

by Graham MacAree on Dec 9, 2008 6:33 PM EST

Could you say the same?

I guess so, although it seems that the “perfect HR park factor” would be the most different from its traditional park factor.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 6:37 PM EST

So here's my take:

Without the ability to actually measure the “perfect” HR park factor, trying to come up with an approximation puts us into murkier and murkier waters without a real guide to where we should be, which I don’t like.

I really wish I had accurate data wrt hit vectors, though.

by Graham MacAree on Dec 9, 2008 6:50 PM EST

Obviously this only works on actual homers

But could you incorporate HitTracker’s information and get an idea of whether the homer is a “true” homer or not?

by Dan Turkenkopf on Dec 9, 2008 7:29 PM EST

Thanks a lot

Even in its brief life this series has been incredibly helpful.

Just wanted to extend my gratitude.

by rivercityredbird on Dec 9, 2008 6:22 PM EST

Thanks.

I still don’t think it’s caught on as more than a primer, though. I mean, I only know of one team article written as a result of me writing these articles. I was hoping I would have more good team-specific stuff to read…

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 6:31 PM EST

I'm getting to it

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by Matt Klaassen on Dec 9, 2008 6:59 PM EST

It's not about fielders

Cecil, Prince, or otherwise. It’s about the reliability of the stat. If the stat isn’t reliable, then it’s not something that the pitcher has control over. Why ding him for something that’s dumb luck?

http://mvn.com/mlb-stats

by pizzacutter on Dec 9, 2008 6:24 PM EST

Because you strip out luck via regression later?

I don’t understand the complaint – removing HR/BIA moves tRA further into luck neutral territory, which is something it explicitly does not try to do.

Luck is a part of past value. Fielding should not be a part of past value for pitchers.

by Graham MacAree on Dec 9, 2008 6:29 PM EST

Ahh..

OK, I’m acting on my bias of wanting everything in the world to be luck neutral. Fair enough.

http://mvn.com/mlb-stats

by pizzacutter on Dec 9, 2008 7:13 PM EST

Just to clarify, how do you feel about PZR, Pizza (and Graham and whomever)?

PZR is simply UZR from the pitcher’s perspective. On every batted ball, it assigns a run-value based on how often it gets turned into an out. A pitcher gets credit for that run-value, no matter whether the fielder makes the play or not. (In UZR, the run-value is the starting point, and a fielder gets credit for making or not making the play).

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 6:34 PM EST

It's a run-value.

The poor wording is me.

Beyond the Boxscore // Calling BJ Upton lazy is lazy.

by Sky Kalkman on Dec 9, 2008 6:50 PM EST

Then yes I like it

It sounds like tRA with better data inputs.

by Graham MacAree on Dec 9, 2008 6:54 PM EST

Yeah it does have the same problem...

and I would have the same objection. Seems like a decent idea otherwise.

http://mvn.com/mlb-stats

by pizzacutter on Dec 9, 2008 11:52 PM EST
