Beyond the Box Score: An SB Nation Community

Ricky Nolasco: 4 WAR or 1 WAR?


This is a question I asked on my own blog, but I thought it'd get decent play here, and it addresses an interesting issue about pitcher WAR. Also, I have author privilege here, but I haven't written a damn useful thing on BtB for some time now, so I'd like to contribute.

Now as most readers here know, there are two great places to find WAR for any player. One is FanGraphs, the other is Rally's historical WAR database. The former is fast and constantly updated; the latter is comprehensive throughout baseball history (hence the "historical" part). The two databases measure position players in essentially the same fashion. Some of the inputs and rates are different, but you aren't generally going to see a stark contrast for any one player; most differences are going to be due to different inputs (UZR for FanGraphs vs. TotalZone for Rally's database, for example).

Pitcher WAR is built in a similar fashion in both places, but pitcher runs are determined very differently. FanGraphs uses FIP, a defense-independent component statistic that everyone here knows and that needs no further explanation. Rally's database uses a pitcher's actual runs allowed and takes a prorated value for defensive runs based on the balls in play the pitcher allows. Now, both versions in general reach similar conclusions, as most pitchers face around average luck and timing and the two values end up similar. A difference of 1 WAR would not preclude me from using one or the other; it simply becomes a matter of taste/preference for the method.

Then comes the interesting case of Ricky Nolasco's 2009 season.

Here's the relevant information:

2009 Nolasco FIP/0.92: 3.64

2009 Nolasco tRA (StatCorner): 3.94

2009 Nolasco RA: 5.40

Here's how I did the calculations and what I got as a result. From Marlin Maniac:

I did both calculations using park factors provided by Patriot. For defense-independent statistics WAR, I averaged tRA from StatCorner and FanGraphs, then averaged that value with FIP/0.92 and stuck it into Pythagenpat. If you checked out my MVP article, you saw a list of WAR for pitchers calculated using that; that list contains all pitchers with more than 4.0 WAR. Using that method of evaluation, I had Ricky at 3.8 WAR for the season, a very good total. I then calculated WAR using Rally's method, using team bUZR from FanGraphs as my defensive metric. Using that method, I got Ricky totaling 0.8 WAR on the season.
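For readers who want to follow the defense-independent half of that calculation, here is a minimal sketch of the blending and Pythagenpat steps. It is not the author's actual spreadsheet: the function names, the 0.287 Pythagenpat power (a commonly quoted value), and the league run environment are assumptions, and the FanGraphs tRA figure is not given in the post, so the example simply reuses the StatCorner number as a placeholder. Park factors and the replacement-level step are left out entirely.

```python
def blended_ra(tra_statcorner, tra_fangraphs, fip_scaled):
    """Average the two tRA figures, then average that with FIP/0.92."""
    tra_avg = (tra_statcorner + tra_fangraphs) / 2
    return (tra_avg + fip_scaled) / 2


def pythagenpat_win_pct(runs_scored_per_game, runs_allowed_per_game, power=0.287):
    """Pythagenpat: the exponent floats with the run environment.

    power=0.287 is an assumed, commonly used value, not one from the post.
    """
    runs_per_game = runs_scored_per_game + runs_allowed_per_game
    exponent = runs_per_game ** power
    scored = runs_scored_per_game ** exponent
    allowed = runs_allowed_per_game ** exponent
    return scored / (scored + allowed)


# Placeholder inputs: the post lists StatCorner tRA (3.94) and FIP/0.92 (3.64);
# the FanGraphs tRA is not listed, so 3.94 is reused here purely for illustration.
nolasco_ra = blended_ra(3.94, 3.94, 3.64)

# Hypothetical league-average runs per game; the post does not state one.
LEAGUE_RUNS_PER_GAME = 4.5

win_pct = pythagenpat_win_pct(LEAGUE_RUNS_PER_GAME, nolasco_ra)
print(f"blended RA: {nolasco_ra:.2f}, implied win%: {win_pct:.3f}")
```

The implied winning percentage would then be compared with a replacement-level pitcher's and scaled by innings to get WAR; those steps depend on choices the post does not spell out.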

The difference between the two is a staggering 3 WAR. If you'd prefer, just use the FanGraphs total of 4.2 WAR for the component statistic instead; the exact figure is not particularly relevant. The key here is that the difference is huge, and it brings up my question:

Strictly in terms of production, was Ricky Nolasco a 4 WAR pitcher or a 1 WAR pitcher this season?

Presumably, both measures are defense-independent, though they are calculated in different ways. Rally gives credit/debit to the pitcher for his context/timing, while the linear weights models are context independent. If all we wanted to know was production, which one would be a better option? Based on how we treat hitters, my initial presumption would be to lean towards the context-neutral method, but pitchers should have a lot more control over their environment than hitters do. Should a different method be used that's somewhere in between either of these options?

This is not to be taken as an indictment against either methodology; I'm just interested in everyone's opinion on the topic. Vote and discuss accordingly.

Poll: What was Ricky Nolasco's production level this season?
4 WAR: 67 votes
1 WAR: 28 votes
Somewhere in between because we should use a different calculating method: 45 votes

140 votes | Poll has closed

Comments

This was my struggle

I spent a significant amount of time last winter trying to decide what the pitcher WAR formula for FanGraphs should look like. Eventually, I just came to the conclusion that there was no best way – there were just a bunch of different options with pros and cons.

I went with FIP for the reason you stated – it’s context neutral and measures pitchers and hitters the same way, without regard to base/out context. I felt like having WAR for hitters that was context neutral and WAR for pitchers that was partially context dependent was not a road that FanGraphs should go down.

Eventually, we’ll have something like PZR, where we use a play-by-play metric to actually value the defensive contribution behind each pitcher and we can do better than any of the methods currently in use. But for now, it’s a preference thing, really.

by davidcameron on Oct 28, 2009 2:19 PM EDT reply actions   0 recs

You're probably right, Dave

and I’m happy to have any of these evaluations available. But when discrepancies like these come about, it makes me wonder whether there is a better answer. I’d agree that if anything, that answer is PZR.

by SFiercex4 on Oct 28, 2009 3:02 PM EDT reply actions   0 recs

I say 4 WAR a la Fangraphs

For the simple reason that it is likely to be more predictive.

If I had to guess how Nolasco would do if you gave him identical priors and ran the season again, his expected value would be much closer to 4 WAR than 1 WAR.

by Tommy Bennett on Oct 28, 2009 4:11 PM EDT reply actions   0 recs

Tommy, I would absolutely agree that it's more predictive

but does it tell the story that already happened better? That’s what I was thinking when I looked at both numbers. Did he put up four wins for the Marlins this season alone, or is it closer to that other, smaller number?

by SFiercex4 on Oct 28, 2009 4:35 PM EDT up reply actions   0 recs

WAR is not a predictive stat

It is a "what have I done" stat.

For example, two offensive players can have the same wOBA, play in the same number of games, and play the same defense, but if one of them bats higher in the lineup and is surrounded by better hitters, his WAR value will be higher.

Jeff Zimmerman - Protecting the world from RBI's and Wins from my mom's guest house.

by Jeff Zimmerman (TucsonRoyal) on Oct 28, 2009 5:50 PM EDT up reply actions   0 recs

Wait, what?

No it won’t…will it? I was under the impression that wouldn’t be factored into WAR at all…that would increase the number of RBIs, but those aren’t included in the calculation for WAR…am I missing something here?

Brewers Baseball and other assorted nonsense (mostly the assorted nonsense) at my blog, What's a Tararrel?

by Lefti on Oct 28, 2009 5:54 PM EDT up reply actions   0 recs

Innings Pitched and PA will both be factored into WAR.

Now, some people project players using 600 or 700 at bats (example at driveline). If a player who would be a +20 UZR defender and hit for a good wOBA only plays a few times in the season (Jose Reyes), he won’t accumulate any WAR.

by Jeff Zimmerman (TucsonRoyal) on Oct 28, 2009 6:24 PM EDT up reply actions   0 recs

Just innings/PA’s isn’t adequate for describing their total contributions, though, in terms of “what have I done”. If two players get an equal number of PA’s but one is more “clutch” than the other over that period, they’ll have the same WAR even though the more “clutch” one directly led to more runs being scored for his team. If you’re using WAR strictly to describe the past, you’d be missing some important elements that have a significant effect on W-L. I like WAR for what it is, but I don’t think you can fully call it a “what have I done stat”…

by Missing Barry on Oct 30, 2009 3:36 PM EDT up reply actions   0 recs

I actually have concerns about what WAR does

It’s not a value stat, because it completely ignores timing, nor is it a “true talent level” stat, because it doesn’t incorporate past seasons. It’s kind of a weird hybrid, although very nicely presented and quite useful for blog discussion purposes.

by vivaelpujols on Oct 30, 2009 11:49 PM EDT up reply actions   0 recs

What is timing, though?

In other words, what is the difference between a single with the bases empty and a single with a runner on third?

In each case the batter performs similarly – it is the batters who precede him who perform differently. By rewarding a batter for “timing” without controlling for the quality of opportunities he receives, are we truly measuring value, or are we misattributing value?

Or look upon it this way. In a typical system that accounts for “timing,” a single with the bases empty is credited as though an average batting performance follows. The next batter, who singles with a runner on first, is credited for having performed well with runners on.

But why should we treat the arrow of causation as only going in one direction? Should not the first batter be credited for his performance, given the performance of the second batter above that of an average batter?

What is typically referred to as “timing” or “clutch” and so forth is based upon a narrative idea, one which is compelling but one which does not necessarily tell us anything about real value in baseball.

WAR (at least as constituted – one could certainly swap out RE24 or WPA or whatever one likes for the batting component) is set to measure a batter’s performance in isolation, ignoring the effect of his teammates. This may not be the only definition of retrospective value, but it certainly is a valid one.

by cwyers on Oct 31, 2009 1:34 AM EDT up reply actions   0 recs

That's a great point about the arrow of causation

I never thought about that, and for hitters it’s perfectly valid.

However, for pitchers, it is a different matter, I think. They are responsible for creating their own run environments on each play. I can’t think of a way to measure that though. tRA/24 would have the same problems. If a pitcher on a bad defense was put in more high run scoring situations, his tRA/24 would be higher than that of a pitcher whose defense was better.

I still don’t think WAR should be used for valuing the year end production (IE, how much they actually helped their team win ballgames) of pitchers, and to some degree, hitters.

by vivaelpujols on Oct 31, 2009 2:11 AM EDT up reply actions   0 recs

I share some of your concerns for pitchers, VEP

For hitters, I’m far more confident, for reasons stated by Colin.

After reading all this discussion, both here and on The Book blog, I’m still on the fence for pitchers. It may still have something to do with removing defense as well, as Tango posed an excellent question regarding timing and how defense-independence could screw with that.

by SFiercex4 on Oct 31, 2009 5:22 PM EDT up reply actions   0 recs

WPA/LI incorporates "hitting to the situation"

but doesn’t give more credit to players who get higher leverage situations.

by lookatthosetwins on Nov 2, 2009 4:27 PM EST up reply actions   0 recs

WAR is not a "what I have done" stat either

Not for pitchers at least. It’s a “how well I would have pitched if all timing and BABIP related effects were taken out of the equation” stat. It removes a lot of noise, but also a decent amount of skill. Besides, skill shouldn’t matter much towards production. If two players put up this line: 1 IP, 1 HR, 1 BB, 1 K; however, one of them allows the home run after the walk and the other one allows it before, they obviously weren’t equal pitchers in terms of production.
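To make the sequencing point concrete, here is a toy sketch that scores the same inning line in the two orders described. The event codes and the `runs_from_sequence` helper are made up for illustration, and the two outs not accounted for by the strikeout are assumed to be plain outs with no runner advancement.

```python
def runs_from_sequence(events):
    """Count runs for an inning made up only of walks, homers, and outs.

    Tracking the number of runners is enough here: a walk only forces a
    run home when the bases are loaded, and a homer clears the bases.
    """
    runners = 0
    runs = 0
    outs = 0
    for event in events:
        if outs == 3:
            break  # inning is over; ignore anything listed after the third out
        if event == "BB":
            if runners == 3:
                runs += 1  # bases-loaded walk forces in a run
            else:
                runners += 1
        elif event == "HR":
            runs += runners + 1  # every runner scores, plus the batter
            runners = 0
        elif event in ("K", "OUT"):
            outs += 1
        else:
            raise ValueError(f"unknown event: {event}")
    return runs


walk_then_homer = ["BB", "HR", "K", "OUT", "OUT"]
homer_then_walk = ["HR", "BB", "K", "OUT", "OUT"]

print(runs_from_sequence(walk_then_homer))  # 2
print(runs_from_sequence(homer_then_walk))  # 1
```

Both pitchers have identical component lines, so a FIP-style metric treats them the same, while a runs-allowed metric charges the first pitcher with twice as many runs.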

by vivaelpujols on Oct 28, 2009 7:17 PM EDT up reply actions   0 recs

I was thinking that as well

Again, that gets to the heart of the question, I suppose.

by SFiercex4 on Oct 28, 2009 7:43 PM EDT up reply actions   0 recs

But WAR for pitchers does measure innings

In the NL a pitcher may get the hook if the team is down even though he may still be pitching well; if he is due up to bat, he may be pulled for a pinch hitter.

I guess playing time is the big unknown for hitters and pitchers.

by Jeff Zimmerman (TucsonRoyal) on Oct 28, 2009 8:35 PM EDT up reply actions   0 recs

Honestly, I wouldn’t mind just doing it two ways (context + context neutral) and listing them both. It’s not that either is “better” than the other, it’s just they describe different things and have different pros and cons as others noted. Who’s the better player? Well, a context neutral stat is better to tell us that. Who was actually more responsible for the end results over the season? Without context, we can’t accurately answer that question because context factors like luck play such a big role, so again, it just depends on what you want to know.

So I’d advocate using two methods and posting the results for both, though I’m sure there are difficulties presenting that clearly and whatnot…

by Missing Barry on Oct 28, 2009 4:34 PM EDT reply actions   0 recs

The issue is that (IIRC) Rally computes TotalZone for PITCHER, not for TEAM. So using team bUZR is not going to capture how Rally does it.

BABIP for Nolasco was .322. He allowed 561 BIP. League average BABIP was .298. So:

561*(.322-.298)=13.46

To convert those extra hits on balls in play to runs, multiply by .7, so 9.42.

111/185*9 = 5.4 RA

Adjust by BABIP to get adjusted RA:

(111-9.42)/185*9 = 4.94 RA

I don’t know what that gives you (recall that Rally uses a different pitcher rep-level than FG as well).
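The arithmetic above can be reproduced in a few lines. This is only a sketch of the comment's back-of-the-envelope adjustment; the function and variable names are made up here, and the 0.7 runs-per-hit factor is the one quoted in the comment.

```python
def babip_adjusted_ra(runs_allowed, innings, balls_in_play, babip,
                      league_babip, runs_per_extra_hit=0.7):
    """Strip out runs attributed to a BABIP above (or below) league average."""
    extra_hits = balls_in_play * (babip - league_babip)
    extra_runs = extra_hits * runs_per_extra_hit
    raw_ra = runs_allowed / innings * 9
    adjusted_ra = (runs_allowed - extra_runs) / innings * 9
    return extra_hits, extra_runs, raw_ra, adjusted_ra


# Inputs as quoted in the comment for Nolasco's 2009 season.
extra_hits, extra_runs, raw_ra, adjusted_ra = babip_adjusted_ra(
    runs_allowed=111, innings=185, balls_in_play=561,
    babip=0.322, league_babip=0.298,
)

print(f"extra hits on BIP: {extra_hits:.2f}")   # 13.46
print(f"extra runs:        {extra_runs:.2f}")   # 9.42
print(f"raw RA:            {raw_ra:.2f}")       # 5.40
print(f"adjusted RA:       {adjusted_ra:.2f}")  # 4.94
```

Turning the adjusted RA into WAR still requires a replacement level and a runs-to-wins conversion, neither of which is specified in the comment.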

by cwyers on Oct 28, 2009 5:12 PM EDT reply actions   0 recs

That yields something like 1.4 WAR

Thanks for the correction Colin, I was not aware of that. Nevertheless, I suppose the point still stands, though with less “oomph.” I was using the same replacement level for each one just to keep all other inputs the same.

by SFiercex4 on Oct 28, 2009 7:42 PM EDT up reply actions   0 recs

Not really

“The issue is that (IIRC) Rally computes TotalZone for PITCHER, not for TEAM. So using team bUZR is not going to capture how Rally does it.”

I do it for team. Then, if a team is +50 TZ, and a pitcher pitches 10% of his team BIP, he’s considered to have had +5 defensive support. This will be the same whether his BABIP is .260 or .320.
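As a quick sketch of that proration (the function name and the ball-in-play counts are hypothetical; only the +50 team figure and the 10% share come from the example above):

```python
def prorated_defensive_support(team_defense_runs, pitcher_bip, team_bip):
    """Credit a pitcher with his share of team defensive runs by BIP share.

    Note that the pitcher's own BABIP is not an input: the support figure
    is the same whether he allowed a .260 or a .320 BABIP.
    """
    share = pitcher_bip / team_bip
    return team_defense_runs * share


# Hypothetical BIP counts chosen so the pitcher has 10% of the team's total.
support = prorated_defensive_support(team_defense_runs=50,
                                     pitcher_bip=400, team_bip=4000)
print(f"defensive support: {support:+.1f} runs")  # +5.0 runs
```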

The HK-47 hitting droid is the finest line drive machine ever built

by RallyMonkey5 on Oct 28, 2009 10:12 PM EDT reply actions   0 recs

Thanks for the clarification Rally.

It makes sense, but why did you choose to do it this way over the way Colin suggests?

by SFiercex4 on Oct 28, 2009 11:20 PM EDT up reply actions   0 recs

Defensive sample size

We’ve got people who believe a player’s career defensive rating (or at least last 3-4 years) tells you more about how they played in 2009 than their actual 2009 defense does. Tango and MGL are in this camp. I don’t think we can say, based on 150 innings, that a defense played better behind one pitcher or another.

If one allows a .260 BABIP and another a .340, then I think it more likely that pitcher 2 allowed a lot more hard hit balls. That result is partly from skill, and probably from some bad luck, but either way the blame goes to the pitcher in a “what have you done” metric.

by RallyMonkey5 on Oct 29, 2009 8:01 AM EDT via mobile up reply actions   0 recs

I don't think that's right at all.

The sampling problems with measures like UZR largely have to do with apportioning fielding chances among players. When it comes to measuring team defense, we don’t have those problems. (It’s a split-credit thing – we have a pretty good idea of how to measure team defense, the problem is figuring out how to divide that credit up between individual players.)

As far as “hard-hit balls” – looking at the data, how hard a ball is hit does not have a major impact on how it is fielded (assuming you control for everything else). A ball that’s REALLY hard hit typically leaves the park altogether. The big impacts are the path the ball takes – whether it’s a ground ball or an air ball – and the spray of the ball.

Honestly, I’d rather see no defensive adjustment than the defensive adjustment you’re using, because of the risk of a double penalty – for instance, a pitcher on a good defensive team who receives bad defensive support on days when he pitches. That player’s observed RA is going to be hit twice – first from the effects of his defense, and secondly from the defensive adjustment that assumes his defensive support was good when it really wasn’t.

by cwyers on Oct 29, 2009 12:22 PM EDT up reply actions   0 recs

Defense behind Pitcher

I agree. This would be like assigning a pitcher’s run support based on the team average runs per game instead of using the runs per game when the pitcher actually was the starting pitcher. As we know, those two numbers can sometimes differ significantly, and I would think defense would be the same way.

KJOK

by KJOK on Oct 30, 2009 4:45 PM EDT up reply actions   0 recs

Wouldn't it make more sense to prorate defense based on batted balls?

For example, Nolasco, as a flyball pitcher, should not be rewarded for the Marlins’ bad infield defense, because he would benefit disproportionately from their good OF defense. (This is a hypothetical example; I don’t know whether the Marlins had relatively better IF or OF defense.)
Also, since you have TotalZone on each play, can’t you calculate defense the way PZR does it?

by Alex Krolewski on Oct 28, 2009 11:46 PM EDT up reply actions   0 recs

Probably could

But it would take a lot of work from where I have things set up now.

by RallyMonkey5 on Oct 29, 2009 8:03 AM EDT via mobile up reply actions   0 recs

Jack Z is good, but Beinfest won't let him go until he absolutely has to

My suspicion is that Beinfest knows he was good and will hold on to him until arbitration is over. Maybe next year, though.

by SFiercex4 on Oct 28, 2009 11:37 PM EDT up reply actions   0 recs

I do think Ricky Nolasco is a bit better than J.J. Hardy

And that’s for whom Dave wanted to trade Morrow. I’m not buying it!

by SFiercex4 on Oct 29, 2009 8:28 AM EDT up reply actions   0 recs

Darn! Admittedly, I don't know how smart the Marlins GM is in terms of understanding value and sabermetrics and whatnot.

The only thing I know about the Marlins FO is that their owner made some disparaging comments about Ichiro :(

by Taylor H on Oct 29, 2009 12:06 PM EDT up reply actions   0 recs

I don't think he's that great, but he does seem to have a good scouting eye in general

And with the way Nolasco “bounced back” in the second half, I think most people in the front office think he’s the plus pitcher he probably is, rather than the guy who had a 9.08 ERA before being demoted.

by SFiercex4 on Oct 29, 2009 1:20 PM EDT up reply actions   0 recs

