On Fielding Independent Pitching, Wins Above Replacement, and Web Development

The truth about Wins Above Replacement (WAR) is that it's not really an official statistic. Rather, it is more of a general model, with multiple implementations coming from various sources.

Two of the most oft-cited implementations are FanGraphs' WAR and Rally's WAR (often cited as Sean Smith's WAR or BaseballProjection.com WAR; this is also the WAR used on Baseball-Reference.com). For position players, the implementations are similar, despite some key differences that can make certain players vary in value depending on the WAR source you use.

For example, Dustin Pedroia's 2008 MVP season was worth 6.6 WAR according to FanGraphs and 5.2 WAR according to Rally's WAR. What's the difference? The main one is the defensive metric used. FanGraphs uses Ultimate Zone Rating (UZR), which had Pedroia worth 9.9 runs above average. Rally's WAR uses Total Zone, which had Pedroia worth 1 run above average (range and double play combined).
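To see how much of that gap the defensive metric can plausibly explain, here is a quick back-of-the-envelope sketch. The WAR and fielding numbers are the ones quoted above; the conversion of roughly 10 runs per win is a common rule of thumb that I'm assuming here, not something either implementation is confirmed to use exactly.

```python
# Figures quoted in the post for Dustin Pedroia's 2008 season.
fangraphs_war = 6.6
rally_war = 5.2
uzr_runs = 9.9          # FanGraphs' defensive input (UZR)
total_zone_runs = 1.0   # Rally's defensive input (Total Zone)

# ASSUMPTION: about 10 runs per win, a rule of thumb, not a figure from the post.
RUNS_PER_WIN = 10.0

war_gap = fangraphs_war - rally_war
defense_gap_runs = uzr_runs - total_zone_runs
defense_gap_wins = defense_gap_runs / RUNS_PER_WIN

print(f"WAR gap between the two sources: {war_gap:.1f}")
print(f"Defensive gap: {defense_gap_runs:.1f} runs, about {defense_gap_wins:.2f} wins")
print(f"Left for other differences: about {war_gap - defense_gap_wins:.2f} wins")
```

Under that assumption, defense accounts for most of the difference but not all of it, which fits the idea that the position-player implementations have other, smaller differences too.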

The two metrics were in much better agreement in 2009, as FanGraphs had Pedroia at 5.0 WAR and Rally had Pedroia at 4.9 WAR. So far in 2010, Rally has Pedroia rated a bit higher (3.6 to 3.3). Total Zone has Pedroia already surpassing his career bests with the glove while FanGraphs is a bit more conservative.

Where these two WAR implementations deviate from each other even further is pitching WAR. FanGraphs' version is based on Fielding Independent Pitching (FIP), while Rally's WAR is based on runs allowed, which are then adjusted for defense and park factors.

I've noticed two main areas of focus in sabermetrics—projection of the future and analysis of the past. Personally, I'm far more interested in the much smaller camp—that which analyzes the past. It seems to me that each version of pitcher WAR should be used by different camps. FanGraphs' pitcher WAR is better geared toward future projection while Rally WAR is better for analyzing the past.

I get the value in FIP: you take walks, strikeouts, and homers and see how a pitcher would do if the defense were taken out of the equation. In many ways, it puts pitchers on a level playing field. But allow me to make an analogy between WAR and my profession: web development.


I'm a front-end web developer. What does that mean? Basically, I build the stuff on a web site that you see. If you're at all familiar with web development, then you probably know the bane of our collective existence: Internet Explorer.

Internet Explorer is shitty defense. The most beautifully written markup and style sheets can be sent to Internet Explorer only to be blown to hell. We developers are then forced to expand our skills by learning to deal with the elements that surround us (in this case, a browser with enormous market share that rejects industry standards and does its own thing), modify our approach, and deliver markup and style sheets that may not be written "by the book" but work in Internet Explorer.

Perfect, standards-compliant front-end code is Fielding Independent Pitching. If you write it well, you're a good developer. It should all just work, but it doesn't. No pitcher whiffs 27 in every game, so defense has to come into play. No (public-facing) website can be built just for Firefox or WebKit. You have to deal with Internet Explorer.

Have a great infield behind you? Throw more sinkers with men on base to get those double plays. Got fast outfielders who get good jumps? Don't be afraid to throw the high heat. Your guys can get to the gaps quickly. Pitchers can modify their basic approaches based on the external factors around them, and that's why I'm uncomfortable analyzing past results solely on FIP. If I deliver a perfect site that fails in Internet Explorer, nobody is going to give me an 8.5 WAR season for my effort. FIP (and by extension FanGraphs' WAR) says "in an ideal world, this is what we could reasonably expect to happen". That's what makes it great for projection. You can take that projection and adjust it for the anticipated environment.

But when talking about past performance, we know that it didn't occur in an ideal world. That's why we should start with what actually happened (runs allowed) as the baseline and then start adjusting for other factors.

FIP doesn't capture the ability to overcome a key error by your shortstop by inducing a timely grounder. It also doesn't capture the ability to diagnose and fix a guillotine bug (yes, this is the type of crap we deal with). But these skills are key to finishing the job and succeeding.

Rally for the past. FanGraphs for the future.

Comments

This, to me, raises the question of what exactly the point is of FanGraphs WAR and pitching stats like xFIP, tRA, FIP, and SIERA. None of them are good measures of retrospective value because they strip out a lot more than just defense, and they aren’t as good at projecting future performance as a projection system because they only take one year of data into account.

Can somebody give me a practical usage for FIP? I guess you could argue that it tries to isolate the skill involved in past performance, because it only uses stats that are very stable year to year, but again, I think a projection system that used multiple years of data would be better for isolating skill in past performance.

by vivaelpujols on Jun 25, 2010 3:00 PM EDT

I think the practical usage is determining whether someone has the basic components to be good

Since FIP places heavy emphasis on the building blocks of pitching (Ks, BBs, and HR), it is probably great for identifying young or minor league pitchers who are pitching a lot better than their record or ERA might indicate. FIP is useful as an easily identifiable number that is a compilation of statistics that indicates how well a pitcher is doing at the most basic level.

The problem that I have with it (and the others you mention) is that it’s not just fielding-independent (which is a questionable goal in the first place), but it’s context-independent. Defense and who you’re pitching to matter, and pitchers constantly adjust to hitters (and vice versa). They might strike a guy out in one at bat, but realize that the hitter fouled off a hung curve and instead try a different approach in the next at bat.

It’s not at all rooted in the reality of the past; it’s rooted in how the designer of FIP thinks the past should look. The pre-conceived picture yields a stat that looks a lot more like user preference than reality.

by deacs on Jun 25, 2010 3:15 PM EDT

It's interesting that Dave Cameron made the opposite argument in a Fangraphs chat:
Dave Cameron:
The only difference between FIP and xFIP is the way HRs are treated. FIP uses HR rate as the variable in the formula, while xFIP uses flyball rate, under the presumption that variances in HR/FB rates are mostly luck and not true skill. FIP is a better representation of how many runs a pitcher should have given up in the past, while xFIP is a better representation of how many runs a pitcher will give up in the future. tERA is like FIP with more granular batted ball.
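To make the FIP/xFIP distinction in that quote concrete, here is a rough sketch. It uses the commonly published form of the FIP formula, (13*HR + 3*(BB+HBP) - 2*K) / IP plus a league constant, and swaps in expected home runs (fly balls times a league HR/FB rate) for xFIP. The constant of 3.10, the 10% league HR/FB rate, and the pitcher's stat line are all made-up assumptions for illustration, not real data.

```python
def fielding_independent(hr, bb, hbp, k, ip, constant=3.10):
    """Commonly published FIP form. `constant` is league-specific;
    3.10 is an ASSUMED placeholder, not an official value."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# HYPOTHETICAL pitcher season (invented numbers, not a real player).
ip, k, bb, hbp = 200.0, 180, 50, 5
actual_hr = 15
fly_balls = 200

# ASSUMPTION: a league-average HR/FB rate of 10%.
LEAGUE_HR_PER_FB = 0.10
expected_hr = fly_balls * LEAGUE_HR_PER_FB

fip = fielding_independent(actual_hr, bb, hbp, k, ip)     # uses actual HR
xfip = fielding_independent(expected_hr, bb, hbp, k, ip)  # uses expected HR

print(f"FIP:  {fip:.2f}")
print(f"xFIP: {xfip:.2f}")
```

This invented pitcher allowed fewer homers than his fly-ball total would suggest, so xFIP comes out higher than FIP: it treats the low HR/FB rate as something that probably won't persist.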

by deacs on Jun 25, 2010 3:15 PM EDT

He's concentrating on FIP vs xFIP

Sure, he’s right when it comes to those two statistics, but he also says “should” have given up when talking about FIP. Completely ridiculous example, but what happens if a guy has a .150 BABIP over the course of a season even though he has a LD% of 50% and ends up having a wOBA of .250 as a result? He “should” have done better, but he didn’t. That’s not reflected in wOBA and therefore not reflected in FanGraphs’ WAR calculation. Rally’s WAR calculation is more analogous to such an approach when it comes to pitching WAR.

by CajoleJuice on Jun 25, 2010 4:50 PM EDT

I agree with what you're saying

I see FIP as an indicator of talent, not an indication of actual success (based on results).

I know he’s talking about FIP vs. xFIP, but he still refers to FIP as a good retrospective tool, which it’s not. FIP is designed to look for a specific type of pitcher (essentially, a high strikeout pitcher who doesn’t allow homeruns). What it determines as good are usually very talented pitchers, but the actual results (and I’m not talking wins/losses) vary within the FIP rankings.

There needs to be another, broader way to package meaningful results into an ERA-type number (tERA doesn’t capture it either).

by deacs on Jun 25, 2010 7:15 PM EDT

This is incorrect
FIP is designed to look for a specific type of pitcher (essentially, a high strikeout pitcher who doesn’t allow homeruns).

FIP takes into account exactly 4 factors of pitching in the formula:

-number of strikeouts
-number of walks
-number of home runs
-number of balls in play

It then assumes a league average run value for each of those 4 things. Players who allow weaker contact on balls in play will be underrated by FIP, but the whole point of FIP is to remain agnostic on the value of balls in play.

FIP doesn’t have a bias towards high strikeout pitchers. You can still be a high strikeout pitcher and have a terrible FIP if the rest of your stats suck (see Oliver Perez). It has a bias against pitchers who allow weaker contact on balls in play and pitchers who pitch better with runners on base, and some other less important things.
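A quick sketch of that point, using the commonly published FIP formula. Note that this form divides by innings pitched, so balls in play only enter indirectly through the outs recorded. The 3.10 constant is an assumed placeholder for the league-specific value, and both stat lines below are invented for illustration; neither is a real pitcher.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Commonly published FIP form: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant.
    The default constant is an ASSUMPTION standing in for the league value."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# HYPOTHETICAL high-strikeout pitcher who is wild and homer-prone.
wild_strikeout_guy = fip(hr=30, bb=100, hbp=5, k=190, ip=180.0)

# HYPOTHETICAL low-strikeout pitcher with good control who keeps the ball in the park.
control_guy = fip(hr=15, bb=30, hbp=3, k=90, ip=180.0)

print(f"High-K, wild:      {wild_strikeout_guy:.2f}")
print(f"Low-K, in control: {control_guy:.2f}")
```

With these made-up lines, the strikeout pitcher ends up with the clearly worse FIP, because the walks and homers outweigh the credit he gets for the Ks.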

by vivaelpujols on Jun 25, 2010 7:23 PM EDT

Yet, you can be a low strikeout pitcher and still be successful

For instance (prior to this year), Nick Blackburn.

I understand the components of FIP, but Ks are the only positive action factored in that a pitcher controls. What I was trying to say is that high Ks are a necessary condition for a good FIP, but are not a necessary condition for actual good results. Because they are the only positive action controlled by the pitcher that is factored into FIP, FIP is not a great reflection of how a pitcher actually pitched (given that there are numerous other things a pitcher controls or can greatly influence: quality of contact, type of hit, double play, etc). It is, as is noted here, handy for seeing how a pitcher might pitch going forward.

FIP or xFIP say little about what actually happened on the field, and shouldn’t be used for that purpose. Anything that totally divorces a pitcher’s success from his fielders is not going to be useful as a retrospective tool.

by deacs on Jun 27, 2010 9:12 PM EDT

Pitcher's wOBA

I’ve tried creating win values based on pitcher’s wOBA allowed. I think this would be a great way to show how well a pitcher actually did in the past. My biggest problem was finding replacement levels for starters vs. relievers.

HS team nickname: Redmen, College team nickname: Warriors, Amateur team nickname: Chiefs, Favorite MLB team: Braves. Holy political incorrectness...

by LeeTro on Jun 26, 2010 2:04 PM EDT
