I don't intend to give every metric it's own article, though it may certainly seem that way after both wOBA and FIP seem to be getting special attention this week. But calculating some of the minus metrics can get a bit complicated for beginners. At least, it certainly took me more than a few unsuccessful queries to get it right and I thought I might try and save you some of the aggravation.

As with calculating wOBA, the Fangraphs "GUTS" download will prove extremely helpful in generating single-season FIP totals for your database. Colin pointed out in Monday's edition of Saberizing that you can certainly calculate both wOBA weights and yearly FIP constants yourself using methods outlined at BASQL and at The Book Blog, but I've found that generally people prefer their calculations to match those at Fangraphs.

I'm sure we all know the FIP formula by heart at this point:

C= (HR*13+(BB+HBP)*3-K*2)/IP

Which in SQL will look like this:

SELECT CONCAT(m.namefirst," ",m.namelast) as NAME

, p.yearID

, ROUND(g.cFIP+(sum(p.HR)*13+(sum(p.BB)+sum(p.HBP))*3-sum(p.so)*2)/(sum(p.ipouts)/3),2) as FIP

, ROUND((sum(p.ipouts)/3),1) as IP

FROM pitching p, master m, GUTS g

WHERE p.playerID = m.playerID

AND p.yearID = 2012

AND g.season = p.yearID

GROUP BY p.playerID, yearID

HAVING IP > 150

ORDER BY FIP asc

LIMIT 10;

If your SQL is telling you there is an error like "BIGINT UNSIGNED value is out of range," then your integers are 'unsigned' (meaning they won't convert into negative values) and you may need to re-write your code like this:

, ROUND(g.cFIP+(CAST(SUM(p.hr) AS SIGNED)*13+CAST(SUM(p.hbp) AS SIGNED)*3+CAST(SUM(p.bb) AS SIGNED)*3-CAST(SUM(p.ibb) AS SIGNED)-CAST(SUM(p.so) AS SIGNED)*2)/

(SUM(p.ipouts)/3),2) as FIP

Either way, your results should match up exactly with Fangraphs FIP leaderboard for 2012. I always then prefer to remove IBB from the formula, even though Fangraphs does not, but this is entirely up to you, obviously.

### FIP- / ERA-

Once you've got the formula down, then, getting the minus metrics shouldn't be too difficult. Although, fair warning here: You may have difficulty getting your minus metrics to line up perfectly with Fangraphs unless you download their Park Factors, which in their present format are a bit of a pain for those planning to download multiple seasons. If someone wants to make a giant file combining each season and share with the rest of the group, that would be wonderful. (I made it to 1980 and then nearly died of boredom).

It also depends on whether you want to compare to the league average or the MLB average. Typically your numbers won't be off more than a point or two, but at least you'll gain an appreciation for the true impact park factors have on our measurements. I'll show you what I do and you can adjust yours accordingly.

First you need to create a "League-Average" table and index that bad boy:

CREATE TABLE lg_averages AS

SELECT p.yearID, p.lgID

, ROUND(SUM(p.ER)/(SUM(p.ipouts/3))*9,2) as lgERA

, ROUND(g.cFIP+(SUM(p.HR)*13+(SUM(p.BB)+SUM(p.HBP))*3-SUM(p.so)*2)/(SUM(p.ipouts)/3),2) as lgFIP

FROM pitching p, GUTS g

WHERE p.yearID = g.season

GROUP BY yearID, lgID;CREATE INDEX year_idx ON lg_averages (yearID);

Especially for recent seasons, I prefer to divide up the averages by AL/NL due to the fairly significant differences between the two leagues at the moment. (I believe Baseball-Reference does not do this for their ERA+ metric, however). This is achieved, you may have noticed, by adding "lgID" to the GROUP BY clause. If you prefer to create an "mlb_averages" table, simply remove "lgID" from the GROUP BY clause.

Then return to your original single-season FIP leaders query and divide the FIP formula by the lgFIP from our lg_averages table:

((g.cFIP+(SUM(p.HR)*13+(SUM(p.BB)+SUM(p.HBP))*3-SUM(p.so)*2)/(SUM(p.ipouts)/3)) / l.lgFIP

Now it's time to add your park factor. The Lahman database actually comes with a set of park factors that are certainly more than adequate for our purposes. You can find them in the "teams" table labeled "ppf" for "Pitcher's Park Factor". The ppf is designed to contribute towards ERA+ measurements, so we have to remember to *flip it upside down,* so to speak. You can go about this a number of ways, but for me it works best to subtract the ppf from 200. So the 2012 ppf for Seattle changes from 91 to 109 (or 200-91 = 109).

So in the end you have this:

, ROUND(((g.cFIP+(SUM(p.HR)*13+(SUM(p.BB)+SUM(p.HBP))*3-SUM(p.so)*2)/(SUM(p.ipouts)/3))/l.lgFIP*(200-t.ppf)),0) as FIP_minus

Remember now, you've added two more tables to the query, so you will need to define something on which they can relate. You'll need to join the *pitching, GUTS, lg_averages, *and *teams* tables all on 'yearID'/'season', join *teams* and *pitching* on 'teamID' and *lg_averages* and *pitching* on 'lgID'. Your final product should look something like this:

SELECT

CONCAT(m.namefirst," ",m.namelast) as NAME

, p.yearID, p.lgID

, ROUND(SUM(p.ER)/(SUM(p.ipouts/3))*9,2) as ERA

, ROUND(SUM(p.ER)/(SUM(p.ipouts/3))*9/l.lgERA*(200-t.ppf),0) as ERA_minus, ROUND(g.cFIP+(sum(p.HR)*13+(sum(p.BB)+sum(p.HBP))*3-sum(p.so)*2)/(sum(p.ipouts)/3),2) as FIP

, ROUND(((g.cFIP+(sum(p.HR)*13+(sum(p.BB)+sum(p.HBP))*3-sum(p.so)*2)/(sum(p.ipouts)/3))/l.lgFIP*(200-t.ppf)),0) as FIP_minus

, ROUND((sum(p.ipouts)/3),1) as IP

, t.PPF

FROM pitching p, master m, GUTS g, lg_averages l, teams t

WHERE p.playerID = m.playerID

AND p.yearID = l.yearID and p.yearID = t.yearID AND g.season = p.yearID AND p.yearID = 2012

AND p.teamID = t.teamID

AND p.lgID = t.lgID AND p.lgID = l.lgID

GROUP BY p.playerID, yearID, lgID

HAVING IP > 150

ORDER BY FIP asc

LIMIT 10;

Returning this group of names:

NAME | yearID | lgID | FIP | ERA | ERA_minus | FIP_minus | IP | PPF |
---|---|---|---|---|---|---|---|---|

Gio Gonzalez | 2012 | NL | 2.82 | 2.89 | 72 | 72 | 199.3 | 101 |

Stephen Strasburg | 2012 | NL | 2.83 | 3.16 | 79 | 72 | 159.3 | 101 |

Felix Hernandez | 2012 | AL | 2.84 | 3.06 | 82 | 75 | 232.0 | 91 |

Clayton Kershaw | 2012 | NL | 2.89 | 2.53 | 67 | 77 | 227.7 | 96 |

Justin Verlander | 2012 | AL | 2.94 | 2.64 | 63 | 69 | 238.3 | 103 |

David Price | 2012 | AL | 3.05 | 2.56 | 67 | 79 | 211.0 | 93 |

Adam Wainwright | 2012 | NL | 3.10 | 3.94 | 103 | 82 | 198.7 | 97 |

Cliff Lee | 2012 | NL | 3.13 | 3.16 | 79 | 79 | 211.0 | 101 |

Wade Miley | 2012 | NL | 3.15 | 3.33 | 79 | 76 | 194.7 | 106 |

R.A. Dickey | 2012 | NL | 3.27 | 2.73 | 71 | 86 | 233.7 | 97 |

Now, before any of the SQL gurus out there get on my case, I should mention that this won't work for a player like Zack Greinke, who switched leagues in the middle of the season. We can look at how to handle cases like that at a later date. For now, though, I'm hoping you just get comfortable with the idea of making joins, creating lg-avg tables, and using park factors to expand your SQL game.

As always, don't be afraid to ask questions in the comments and the community at large and myself will be eager to assist you.

##### More from the Saberizing a Mac series:

*Thanks to Fangraphs and Sean Lahman for making all of our lives a little nerdier.*

*Follow @JDGentile on twitter.*