Baseball Databank Data Dump, I: FIP Constants and Runs Per Win

Hi. I'm Matt. I write here (occasionally) now. 

Perhaps I could have picked something a bit more exciting for my first post, but I kept missing stuff, getting busy with other things, and so on. But I think this might be of utility to some people -- I know there were times I've wished I could easily access stuff like this. 

Like many people, I originally got into SQL databases for baseball thanks to a few posts by Colin Wyers at the sadly-departed Statistically Speaking blog (Colin has since posted new versions of his introductions at The Hardball Times.)

This isn't about SQL, though. It's about putting some data out there for people who want to use FIP scaled to a particular season or need a runs-to-win conversion but don't want to mess with the database stuff themselves. I'll also publish the queries down below for those curious (they are very simple), which I came up with for use in MySQl and the pitching table in the freely available Baseball Databank.


FIP Scaled Historically

You probably already know about FIP (Fielding Indepedent Pitching),  created by Tom Tango. (If not, click here for a good introduction), the simple and beautiful stat that gives us an idea of how well a pitcher performed independently of batted ball luck and the fielders behind him by focusing on things he can control. The basic formula is:

((13*HR)+(3*(BB+HBP-IBB))-(2*K))/IP+constant

Where the "constant" is usually around 3.2, but is adjusted for each season/entity to that the leagues season/entity ERA is the same as the season/entity FIP. Obviously, it's a pain to calculate each season individually, but that's why computers were invented. So I came up with an easy query for that. Not only that, but if you're like me and many others, you hate the "earned run" rule for various reasons I can't get into here. Not only that, but you realize that when, say, calculating Wins Above Replacement, using a stat scaled to ERA undervalues a pitcher relative to position players. So, instead of scaling FIP to ERA, why not to RA? Well, I did both with this query. And here are the results for easy bookmarking:

Historical FIP Scales, 1871-2009

The numbers in the "FIP-ERA" and "FIP-RA" columns are the "constants" as listed above that you need to use to scale them to the ERA/RA for the season. I have two separate sheets, one that groups all the major leagues by season, and the other that groups by season and league (i.e., separates AL and NL).

 

Runs Per Win

You can read the link in the heading if you aren't sure what this means. In the contemporary baseball era, you're pretty safe assuming that 10 runs = one win. But sometimes you want to be more precise, so I've come up with a table for that, too, comparing the results of three different runs per win estimators

  1. Simple Runs Per Game (or lgRA times two).
  2. Tom Tango's simple formula: lgRA*1.5+3
  3. Patriot's PythagenPat version: (league Runs Per Game^(1-z)) times two, where z is between 0.27 and 0.29. I use 0.287 because that's what I read originally and I'm too lazy to figure out if I should change it.

Here is the table:

Runs Per Win, 1871-2009

As noted in the first link in this section, the Pythagorean win expectation is the heart of the runs-to-win conversion, and since PythagenPat is the most accurate win expectation estimator across all contexts, so will its run to win conversion be the most accurate (although, again, you're pretty safe just going with 10 in a pinch). But I actually still use Tango's version in a lot of my SQL's because it's easier to work with quickly, and I'm not sure the loss in precision is worth all the effort, although I will eventually change all my SQLs over to PythagenPat. And, as you can see from the table, in the contemporary era there really isn't all that much of a difference, and even RPG isn't that far off.

I hope this is useful to someone out there, and if not, well, I hope I'm more interesting next time. I'll spare you the SQLs on the front page, but I do have them after the break for those that are interested.

These are for MySQL with the Baseball Databank. Both of these are the "group by league" versions, but if you take out the lgID in the "select" and "group" lines, it will group by all the major leagues of that year.

FIP Constants

SELECT p.yearID, p.lgid , SUM(COALESCE(IPOuts,0)/3) AS IP ,((SUM(COALESCE(ER,0))/SUM(COALESCE(IPOuts,0)/3))*9)AS lgERA , ,((SUM(COALESCE(ER,0))/SUM(COALESCE(IPOuts,0)/3))*9) - ((SUM(COALESCE(HR,0)*13)+(SUM(COALESCE(BB,0))+SUM(COALESCE(HBP,0))- SUM(COALESCE(IBB,0)))*3-SUM(COALESCE(SO,0))*2)/(SUM(COALESCE(IPOuts,0))/3)) AS FIPERScale , ,((SUM(COALESCE(R,0))/SUM(COALESCE(IPOuts,0)/3))*9) AS lgRA , ,((SUM(COALESCE(R,0))/SUM(COALESCE(IPOuts,0)/3))*9) - ((SUM(COALESCE(HR,0)*13)+(SUM(COALESCE(BB,0))+SUM(COALESCE(HBP,0))- SUM(COALESCE(IBB,0)))*3-SUM(COALESCE(SO,0))*2)/(SUM(COALESCE(IPOuts,0))/3)) AS FIPRAScale FROM bdb.pitching p GROUP BY p.yearID, p.lgid ORDER BY p.yearID DESC

Runs Per Win

 

SELECT yearID, lgID

, (((SUM(R)/SUM(IPOuts/27))*1.5)+3) AS RPWTango

, (SUM(R)/SUM(IPOuts/27))*2 AS RPG

, 2*(POW(SUM(R)/SUM(IPOuts/27)*2, 0.713)) PythPat

FROM bdb.pitching

GROUP BY yearID, lgID

ORDER BY yearID DESC

;

X
Log In Sign Up

forgot?
Log In Sign Up

Please choose a new SB Nation username and password

As part of the new SB Nation launch, prior users will need to choose a permanent username, along with a new password.

Your username will be used to login to SB Nation going forward.

I already have a Vox Media account!

Verify Vox Media account

Please login to your Vox Media account. This account will be linked to your previously existing Eater account.

Please choose a new SB Nation username and password

As part of the new SB Nation launch, prior MT authors will need to choose a new username and password.

Your username will be used to login to SB Nation going forward.

Forgot password?

We'll email you a reset link.

If you signed up using a 3rd party account like Facebook or Twitter, please login with it instead.

Forgot password?

Try another email?

Almost done,

By becoming a registered user, you are also agreeing to our Terms and confirming that you have read our Privacy Policy.

Join Beyond the Box Score

You must be a member of Beyond the Box Score to participate.

We have our own Community Guidelines at Beyond the Box Score. You should read them.

Join Beyond the Box Score

You must be a member of Beyond the Box Score to participate.

We have our own Community Guidelines at Beyond the Box Score. You should read them.

Spinner.vc97ec6e

Authenticating

Great!

Choose an available username to complete sign up.

In order to provide our users with a better overall experience, we ask for more information from Facebook when using it to login so that we can learn more about our audience and provide you with the best possible experience. We do not store specific user data and the sharing of it is not required to login with Facebook.

tracking_pixel_9351_tracker