A Quick Primer

A couple of readers have asked me for some quick background on a few areas of "sabermetrics," so, with the User Diary, I figured I'd post some information on the stats that you'll see most often here.

VORP (Value Over Replacement Player) - This is a Baseball Prospectus stat that estimates the number of runs a player contributes to his team beyond what a "replacement level" player would produce. A "replacement level" player is essentially someone who pretty much sucks. He puts up an OPS about 70 points lower than the league's positional average. VORP is best used to compare a player with others at his position or on his team, and it's a good measure of how valuable a player is.

DIPS (Defense Independent Pitching Statistics) - In 2000, Voros McCracken, current industry legend, did a study and determined that pitchers have little or no control over whether balls hit into play become hits. So, essentially, pitchers have the most control over walks, strikeouts, home runs allowed, and ground balls and fly balls given up. The rest? Luck or defense.

The best metaphor I can think of is this:

  • Imagine a defense of 8 players who refused to move when the ball was hit. Pretty much any ball that's hit in play will be a hit. The pitcher, who is playing with the worst possible defense, needs to strike people out and avoid walking people and giving up homers, because that's all he can do to prevent runs himself.
  • Then imagine a defense full of 8 guys who run at the speed of light and catch everything. NOTHING will fall in. A pitcher with this defense is much luckier than the guy who got stuck with the first one. He can afford to let balls be hit into play, but then he isn't really responsible for his own success.

Essentially, DIPS (seen as dERA) factors out the "luck" and keeps it to what the pitcher can best control. Comparing it to ERA can show you which pitchers were lucky and will probably suffer a drop-off, and which were unlucky and are likely to bounce back. Al Leiter is the best example of this from last year. His dERA was at 4.79, I think, and his regular ERA was 3.20. That's a major discrepancy.

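Anything/PA or BF - Just another way to calculate a rate stat: take any counting stat and divide it by plate appearances (for hitters) or batters faced (for pitchers).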

WARP (Wins Above Replacement Player) - Similar to VORP but in terms of wins rather than runs.

.xxx/.xxx/.xxx - This is the easiest way to express a hitter's capabilities. It's Batting Average/On Base Percentage/Slugging Percentage. The last two are the ones you most want to be high. Both OBP and SLG have been shown to correlate with run scoring much more strongly than batting average does, so it is good to think of players in this context. RBIs and runs scored are too dependent on the team.
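For anyone who wants to work out a slash line by hand, here's a minimal Python sketch. The function and argument names are mine, and the OBP denominator (AB + BB + HBP + SF) is the standard definition rather than anything spelled out above.

```python
def slash_line(h, doubles, triples, hr, bb, hbp, ab, sf):
    """Return (AVG, OBP, SLG) from a hitter's counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab
    return avg, obp, slg


def format_slash(avg, obp, slg):
    """Format the three rates the usual way, without the leading zero."""
    return "/".join(f"{x:.3f}".lstrip("0") for x in (avg, obp, slg))


# Made-up season: 150 hits (30 doubles, 5 triples, 20 homers) in 500 at-bats,
# plus 60 walks, 5 hit-by-pitches and 5 sacrifice flies.
print(format_slash(*slash_line(150, 30, 5, 20, 60, 5, 500, 5)))  # .300/.377/.500
```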

EqA (Equivalent Average) - Another Baseball Prospectus stat. The raw version is (H + TB + 1.5*(BB + HBP + SB) + SH + SF) / (AB + BB + HBP + SH + SF + CS + SB), which is then normalized to park and league. EqA follows a batting average scale; .260 is average, .300 is very good.
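Here's a rough Python sketch of just the raw part of that formula. The park and league normalization isn't spelled out here, so the code deliberately stops short of it; the function name and the sample stat line are made up for illustration.

```python
def raw_eqa(h, tb, bb, hbp, sb, cs, sh, sf, ab):
    """Raw (un-normalized) EqA, following the formula as written in the text."""
    numerator = h + tb + 1.5 * (bb + hbp + sb) + sh + sf
    denominator = ab + bb + hbp + sh + sf + cs + sb
    return numerator / denominator


# Made-up season: 150 H, 250 TB, 60 BB, 5 HBP, 10 SB, 4 CS, 0 SH, 5 SF, 500 AB.
print(f"{raw_eqa(150, 250, 60, 5, 10, 4, 0, 5, 500):.3f}")
```

Keep in mind that the raw figure isn't on the batting average scale yet; that only happens after the normalization step.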

Isolated Power - The best measurement to determine a player's power, it accounts for extra base hits: SLG - AVG, which works out to extra bases per at-bat, or (2B + 2*3B + 3*HR) / AB.
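Isolated Power is commonly computed as SLG minus AVG, which is the same thing as extra bases per at-bat. A quick Python sketch showing that the two routes agree (the function names and the sample numbers are just for illustration):

```python
def iso_from_rates(slg, avg):
    """Isolated Power as slugging percentage minus batting average."""
    return slg - avg


def iso_from_counts(doubles, triples, hr, ab):
    """Isolated Power as extra bases per at-bat."""
    return (doubles + 2 * triples + 3 * hr) / ab


# Made-up hitter: .300 AVG, .500 SLG, built from 30 doubles, 5 triples
# and 20 homers in 500 at-bats.
print(f"{iso_from_rates(0.500, 0.300):.3f}")
print(f"{iso_from_counts(30, 5, 20, 500):.3f}")
```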

PECOTA - Baseball Prospectus's projection methodology. It uses a massive database of past seasons and looks for comparable players to predict a player's performance. PECOTA data is most useful right before and early in a season, when you're trying to predict what a player will do.

Pythagenport record - Bill James created the "Pythagorean formula" of baseball, which is a way to relate run scoring to winning percentage.

Winning percentage = RS^2 divided by (RS^2+RA^2), where RS is runs scored and RA is runs allowed.

Pythagenport record doesn't use a fixed exponent of 2; rather, it determines the exponent based on the "run scoring environment" (in some years, offense is heavier than pitching, and vice versa).

X = .45 + 1.5 * log10 ((RS+RA) / G), where X is the exponent that replaces the 2 and G is games played.
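To see how the two versions differ in practice, here's a small Python sketch of both formulas. The function names and the 800-runs-scored, 700-runs-allowed team are my own invention.

```python
import math


def pythagorean_pct(rs, ra, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    return rs ** exponent / (rs ** exponent + ra ** exponent)


def pythagenport_exponent(rs, ra, games):
    """Exponent that depends on the run environment (total runs per game)."""
    return 0.45 + 1.5 * math.log10((rs + ra) / games)


def pythagenport_pct(rs, ra, games):
    """Expected winning percentage using the run-environment exponent."""
    return pythagorean_pct(rs, ra, pythagenport_exponent(rs, ra, games))


# Made-up team: 800 runs scored, 700 allowed over 162 games.
print(f"fixed exponent of 2:   {pythagorean_pct(800, 700):.3f}")
print(f"Pythagenport exponent: {pythagenport_exponent(800, 700, 162):.2f}")
print(f"Pythagenport record:   {pythagenport_pct(800, 700, 162):.3f}")
```

Multiply either percentage by the number of games to get an expected win total.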

Secondary average - Another stat to tell us what we already know. It counts what a hitter adds beyond his batting average (extra bases, walks, and steals), so it's a good way to see how worthwhile that batting average has been. If a player is hitting .270 with a .200 SecA, it's a "weak" or "fairly worthless" .270.

The formula is (2B + 2*3B + 3*HR + BB + SB - CS) / AB.
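A direct Python translation of that formula, as a sketch (the function name and sample season are hypothetical):

```python
def secondary_average(doubles, triples, hr, bb, sb, cs, ab):
    """Secondary average: extra bases, walks and net steals per at-bat."""
    return (doubles + 2 * triples + 3 * hr + bb + sb - cs) / ab


# Made-up season: 30 doubles, 5 triples, 20 homers, 60 walks,
# 10 steals, 4 times caught, 500 at-bats.
print(f"{secondary_average(30, 5, 20, 60, 10, 4, 500):.3f}")
```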

Break-even point - The point at which the stolen base is a worthwhile play. A tenet in sabermetrics is to preserve outs, and stolen bases tend to be a bad play if you don't convert on them at least 70% of the time. This is important. In this respect, a runner who steals 50 bases but gets caught 25 times (a 67% success rate) is actually hurting his team more than he is helping it. This is one reason why saber-folks love Carlos Beltran so much; his stolen base percentages have been very beneficial to run scoring.
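The 50-steal, 25-caught example is easy to check in Python. This sketch treats the 70% figure quoted above as a fixed threshold, which is a simplification; the names are mine.

```python
BREAK_EVEN = 0.70  # rule-of-thumb threshold from the text, not an exact constant


def sb_success_rate(sb, cs):
    """Fraction of stolen base attempts that succeed."""
    return sb / (sb + cs)


def steals_are_worthwhile(sb, cs, break_even=BREAK_EVEN):
    """True if the runner converts at or above the break-even rate."""
    return sb_success_rate(sb, cs) >= break_even


print(f"{sb_success_rate(50, 25):.3f}")   # 0.667
print(steals_are_worthwhile(50, 25))      # False
```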

BABIP - Batting Average on Balls in Play. Usually used for pitchers. BABIP is a foundation of DIPS. Pitchers with very high BABIPs are usually experiencing poor defense / luck, and pitchers with very low BABIPs are usually experiencing very good defense / luck.
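I haven't given a formula for BABIP here, so as a sketch, this Python snippet uses the commonly cited version, (H - HR) / (AB - K - HR + SF). Treat that formula, the function name, and the sample pitcher as assumptions on my part.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: non-homer hits per ball put in play."""
    balls_in_play = ab - k - hr + sf
    return (h - hr) / balls_in_play


# Made-up pitcher: 200 hits and 20 homers allowed, 800 at-bats against,
# 150 strikeouts, 5 sacrifice flies.
print(f"{babip(200, 20, 800, 150, 5):.3f}")
```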

FIP - An easier to calculate approximation of DIPS, it is ((13*HR + 3*BB - 2*K) / IP) + 3.2, which gives a fair approximation of what the pitcher's ERA would be with an average defense / luck.
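Here's a minimal Python sketch of FIP, reading the formula as the weighted sum of homers, walks, and strikeouts divided by innings pitched, plus the 3.2 constant. The function name and sample line are made up.

```python
def fip(hr, bb, k, ip, constant=3.2):
    """Fielding-independent estimate of ERA from HR, BB, K and innings pitched."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


# Made-up season: 20 homers, 60 walks, 180 strikeouts in 200 innings.
print(f"{fip(20, 60, 180, 200):.2f}")  # 3.60
```

If you're pulling innings from a box score, remember that 200.1 means 200 and a third, so convert partial innings to thirds before dividing.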

Sample size - The paramount concern in looking at statistics (in baseball and other areas). If a sample is too small, there is a high likelihood that the stats are skewed / altered by random chance.
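To put a rough number on that, here's a Python sketch that treats every at-bat as an independent coin flip with a fixed chance of a hit. That's a simplification I'm assuming for illustration, and the names are mine. It shows how much a batting average can wobble from chance alone.

```python
import math


def batting_avg_std_error(true_avg, at_bats):
    """Standard error of an observed average under a simple binomial model."""
    return math.sqrt(true_avg * (1 - true_avg) / at_bats)


# A "true" .270 hitter over a hot week versus a full season.
for at_bats in (50, 500):
    se = batting_avg_std_error(0.270, at_bats)
    print(f"{at_bats} AB: about +/- {se:.3f}")
```

Under that model, a swing of 60-odd points either way over 50 at-bats is nothing unusual, while over 500 at-bats the typical swing is closer to 20 points.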

I'm sure there are others... I'll keep editing this as time goes on, when I think of other stats that might need some clarifying.