Daily Box Score 8/28: Burying OPS
I admit, I was once put under the spell of OPS. It was so easy to calculate, and seemingly so much better than its peers at the time, that it seemed absurd that more people didn't use it.
OPS was first introduced to the sabermetric community at large by Pete Palmer and John Thorn, in their classic tome The Hidden Game of Baseball. But you know what? OPS is bad. It destroys information that is essential to player evaluation.
Please, allow me to demonstrate.
Correction Appended
Table of Contents
What Got Me Thinking
Calculus Class
The List
Discussion Question of the Day
Andy over at the Baseball-Reference blog wrote a post about Ryan Howard. He was following up on a piece he wrote at the end of last season. In that 2008 post, he wrote:
First, check out this list of lowest OPS+ values for a guy with 47+ HR. It's been done 92 times, and only twice has a guy been under 130. Think about his OPS+ of 121...does that seem like a number that a guy leading all of baseball in HR and RBI should have?
A large part of it is his low batting average. Two years ago, he hit .313 and slugged .659 with an OPS+ of 167. This year, he's batting only .248 and slugging .537. That's 65 points off his batting average and 122 points off his slugging average. His extra bases are roughly the same--the difference is actually about 30 fewer singles, and the fact that he's got more ABs this year.
And then just yesterday, he wrote this:
Howard's walk totals have dwindled over the last 4 seasons, from 108 and 107 in 2006 and 2007, down to 81 in 2008 (in more games) and projecting to fewer than 80 this year. You can see this trend in his BB% right here. Of note is that his walk dropoff seems to be due mainly to fewer intentional walks, going 37, 35, 17, and just 3 this year.
Anyway, this is the second straight year that Howard will finish with monster HR and RBI numbers but an OPS+ under 140 (although he's historically had massive Septembers, especially last year with 11 HR and 32 RBI.)
So, according to Andy, Howard's low OPS+ numbers are the result of his falling batting average and walk rate. Sounds pretty bad, right?
Only thing is, Ryan Howard is still a good hitter. He's been worth 22.9 batting runs above average this year, and 19.9 last year. There can be no doubt that he is not as valuable as he was in 2006, when he was worth 60.6 (!) batting runs above average. So why the low OPS+ numbers?
It's been a while since I've taken a calculus class, but since people seem to enjoy watching me flail around with numbers, I thought I would indulge those folks. So let's try some basic calculus.
OPS, as we all know, is on-base percentage plus slugging percentage. In equation form:
OPS = OBP + SLG
Now, since I'm lazy and didn't use a calculator for this (consider this your warning), I made some simplifying assumptions. I'm ignoring HBPs (with apologies to Plunk Everyone) and SFs. So,
PA = AB + BB
OBP = (H+BB)/PA
SLG = TB/AB = TB/(PA-BB)
Rewriting, we get
OPS = (H+BB)/PA + (TB/(PA-BB))
All I did was substitute the definitions into the original equation. Everyone with me so far? Cause here's where it gets a little complicated. I took the derivative of the above equation with respect to walk rate (BB/PA). That required some slight rewriting:
OPS = H/PA + BB/PA + (TB/(PA-PA*(BB/PA)))
Okay, we're starting to accumulate an impressive number of parentheses, which in my view are like bacon. Better to err on the side of more. Now, to take the derivative, we'll need the Quotient Rule. I'll leave the bloody details as an exercise for the reader, but feel free to check my work. Here's the resulting derivative function:
OPS'(BB) = 1 + (TB*PA)/(PA^2*(1-BB/PA)^2)
This formula tells us how sensitive a player's OPS is to changes in his walk rate. The higher the value, the more a shift in walk rate will affect his OPS.
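For anyone who does want to check the work, here is a minimal Python sketch that compares the derivative formula above with a numerical finite-difference estimate. The stat line (600 PA, 150 H, 60 BB, 270 TB) is made up purely for illustration, and the function names are my own. Like the derivation, it ignores HBPs and SFs and holds hits and total bases fixed while the walk rate moves.

```python
def ops(walk_rate, hits, total_bases, pa):
    """OPS under the simplification PA = AB + BB (no HBP, no SF)."""
    obp = hits / pa + walk_rate           # (H + BB) / PA
    at_bats = pa * (1.0 - walk_rate)      # AB = PA - BB
    slg = total_bases / at_bats           # TB / AB
    return obp + slg


def ops_prime(walk_rate, total_bases, pa):
    """Analytic derivative of OPS with respect to BB/PA, as given above."""
    return 1.0 + (total_bases * pa) / (pa ** 2 * (1.0 - walk_rate) ** 2)


def central_difference(f, x, step=1e-6):
    """Numerical derivative of f at x."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


# Hypothetical season line, not a real player.
PA, H, BB, TB = 600, 150, 60, 270
w = BB / PA

analytic = ops_prime(w, TB, PA)
numeric = central_difference(lambda x: ops(x, H, TB, PA), w)

print(f"analytic: {analytic:.4f}  numeric: {numeric:.4f}")
```

If the two printed numbers agree, the Quotient Rule step was done correctly for this set of assumptions.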
The List
Ok, so now that I've done some mathematics, let me present you with a list, selected more or less at my whim, of how various players rate. I've used career statistics. In decreasing order of OPS'(BB):
- Albert Pujols 0.73
- Ryan Howard 0.67
- Hanley Ramirez 0.59
- Mark Reynolds 0.58
- Pablo Sandoval 0.57
- Joe Mauer 0.55
- Chris Davis 0.52
- Jimmy Rollins 0.47
- Ichiro 0.46
- Mark Bellhorn 0.46
- Pete Rose 0.45
- Erick Aybar 0.40
[Correction: Pablo Sandoval's numbers were incorrect due to a clerical error. I regret the mistake.] Keep in mind these are career statistics, and they seem to be pretty closely related to slugging percentage. That is to say, the higher the slugging percentage, the more sensitive a player's OPS is to changes in walk rate. But I am not sure I have figured out yet why Pablo Sandoval tops the list of players I have selected.
In any event, the fact that this doesn't work out to be close to (if not) the same number for each player suggests to me that OPS may be unfair on ol' Ryan Howard. After all, he's got a wOBA of .382.
Discussion Question of the Day
I know I didn't give you many links today, so I appreciate your patience. Now is your chance to get me back. Where have I gone completely wrong (as I'm sure I have)?
Also, I'd like to apologize in advance to my high school calculus teacher.
Comments
This is pretty cool.
I’m unclear about what the OPS’(BB) figures mean though. Are they just a random number, or are they something actionable?
Smoltz.
It’s the derivative of OPS with respect to walk rate. So one way to put it is to say that they are the amount that OPS would rise for each unit change in BB/PA. The higher the number, the more sensitive to walk rate a player’s OPS is.
Now, whether that is actionable, I have no idea. But I’m curious to try partial derivatives with other statistics and do leaderboards with SQL when I have a bit more time.
by Tommy Bennett on Aug 28, 2009 8:44 PM EDT
So one way to put it is to say that they are the amount that OPS would rise for each unit change in BB/PA.
So Pujols’ OPS would rise .73 points if he added a walk? That can’t be possible, that means he would have a 1.800 OPS if he drew a walk!! I understand that the numbers represent the volatility of a player’s OPS, but they don’t seem to have any value on their own, just in relation to other people’s OPS’(BB).
I could easily be misinterpreting this though.
Smoltz.
by vivaelpujols on Aug 28, 2009 8:51 PM EDT
Well, that would be true if his BB/PA were to rise by 1, but that would be statistically impossible, as you can’t have more walks than plate appearances.
Let’s say, though, that he increased his career walk rate by .1, which could happen, he would increase his OPS by .073 (holding all his other stats constant). That would take him from a career OPS of 1.054 up to 1.127.
by Tommy Bennett on Aug 28, 2009 8:59 PM EDT
Not "added a walk". We're talking rates, here, not absolute totals.
Beyond the Boxscore
by Sky Kalkman on Aug 29, 2009 10:27 AM EDT
I actually did it with raw walk totals at first
But it was way too dependent on plate appearances to be of any value.
by Tommy Bennett on Aug 30, 2009 10:36 AM EDT
Also, could you do that same equation with the wOBA formula?
For your initial hypothesis to be correct, you have to show that wOBA is less subject to variance than OPS based on walks.
Smoltz.
Stay tuned, sports fans
I hope to do this soon.
by Tommy Bennett on Aug 28, 2009 9:00 PM EDT
You could do it with the approximation
wOBA = 1.8*OBP + SLG
wOBA = 1.8*(H/PA+BB/PA) + (TB/(PA-BB))
Which differentiates to:
wOBA’(BB) = 1.8 + (TB*PA)/(PA^2(1-BB/PA)^2)
by Tommy Bennett on Aug 28, 2009 9:11 PM EDT
And the list of the same players
Albert Pujols 2.53
Ryan Howard 2.47
Hanley Ramirez 2.39
Mark Reynolds 2.38
Pablo Sandoval 2.37
Joe Mauer 2.35
Chris Davis 2.32
Jimmy Rollins 2.27
Ichiro 2.26
Mark Bellhorn 2.26
Pete Rose 2.25
Erick Aybar 2.20
Notice that they are clustered more closely together (the difference in scale is due to the difference in scale used for wOBA and OPS)
by Tommy Bennett on Aug 28, 2009 9:19 PM EDT
Actually the slopes of those two datasets are exactly the same
A perfect correlation:

What does that mean?
Smoltz.
by vivaelpujols on Aug 28, 2009 9:27 PM EDT
Yeah
I really should have realized that. They are shifted by a constant term. I will check it out with the full wOBA equation.
by Tommy Bennett on Aug 28, 2009 9:56 PM EDT
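The constant shift is easy to see directly from the two lists posted in this thread. The short Python sketch below subtracts each player's OPS'(BB) value from his wOBA'(BB) value, using exactly the numbers given above; the variable names are my own. If every gap comes out the same, the two lists carry the same information about relative sensitivity, which is why the correlation is perfect.

```python
# Values copied from the two lists in the post and the comment thread.
players = [
    "Albert Pujols", "Ryan Howard", "Hanley Ramirez", "Mark Reynolds",
    "Pablo Sandoval", "Joe Mauer", "Chris Davis", "Jimmy Rollins",
    "Ichiro", "Mark Bellhorn", "Pete Rose", "Erick Aybar",
]
ops_prime_list = [0.73, 0.67, 0.59, 0.58, 0.57, 0.55,
                  0.52, 0.47, 0.46, 0.46, 0.45, 0.40]
woba_prime_list = [2.53, 2.47, 2.39, 2.38, 2.37, 2.35,
                   2.32, 2.27, 2.26, 2.26, 2.25, 2.20]

# Gap between the two sensitivities for each player, rounded to the
# two decimals the lists were reported with.
gaps = [round(w - o, 2) for w, o in zip(woba_prime_list, ops_prime_list)]

for name, gap in zip(players, gaps):
    print(f"{name:15s} {gap:.2f}")

# A single distinct value means one list is the other plus a constant.
print("distinct gaps:", sorted(set(gaps)))
```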
Wouldn't this be the weight of a walk?
I’m thinking about it right now. wOBA is a simple division. The numerator is a set of additions with weights on each, including walks. The denominator is PA. We should be able to split each event/PA, and since the other events are not dependent on walks, all we’re left with is d/d(BB/PA) (.72*BB/PA), making the dependence static as .72. Sort of makes sense, as the weights are linear after all.
Did I miss something?
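That reasoning can be checked numerically. The Python sketch below treats wOBA as a weighted sum of per-PA event rates and nudges only the walk rate, holding every other rate fixed as the comment assumes. Only the .72 walk weight comes from the comment; the other weights and both stat lines are illustrative assumptions, and the function names are hypothetical.

```python
# Linear weights per event. Only the 0.72 walk weight is taken from the
# comment above; the remaining weights are illustrative assumptions.
WEIGHTS = {"bb": 0.72, "single": 0.90, "double": 1.24,
           "triple": 1.56, "hr": 1.95}


def woba(rates):
    """wOBA as a weighted sum of per-PA event rates."""
    return sum(WEIGHTS[event] * rate for event, rate in rates.items())


def walk_sensitivity(rates, step=1e-6):
    """Central-difference derivative of wOBA with respect to BB/PA,
    holding every other per-PA rate fixed."""
    up = dict(rates, bb=rates["bb"] + step)
    down = dict(rates, bb=rates["bb"] - step)
    return (woba(up) - woba(down)) / (2.0 * step)


# Two made-up hitters with very different profiles.
slugger = {"bb": 0.12, "single": 0.12, "double": 0.05,
           "triple": 0.00, "hr": 0.07}
slap_hitter = {"bb": 0.05, "single": 0.22, "double": 0.03,
               "triple": 0.01, "hr": 0.01}

for name, rates in [("slugger", slugger), ("slap hitter", slap_hitter)]:
    print(f"{name:12s} d(wOBA)/d(BB/PA) = {walk_sensitivity(rates):.4f}")
```

Under these assumptions both hitters should show the same sensitivity, equal to the walk weight, because nothing else in a linear formula depends on the walk rate.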
I agree
Going back to the original post, I don’t think I accept the premise that Howard’s OPS+ is out of line with his actual production, at least not based on OPS+ itself (B-R’s results may vary).
Howard has a .352 OBA and .572 SLG versus league averages of .331 and .411. That is an un-adjusted OPS+ of 146.
Using a simple linear weights formula (ERP), Howard has created 96 runs in 354 outs (.271 R/O). The league average is .177 R/O, so his runs/out relative to the league is 153.
Howard has a .301 EqA. That translates to a runs/out relative to the league average of 144.
Howard’s relative runs/out rate implied by OPS+ (which is what OPS+ is) is roughly in line with other relative runs/out estimates. B-R’s figures are lower for some reason (they do exclude pitcher batting from the league averages, but I don’t think that would make all the difference). The average NL team has an OPS+ of 94 according to B-R; if you apply a multiplier to bring that to 100 (100/94 = 1.064) and multiply Howard’s 136 by it, you get 145, again in line with the other estimates.
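The arithmetic in this comment can be reproduced with a few lines of Python. The sketch below uses only the figures quoted above and the usual unadjusted form of OPS+, 100 * (OBP/lgOBP + SLG/lgSLG - 1), with no park adjustment. The function names are my own.

```python
def unadjusted_ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ without park adjustment: 100 * (OBP/lgOBP + SLG/lgSLG - 1)."""
    return 100.0 * (obp / lg_obp + slg / lg_slg - 1.0)


def relative_runs_per_out(runs, outs, lg_runs_per_out):
    """A player's runs per out as a percentage of the league rate."""
    return 100.0 * (runs / outs) / lg_runs_per_out


# Figures quoted in the comment above.
ops_plus = unadjusted_ops_plus(0.352, 0.572, 0.331, 0.411)
erp_relative = relative_runs_per_out(96, 354, 0.177)

# Rescale B-R's 136 so that the average NL team (94) sits at 100.
rescaled_br = 136 * (100 / 94)

print(f"unadjusted OPS+:        {ops_plus:.1f}")
print(f"relative runs/out, ERP: {erp_relative:.1f}")
print(f"rescaled B-R OPS+:      {rescaled_br:.1f}")
```

All three come out in the mid-140s to low 150s, which is the comment's point: measured this way, OPS+ is roughly in line with the linear weights estimates.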
This is right as far as it goes
Going back to the original post, I don’t think I accept the premise that Howard’s OPS+ is out of line with his actual production, at least not based on OPS+ itself (B-R’s results may vary).
I didn’t intend that to be a premise, but rather the thing to be shown. Now you could argue (rather convincingly) that I didn’t do that, but I certainly didn’t begin with that as a premise.
by Tommy Bennett on Sep 1, 2009 2:10 PM EDT
Also, OPS is overrating Howard because he is a low-ish OBP slugger
OPS+ has him at 121; however, according to Statcorner, his wOBA+ is 113.
http://www.statcorner.com/batterAJAX.php?id=429667&team=PHI&year=2009&leag=N_L
Smoltz.
Great discussion. OPS also has problems with double counting bases, right? Player A has a .340 OBP with a .250 BA, while player B has a .340 OBP with a .310 BA. Player B is going to have some advantage in his SLG% from that BA difference, right? And while a hit is better than a walk, it’s not like the math works out so the advantage player B has in hits vs. walks is equal to the additional production a hit gives vs. a walk. Now that we’ve created better stats, there’s just not much reason to use OPS anymore.
Okay, we’re starting to accumulate an impressive number of parentheses, which in my view are like bacon. Better to err on the side of more.
I have to say, I enjoyed that.
