Sabermetric Primers
What Starting Pitcher Metrics Correlate Year-to-Year?
As a follow-up to my previous article on hitting metrics, I wanted to take a look at which pitching metrics correlate year-to-year. For this installment, I looked at starting pitchers from 2004-2011 with at least 162 innings pitched in both year one and year two.
As before, this is just a straightforward correlative analysis--nothing fancy. I took a look at a bevy of metrics (courtesy of the fine, upstanding citizens at FanGraphs), and here are the results:
Pitcher repertoire generally has the highest year-to-year (Y2Y) correlation. The distribution of a pitcher's pitches (e.g. four-seam fastball, cutter, changeup, etc.) shows great consistency from one year to the next. Now, there are potentially coding errors in that data, but the consistency of those statistics reflects what I think is generally known--that once a pitcher makes it to the big leagues as a starter, he rarely alters his portfolio of pitches. What he likely alters more regularly is speed, sequence, and location. But that's just a hypothesis, one that can't be confirmed or rejected with this data.
Moving on.
Opportunities: Another Flaw in Fielding Metrics?
Advanced fielding metrics are often the topic of heated debate in all baseball circles. Most of the discourse surrounding these statistics has to do with the accuracy of the subjective data used. The phrase "garbage in, garbage out" has been uttered on more than a few occasions. For a second, though, let us assume the data are accurate and discuss what these metrics tell us.
Each statistic conveys information. As fans and analysts it is our duty to choose which metric can be utilized to fit our needs.
For instance, let us look at Runs Batted In. RBI is a descriptive statistic. It clearly tells the reader the number of times a runner has crossed home during and following a hitter's plate appearances. As you know, RBI has few uses because it lacks context. Failing to account for the number of RBI opportunities a hitter has is just one problem with the old-school metric.
RBI clearly is not a predictive statistic. A predictive statistic is one that correlates highly from year to year.[1] What a player accomplishes in year X, we should expect him to accomplish in year X+1. As an example, look at xFIP. xFIP does not tell us what happened in the current year - how a pitcher was able to prevent runs - but rather tells us how he should have been able to prevent runs, by looking at his strikeouts and walks while assuming a league-average home run rate. The assumed home run rate isn't even something that actually happened.
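To make the "assumed home run rate" concrete, here is a minimal sketch of the commonly published xFIP formula. The article does not give a formula, so the default league HR/FB rate (10%) and the FIP constant (3.10) are assumptions for illustration only; both vary by season. The sample season line is hypothetical.

```python
def xfip(fly_balls, walks, hit_by_pitch, strikeouts, innings,
         league_hr_per_fb=0.10, fip_constant=3.10):
    """Sketch of xFIP: FIP with actual home runs replaced by expected ones.

    league_hr_per_fb and fip_constant are assumed placeholder values;
    in practice both are recomputed for each season.
    innings is a real number (200 1/3 innings is 200.333..., not 200.1).
    """
    # Expected home runs: fly balls allowed times the league-average HR/FB rate.
    expected_hr = fly_balls * league_hr_per_fb
    numerator = 13 * expected_hr + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + fip_constant


# Hypothetical season line, not a real pitcher.
value = xfip(fly_balls=200, walks=50, hit_by_pitch=5, strikeouts=180, innings=200.0)
print(f"xFIP: {value:.3f}")
```

Note that the pitcher's actual home run total never appears as an input, which is exactly why the result describes what should have happened rather than what did.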
Given the nature of predictive statistics, many look to these constructs to assess a player's true talent level.
A problem with fielding metrics built on subjective data is that fans and analysts alike look at them and think "true talent level." Viewing fielding metrics through a true-talent lens, it is understandable that one has trouble reconciling Jacoby Ellsbury's wildly fluctuating defensive statistical performance.[2]
Each opportunity for a player's On Base Percentage is marked by a plate appearance. It's safe to say that outside of the level of competition and park factors, all plate appearances are created equal. Get on base, or make an out. Simple. Clear.[3]
It is often overlooked that the opportunities used for advanced fielding metrics are not as clear. Again, forget the accuracy of the data employed and consider that two right fielders can each be hit 100 fly balls, yet have had vastly different opportunities. Why? Because each fly ball - even if categorized to perfection - is different. In the given scenario, one right fielder's defensive statistical performance can be stifled by a lack of difficult opportunities while another's can be bolstered by a high volume of difficult plays.
Because the pool of fielders can face a wide array of opportunities, causing outcomes to fluctuate year-to-year, subjective fielding metrics are descriptive statistics, like RBI, and do not give the reader an idea of a fielder's true talent level. Conceptually, these statistics are not as useless as RBI. They tell the reader what happened using complex linear weights.[4] But don't fret when your perception is contradicted by these metrics. The disconnect is created, in part, by the variance in the difficulty of the opportunities each fielder faces.
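The two-right-fielder scenario can be sketched with a toy model. This is not how UZR or any published metric is computed. It simply credits each catch against an assumed league catch probability for that ball. The two difficulty buckets, their catch rates, and both fielders' lines are hypothetical.

```python
def plays_above_average(chances):
    """chances: list of (league_catch_probability, was_caught) pairs."""
    expected = sum(prob for prob, _ in chances)
    made = sum(1 for _, caught in chances if caught)
    return made - expected


def build_chances(catch_prob, total, caught):
    """Build `total` chances of one difficulty, the first `caught` of them converted."""
    return [(catch_prob, i < caught) for i in range(total)]


# Assumed league catch rates for two difficulty buckets (illustrative only).
ROUTINE, HARD = 0.95, 0.40

# Fielder A: 100 fly balls, all routine, 97 caught.
fielder_a = build_chances(ROUTINE, 100, 97)
# Fielder B: also 100 fly balls, but 40 are hard; he catches 58 of 60 routine and 24 of 40 hard.
fielder_b = build_chances(ROUTINE, 60, 58) + build_chances(HARD, 40, 24)

print(f"A: {plays_above_average(fielder_a):+.1f} plays")
print(f"B: {plays_above_average(fielder_b):+.1f} plays")
```

Both fielders convert each kind of chance at nearly the same rate, yet B finishes roughly seven plays ahead of A purely because he saw more difficult balls.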
What Hitting Metrics Correlate Year-to-Year?
The world of baseball does not want for statistics. Indeed, at this point there is a metric for just about any outcome one could be interested in (as well as three or four different versions of it).
But not all statistics are created equal. Each tells us something different. As Dave Cameron recently pointed out, one can group baseball metrics into a number of different categorical schemes, one of which is descriptive versus predictive.
This is why it is important to not just know the numbers, but the story behind the numbers. How are they constructed? What are they meant to capture? Some statistics are simply reflections of current performance while others reveal more about a player's skills or talent outside of a single year. It's critical that we understand the difference.
One place to start when bucketing metrics into one of these two categories is the extent to which a metric correlates year over year. If a statistic in year one does not correlate all that well with itself in year two, it is generally more descriptive than predictive.
This is the underlying logic behind pitching metrics like DIPS and FIP. Since ERA has a generally low year-to-year correlation (.38), it is a poor predictor of future performance and true talent.
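For readers who want to run this kind of check themselves, here is a minimal sketch of a year-to-year correlation in plain Python. The record layout (`player`, `year`, plus a metric key) and the sample strikeout rates are hypothetical. A real run would load qualified player seasons from a source such as FanGraphs.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_to_year_pairs(seasons, metric):
    """Pair each player-season with the same player's following season."""
    by_key = {(s["player"], s["year"]): s[metric] for s in seasons}
    pairs = []
    for (player, year), value in by_key.items():
        next_value = by_key.get((player, year + 1))
        if next_value is not None:  # drop seasons with no qualifying follow-up
            pairs.append((value, next_value))
    return pairs


# Hypothetical sample data; player D has no second season and is dropped.
seasons = [
    {"player": "A", "year": 2009, "k_rate": 0.15},
    {"player": "A", "year": 2010, "k_rate": 0.16},
    {"player": "B", "year": 2009, "k_rate": 0.20},
    {"player": "B", "year": 2010, "k_rate": 0.19},
    {"player": "C", "year": 2009, "k_rate": 0.25},
    {"player": "C", "year": 2010, "k_rate": 0.26},
    {"player": "D", "year": 2009, "k_rate": 0.18},
]

pairs = year_to_year_pairs(seasons, "k_rate")
year_one, year_two = zip(*pairs)
r = pearson(year_one, year_two)
print(f"{len(pairs)} paired seasons, r = {r:.2f}")
```

The pairing step matters as much as the correlation itself: only players who qualify in both seasons contribute a data point.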
I think if you ask most people what offensive statistics correlate year-to-year you won't find many confident answers. In order to help us along the journey I decided to run some correlations for common, and uncommon, batting statistics. For those that live in SQL, these numbers are probably well known. But for most, I think it is helpful to have them posted for reference.
Here are the results:
Anatomy of a Luck vs. Skill Evaluation
When reading saber-slanted writing you will often find writers debating whether a player's accomplishments in a given season are the result of skill or luck.
The discussion goes something like this: "Player X is having a fantastic year at the plate, but his offensive statistics are largely the result of luck since his (insert favorite peripheral stat of choice) is well above average." Batting average on balls in play is arguably the most common statistic used in this fashion.
The discussion can be applied to pitchers as well, and the pronouncement of luck can also take many forms (e.g. the player is lucky or skilled because his BABIP is higher or lower than the average).
But in truth it is more complicated than that. It's one thing to go down a level of analysis beyond the mere descriptive statistics of OPS or wOBA, quite another to drill down further to ensure that the shortcut one uses to pronounce a player lucky actually holds in a given situation.
We are lucky enough to have discovered a number of generalizable findings in the field of baseball that provide great context when trying to evaluate player performance. The problem is that sometimes we get lazy and deploy these findings too quickly when analyzing whether a player's performance is reflective of their true talent or simply the result of randomness.
To flesh out this idea I would like to present an anatomy of how one might make a luck versus skill call. I'll use Tampa Bay's Casey Kotchman and his .367 wOBA as an example.
Everything You Always Wanted to Know About UZR But Were Afraid to Ask
The ability to quantify defensive performance has become almost a necessity for sabermetricians who need to verify whether a fielder is a solid defender or not. Since 2002, UZR (Ultimate Zone Rating) has been the most commonly used defensive metric among those aware of its importance. Heck, most people with the time, patience, and baseball knowledge understand the difference between UZR and highly flawed statistics such as fielding percentage and errors. Even so, many don't seem fully aware of its background or of the ways a fielder can raise or lower his Ultimate Zone Rating. Before we dig in to that, going over the background of UZR sounds like a grand idea, wouldn't you say?
Developed by Baseball Think Factory's Mitchel Lichtman, Ultimate Zone Rating is best defined as a fielder's ability (pitchers and catchers excluded) to get to balls in 64 of the 78 zones. Not all batted balls count, however: outfield balls hit foul, infield line drives, and infield pop flies are not considered in the formula. In simplest form, Ultimate Zone Rating is the number of runs above or below average a fielder saves or costs his team within certain zones, using four components (well, actually, three):
- ARM (Outfield Arm Runs): First and foremost, keep in mind that ARM is only used for outfielders while DPR is only used for infielders, which is why I suggested it's really three components, not four. ARM basically speaks for itself. It's the number of runs an outfielder saves by throwing out runners, measured against the expected number. The more runners he throws out, the better his ARM will be.
- RngR (Range Runs): This denotes the number of runs above or below average a fielder is worth (infield and outfield), based on his ability to get to the expected batted balls in his zones. As you'll see below, UZR is largely dependent on RngR.
- ErrR (Error Runs): Given the same number of batted balls hit to a certain fielder, ErrR determines the number of runs he saves by his ability to prevent errors compared to an average fielder at the same position.
- DPR (Double Play Runs): Again, this is solely used for infielders. Taking "handedness" into account, DPR compares the number of double play outs at second against the league average. Of course, this is primarily and more significantly used for middle infielders, but it applies to the whole infield quad as well, which you'll also see in the charts below...
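The "four components, well actually, three" point can be sketched as a simple sum. This only illustrates how the component run values described above combine for a given fielder. It is not the underlying UZR calculation, and the class name, field names, and run values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FieldingLine:
    """Component run values for one fielder (hypothetical container)."""
    is_outfielder: bool
    rng_r: float       # Range Runs
    err_r: float       # Error Runs
    arm: float = 0.0   # Outfield Arm Runs, outfielders only
    dpr: float = 0.0   # Double Play Runs, infielders only

    def uzr(self):
        # Only one position-specific component applies, so three terms in total.
        position_component = self.arm if self.is_outfielder else self.dpr
        return self.rng_r + self.err_r + position_component


# Hypothetical run values for illustration.
right_fielder = FieldingLine(is_outfielder=True, rng_r=6.5, err_r=1.0, arm=2.5)
shortstop = FieldingLine(is_outfielder=False, rng_r=4.0, err_r=-0.5, dpr=1.5)

print(right_fielder.uzr(), shortstop.uzr())
```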
Is Ubaldo Jimenez the Rockies' Worst Starting Pitcher?
Very few pitchers over the last couple of seasons have been as fun to watch as Ubaldo Jimenez. The right-hander has made two straight Opening Day starts and has since been tagged with "ace" status. As the Rockies' fortunes over the last two-plus years have gone from "hottest team in baseball" all the way to "what is going on with this team?", it's been due in large part to the success and failure of Ubaldo.
The Rockies have gotten off to a fantastic start this year. With a 16-7 record to boast, almost every component of the Rockies squad has contributed in some way. Ubaldo Jimenez is one of the few men in Purple who hasn't generated at least 0.1 wins for the Rox. To go along with that, he looks to be the same Ubaldo we saw during the second half of the 2010 season -- that would be a pitcher who walks almost a whole hand's worth of batters per start and can't get through 6 or 7 innings like he's been known to do. Alas, his FIP since June 26th (his first start in 2010 where he gave up more than 4 ERs) is 3.95. Since then, we've seen his walk total and HR/FB rate increase while his K rate has decreased. He simply hasn't been the same pitcher we saw breeze through each and every inning of the first two months or so of the 2010 season.
The Definitive Sabermetric Guide to Managing

Earl Weaver got it. We need more Earl Weavers.
If more and more GMs are getting it, why aren't managers?
Over the last 30 years, there has been a revolution in baseball thought. Long-held, traditional conceptions of player value, in-game tactics, roster construction and organization building, which had grown into truisms of the game and, even worse, things that "everybody knows," have been re-evaluated, tested, re-thought and in many cases debunked. Much of the old artifice of baseball theory has been torn down and replaced by a new architecture of ideas which have been widely tested, (informally) peer reviewed, re-tested and improved upon.
While these new ideas and the "question everything" philosophy behind them have not been internalized by all baseball fans, sports media, coaches and players, over the last 15 years, they have found a home in many MLB front offices. By most accounts, all major league front offices utilize statistical analysis to some degree. And it appears that many have been willing to change how they evaluate players, build their teams and run their organizations.
One of the key tenets of the new baseball thinking, which I will refer to as "sabermetrics" for the sake of ease, is that teams should try to find market inefficiencies and exploit them to gain an advantage over their opponents. But this concept has not yet trickled down to the managerial ranks in any significant way. I contend that there are significant advantages to be had by MLB teams by having their managers act more "sabermetrically". And I am quite curious as to why general managers have not pushed their managers in this direction. I would argue that they should. Here's how:
A Missing Element in Pitcher Batted Ball Tendency?
Often we look at a pitcher's ground ball percentage relative to his fly ball or air ball percentages to identify the dominant tendency of his batted balls. However, not all percentages are of equal importance, and that is especially true for low-strikeout pitchers. By looking at strikeouts in addition to balls in play, we can get a better feel for a pitcher's true batted ball tendency.
For instance, Adam Wainwright and Carl Pavano both had a ground ball percentage (GB%) just above 51%. However, in 30 fewer innings last season, Pavano induced 376 ground balls as compared to Wainwright's 319. The difference can be attributed to Wainwright's ability to strike batters out.
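One way to fold strikeouts into the picture is to measure ground balls against batted balls plus strikeouts rather than against batted balls alone. The sketch below uses the two ground ball totals quoted above and backs out batted balls from an approximate 51% GB%. The strikeout totals are placeholders chosen for illustration, not either pitcher's actual numbers, and the function name is hypothetical.

```python
def ground_ball_share(ground_balls, batted_balls, strikeouts):
    """Ground balls as a share of batted balls plus strikeouts."""
    return ground_balls / (batted_balls + strikeouts)


APPROX_GB_RATE = 0.51  # "just above 51%" in the text, used here as an approximation

# Ground ball totals come from the text; strikeout totals are assumed placeholders.
pitchers = {
    "low-strikeout pitcher": {"ground_balls": 376, "strikeouts": 120},
    "high-strikeout pitcher": {"ground_balls": 319, "strikeouts": 210},
}

shares = {}
for name, line in pitchers.items():
    # Back out batted balls from the ground ball count and the approximate GB%.
    batted_balls = round(line["ground_balls"] / APPROX_GB_RATE)
    shares[name] = ground_ball_share(line["ground_balls"], batted_balls, line["strikeouts"])
    print(f"{name}: {shares[name]:.3f}")
```

With identical GB%, the pitcher who strikes out fewer batters ends up with the larger ground ball share once strikeouts are in the denominator, which is the tendency that GB% alone hides.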