FanPost

The First Advanced RBI Replacement - Weighted Runs Batted In Efficiency

Imagine that throughout high school, teachers gave their favorite students easier tests than the rest of the class. The result would be clear: most of the favored students would come out with strong scores. However, one would question whether those strong scores reflected high intellect or simply an easy test. Conversely, there would be other students who still scored well despite being given a difficult test. Now there’s an issue. If the teachers want to know which students know the material best, how should they figure it out? They can’t simply take the highest score, because they know the skewed tests make the scores an inaccurate representation. This is the exact situation in which the RBI (Runs Batted In) has put the baseball world.

When the RBI was first documented as an official statistic in 1920, the wording of the definition in Rule 86, Section 8 of the Official Baseball Rules was "The number of runs batted in by each batsman."[1] Although this definition was slightly vague, its intention was to quantify which batter is the best at batting in runs. For years, this statistic has been praised. The RBI is always one of the first statistics mentioned when summarizing a player’s year or career. The RBI is even part of the most prestigious hitting honor, the Triple Crown: highest average, most home runs, most RBI. Despite its strong reputation, over the last few years it has become clear that the RBI is not answering the question it poses. The RBI doesn’t answer "Which batsman is the best at batting in runs?" It only answers "Who has batted in the most runs?" Although that may seem like a small change in wording, the two questions are tremendously different.

The RBI may seem like a valuable statistic, but it rests on two assumptions that drastically devalue it: (1) that all hitters have the same quality of plate appearances and (2) that all RBI are created equal. The first assumption breaks down into two factors: the Batting Order Effect, the concept that players get a different quality of plate appearances depending on their spot in the order, and the Team Effect, the idea that players on teams with better offenses have an easier time accumulating RBI. Here, a plate appearance with a man on base counts as high quality, while a plate appearance with no one on base counts as low quality. Table 1 below shows how the quality of plate appearances depended on batting order across all of MLB from 2015 to 2019[2]:

Table 1

MLB 2015-2019

| Batting Order | Plate Appearances | Plate Appearances with Men On | Percentage |
|---|---|---|---|
| 1 | 113,119 | 36,977 | 33% |
| 2 | 110,540 | 45,550 | 41% |
| 3 | 107,970 | 49,375 | 46% |
| 4 | 105,538 | 51,150 | 48% |
| 5 | 103,098 | 45,710 | 44% |
| 6 | 100,491 | 43,532 | 43% |
| 7 | 97,711 | 43,284 | 44% |
| 8 | 94,797 | 41,805 | 44% |
| 9 | 91,810 | 40,548 | 44% |
| Average | 102,786 | 44,214 | 43% |

This shows a strong relationship between batting order and quality of plate appearances. Most starkly, the average number 4 hitter has 48% of their plate appearances with a man on base, while the leadoff hitter has only 33%. The average hitter gets 43% of their plate appearances with a man on base, meaning that the leadoff hitter is at a disadvantage of 10 percentage points and the number 4 hitter has an advantage of 5 percentage points. This can be attributed to the fact that on a typical team, players hitting 1st or 2nd will have the highest On Base Percentage, leading to the 3 and 4 hitters having a plethora of plate appearances with men on base. On the other hand, players batting 7th, 8th or 9th will typically have the lowest On Base Percentage, which directly leaves the leadoff hitter with fewer plate appearances with a man on base and, subsequently, fewer premium spots to collect an RBI.
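The percentages in Table 1 can be reproduced from the raw counts. The following is a minimal Python sketch, assuming the plate appearance totals are copied exactly from Table 1; the variable names are illustrative, and the league figure here is pooled across all nine spots rather than averaged row by row.

```python
# Plate appearance counts by batting order spot, copied from Table 1 (MLB 2015-2019).
plate_appearances = {1: 113119, 2: 110540, 3: 107970, 4: 105538, 5: 103098,
                     6: 100491, 7: 97711, 8: 94797, 9: 91810}
with_men_on = {1: 36977, 2: 45550, 3: 49375, 4: 51150, 5: 45710,
               6: 43532, 7: 43284, 8: 41805, 9: 40548}

# Share of each spot's plate appearances that came with at least one man on base.
share = {spot: with_men_on[spot] / plate_appearances[spot] for spot in plate_appearances}

# Pooled league share (assumption: pooling all spots instead of averaging the rows).
league_share = sum(with_men_on.values()) / sum(plate_appearances.values())

for spot, value in share.items():
    # Gap versus the league share, expressed in percentage points.
    gap_points = (value - league_share) * 100
    print(f"Spot {spot}: {value:.1%} with men on ({gap_points:+.1f} points vs. league)")
```

Expressing the gaps in percentage points keeps the comparison between the leadoff and cleanup spots on the same footing as the table.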

Throughout a game, it is equally difficult for a hitter to hit a single, a home run or any other outcome from every spot in the lineup, as those rely solely on one’s ability to hit. However, the difficulty of collecting RBI changes greatly depending on the hitter’s spot in the order, because it also relies on others getting on base. This means that some hitters simply have an easier time driving in a run, creating an uneven playing field that is unfair to hitters who do not hit in premium spots.

Furthermore, the Team Effect also contributes to the inequality of plate appearances. A team that scores more runs will undoubtedly accumulate more RBI. This means that players who play for the best offenses have a higher likelihood of being elite RBI collectors. Table 2 shows how often a player who ranked in the top 10 in RBI for the year played for an above-average offense, according to runs scored, from 2015 to 2019[3]:

Table 2

| Year | Percentage of Players on Above Average Offense on Top 10 RBI Leaderboard |
|---|---|
| 2019 | 70% |
| 2018 | 90% |
| 2017 | 70% |
| 2016 | 80% |
| 2015 | 100% |
| Average | 82% |

This suggests a strong relationship between the quality of a team’s offense and its presence among the RBI leaders. Better offenses have runners on base more often, so it is much easier for their hitters to drive in runs. In a statistical baseball world where nearly everything can be quantified in a vacuum, this prominent metric is clearly shaped by outside factors. A cumulative statistic that is skewed by multiple factors outside the hitter’s control is therefore not a good representation of what it is meant to measure.

As a case study, the 2019 season of Anthony Rendon is the perfect example. Rendon led the league with 126 RBI. However, the Batting Order Effect and the Team Effect were both at work. Rendon hit in the 3rd spot in 137 of the 146 games he played. He also played for the World Series champion Nationals, who ranked 6th in the league in runs scored. Among his league-leading 126 RBI, Rendon also led the league in RBI from 3rd base, driving in 43 runners. He was 3rd in the league in number of plate appearances with a man on 3rd base. Meanwhile, he ranked 76th in plate appearances with no one on base and 42nd in plate appearances with a man on 1st base[4]. So, was Rendon just very lucky to hit number 3 for the 6th best offense? Did he just make the best of a great situation? Would he still have led the league if he had a worse quality of plate appearances? Using the RBI, it is impossible to know whether Rendon was truly an elite hitter at driving in runs or whether he just took advantage of his circumstances.

The last assumption that the RBI makes is that all RBI are equal. The RBI presumes that every run batted in is worth exactly 1. On the surface that may make sense: a run is worth 1 in baseball, so a run batted in should also be worth 1. However, one must take into account the level of difficulty of driving in a run. Should a soft line drive that scores a man from third base be worth the same as a home run? Or the same as a hard-hit double in the gap that scores a runner from first? What if there is a man on third, the hitter hits a soft ground ball, and the man on third is able to score? No, they should not be worth the same. It is far easier to drive in a run from third than it is from second or first, or with no one on base. Level of difficulty is already built into other baseball statistics, such as slugging percentage, where a triple is worth more than a single when quantifying power. So, if the RBI is truly trying to determine the best at driving in runs, then the level of difficulty of driving in a run must be taken into account.

Since the explosion of Sabermetrics, these gaping holes in the RBI have been assessed numerous times by quite a few statisticians. In fact, there have been multiple new statistics that look to solve this problem, though only indirectly. In them, the question shifts from "Who is the best at driving in runs?" to a more fundamental one: "Who creates the most runs?" This is arguably the most important question a statistic can answer, as the goal of baseball is to score more runs than the opponent, so the best hitters will create the most runs. Bill James, the creator of Sabermetrics, quantified one’s ability to create runs in a statistic called Runs Created, or RC. RC estimates a player’s offensive value in terms of runs by taking into account the hitter’s ability to hit and get on base, a key distinction from a statistic that only quantifies a player’s ability to drive in runs. Another statistic was created with a similar goal: Weighted Runs Above Average, or wRAA. wRAA is similar to RC in that it quantifies a player’s offensive value in terms of runs; however, instead of producing an outright number, it compares the hitter against the average hitter. Both are fantastic statistics for answering their own question. Nevertheless, the question "Who’s the best at driving in runs?" is still not being answered.

Both RC and wRAA provide solid alternatives to the RBI. They come much closer to answering the question than the RBI does, as neither has outside factors affecting its results. However, they are solely alternatives, not replacements, and not direct solutions to the problem. So OBI% (Others Batted In) was created. This is another reputable statistic: it measures how efficiently a hitter drives in runs. In fact, it completely fixes the first assumption that the original RBI makes, that all hitters have the same quality of plate appearances. Because OBI% is an efficiency, it creates a neutral playing field for all hitters. However, OBI% still has problems when answering "Who’s the best at driving in runs?" First, the second assumption, the inequality of driving in runs depending on the situation, is not addressed. In OBI%, each situation is worth the same, so being efficient at driving in runs from 3rd base is valued the same as driving in runs from 1st base. Second, home runs don’t fully count toward this metric, as it only counts "Others" and not driving in oneself. If the question is about driving in runs, then home runs are an important aspect; a home run still drives in a run. Although many other questions have been answered, the question originally posed in 1920 has yet to be truly answered.

Almost every single aspect of the game of baseball has been quantified in advanced fashion. There are advanced baserunning metrics, advanced fielding metrics, and even statistics about how many rotations the ball makes on every single throw in the ballpark. So why is there no advanced statistic about RBI? Why has the question been left unanswered for so long? To answer it, I have created the statistic Weighted Runs Batted In Efficiency (wRBIe).

wRBIe is the weighted sum of a hitter’s efficiencies at driving in runners from each base. This statistic directly addresses the two assumptions from the RBI. First, like OBI%, it is an efficiency rating, which ensures an equal playing field for all hitters. It doesn’t matter how many plate appearances a player gets with men on 3rd; it solely matters how well they perform in those situations.

Next, wRBIe weights every RBI situation. The weights used in wRBIe build on a metric created by statistical analyst Tom Tango that quantifies the chance of scoring from each base[5]. wRBIe takes that chance of scoring and uses its reciprocal to quantify how difficult it is to score from that base. This means that if the chance of scoring from 1st base is .265, then the difficulty of scoring is 1/.265, which is roughly 3.77. wRBIe then multiplies that weight by the hitter’s efficiency of driving in runs from 1st base. The same is done for each base, and the weighted efficiencies are summed. In all, the formula for wRBIe is:

wRBIe = (((DoS0)(E0)) + ((DoS1)(E1)) + ((DoS2)(E2)) + ((DoS3)(E3))) * ScaleAdj

Where: DoS is the difficulty of scoring (the reciprocal of the chance of scoring), E is the hitter’s efficiency at driving in runners from each base (runners driven in from that base divided by plate appearances with runners at that base), and the digits give the location of the runners, with 0 meaning no one on. "ScaleAdj" is the added weight that produces a simple scale, with the exact adjustment being 2000/31. The difficulties of scoring are as follows:

Table 3

| No one on | Man on 1st | Man on 2nd | Man on 3rd |
|---|---|---|---|
| 8.403361345 | 3.773584906 | 2.37529691 | 1.66944908 |

And the scale is as follows:

Table 4

| Rating | wRBIe |
|---|---|
| Superstar | 135 |
| All Star | 120 |
| Starter | 100 |
| Role Player | 85 |
| Bench | 70 |
| Scrub | Less than 70 |
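To make the formula concrete, here is a minimal Python sketch of the calculation. It uses the Table 3 weights and the 2000/31 adjustment as given. The stat line is made up purely for illustration, the Table 4 values are read as lower bounds for each rating, and treating an empty sample as zero efficiency is likewise an assumption rather than part of the definition.

```python
# Difficulty of scoring (DoS) by runner location, copied from Table 3.
# Keys: 0 = no one on, 1 = man on 1st, 2 = man on 2nd, 3 = man on 3rd.
DOS = {0: 8.403361345, 1: 3.773584906, 2: 2.37529691, 3: 1.66944908}
SCALE_ADJ = 2000 / 31

# Table 4 ratings, read here as lower bounds (an assumption).
TIERS = [(135, "Superstar"), (120, "All Star"), (100, "Starter"),
         (85, "Role Player"), (70, "Bench")]


def efficiency(driven_in, plate_appearances):
    """Runners driven in from a base divided by plate appearances in that situation."""
    # Returning 0.0 for an empty sample is an assumption, not part of the formula.
    return driven_in / plate_appearances if plate_appearances else 0.0


def wrbie(eff):
    """Weighted sum of the four efficiencies, multiplied by the scale adjustment."""
    return SCALE_ADJ * sum(DOS[base] * eff[base] for base in DOS)


def rating(score):
    """Map a wRBIe score onto the Table 4 scale."""
    for floor, label in TIERS:
        if score >= floor:
            return label
    return "Scrub"


# Hypothetical stat line: (runs driven in, plate appearances) for each situation.
# With no one on, the only run available is the batter himself via a home run.
sample = {0: (20, 350), 1: (15, 150), 2: (20, 100), 3: (24, 60)}
eff = {base: efficiency(*counts) for base, counts in sample.items()}
score = wrbie(eff)
print(f"wRBIe = {score:.1f} ({rating(score)})")
```

With these made-up inputs the score comes out near 129, which falls in the All Star band of Table 4. Because every term is a rate, doubling both the runs driven in and the plate appearances in any situation leaves the score unchanged.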

Now that there is finally a metric that quantifies a hitter’s ability to drive in runs, the leaderboard can be found. For the first time, it will be possible to find which hitters are truly the best at driving in runs. The following table, Table 5, compares the top 20 hitters of 2019 in terms of wRBIe and RBI:

Table 5[6]

| Rank | Top wRBIe | Top RBI |
|---|---|---|
| 1 | Nelson Cruz | Anthony Rendon |
| 2 | Freddie Freeman | Jose Abreu |
| 3 | DJ LeMahieu | Freddie Freeman |
| 4 | Eric Hosmer | Pete Alonso |
| 5 | Anthony Rendon | Eduardo Escobar |
| 6 | Max Kepler | Nolan Arenado |
| 7 | Anthony Rizzo | Jorge Soler |
| 8 | Nolan Arenado | Xander Bogaerts |
| 9 | Charlie Blackmon | Josh Bell |
| 10 | Pete Alonso | Cody Bellinger |
| 11 | Bryce Harper | Rafael Devers |
| 12 | Rafael Devers | Bryce Harper |
| 13 | Mike Trout | Alex Bregman |
| 14 | Cody Bellinger | Juan Soto |
| 15 | Marcus Semien | Eddie Rosario |
| 16 | Eddie Rosario | Nelson Cruz |
| 17 | Josh Bell | JD Martinez |
| 18 | Alex Bregman | Mike Trout |
| 19 | Danny Santana | Yuli Gurriel |
| 20 | Austin Meadows | Eugenio Suarez |

Table 5 shows exactly what wRBIe hoped to accomplish: giving true credit to hitters who can drive in runs at an elite level on a neutral playing field. The RBI had been the quickest, easiest, simplest way to quantify a hitter’s ability to drive in a run, and there is some overlap between the two leaderboards: 12 of the 20 names (60%) appear on both. But the fact that leadoff hitters like DJ LeMahieu (3) and Max Kepler (6) are on this list, and that 20% of these players are on teams with below-average offenses, shows the power of a neutral playing field.
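The overlap between the two leaderboards can be checked directly from the names in Table 5. The sketch below, in Python, copies both columns as listed and counts the players who appear on each; the variable names are illustrative only.

```python
# Both columns of Table 5 (2019), in rank order.
top_wrbie = ["Nelson Cruz", "Freddie Freeman", "DJ LeMahieu", "Eric Hosmer",
             "Anthony Rendon", "Max Kepler", "Anthony Rizzo", "Nolan Arenado",
             "Charlie Blackmon", "Pete Alonso", "Bryce Harper", "Rafael Devers",
             "Mike Trout", "Cody Bellinger", "Marcus Semien", "Eddie Rosario",
             "Josh Bell", "Alex Bregman", "Danny Santana", "Austin Meadows"]
top_rbi = ["Anthony Rendon", "Jose Abreu", "Freddie Freeman", "Pete Alonso",
           "Eduardo Escobar", "Nolan Arenado", "Jorge Soler", "Xander Bogaerts",
           "Josh Bell", "Cody Bellinger", "Rafael Devers", "Bryce Harper",
           "Alex Bregman", "Juan Soto", "Eddie Rosario", "Nelson Cruz",
           "JD Martinez", "Mike Trout", "Yuli Gurriel", "Eugenio Suarez"]

rbi_names = set(top_rbi)

# Players on both lists, and wRBIe leaders the RBI list misses entirely.
shared = [name for name in top_wrbie if name in rbi_names]
wrbie_only = [name for name in top_wrbie if name not in rbi_names]

overlap_share = len(shared) / len(top_wrbie)
print(f"{len(shared)} of {len(top_wrbie)} names appear on both lists ({overlap_share:.0%})")
print("Only on the wRBIe list:", ", ".join(wrbie_only))
```

The second list is the more interesting output, since it names the hitters that a raw RBI count leaves out.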

Still, while looking at this table, one may wonder whether the Team Effect has actually been neutralized. Table 2 shows that in 2019, 70% of the players ranked in the top 10 in RBI played for above-average offenses in terms of runs, 20 percentage points lower than the figure for wRBIe. However, this pattern is unique to 2019, which was an outlier. From 2015 to 2019, 74% of the players ranked in the yearly top 10 in wRBIe played for above-average offenses, while 82% of the yearly top 10 RBI collectors did, a difference of 8 percentage points. The biggest difference came in 2016, when 80% of the top 10 RBI hitters played for above-average offenses compared with 50% for wRBIe. This 8-point gap highlights two very important results. First, wRBIe is not simply a function of the team for which a player plays: wRBIe leaders play for above-average offenses noticeably less often, creating a more equal world for this statistic. Second, 8 points is not that substantial, which can be attributed to the fact that players who can drive in runs efficiently tend to play for the best teams, as they did 74% of the time over the last 5 years. Prior to wRBIe, front offices combined many different statistics, such as wRAA, RC and OBI%, to try to find players who drive in runs efficiently; now wRBIe provides one simple statistic to quantify this aspect of hitting.

Furthermore, in Table 5, only 55% of the wRBIe leaders hit in either the 3rd or 4th spot of their teams’ lineups. This is a sizeable difference compared with 80% of the RBI leaders. Table 6 below shows that this trend has been consistent over the past 5 years, as wRBIe leaders are 14 percentage points less likely to hit in the premium spots:

Table 6

Percentage of Top 20 Yearly Performers Batting 3rd or 4th from the years 2015-2019

| Statistic | Percentage |
|---|---|
| RBI | 77% |
| wRBIe | 63% |

Table 6 shows the exact reason why wRBIe was created, and how it solves the issue. wRBIe creates more of an opportunity for players hitting outside of the two most premium spots in the lineup in terms of quality of at bats (refer back to Table 1). Table 6 shows a decrease of 14 percentage points in players hitting 3rd or 4th from RBI to wRBIe, meaning that there is much more opportunity for players hitting in other spots of the lineup to be recognized for their ability to drive in runs. For example, leadoff hitters are now more than 3 times as likely to appear on a wRBIe leaderboard as on an RBI leaderboard. This shows that just as wRBIe is not a function of the Team Effect like the RBI, wRBIe is also not a function of the Batting Order.

The two major applications of wRBIe have to do with lineup creation and pinch hitting. First, wRBIe can spark two schools of thought when creating a lineup, both valid. One would be to slot the hitters with the best wRBIe, the best at driving in runs, in the 3/4/5 spots, because those are the plate appearances in which runners are most often on base. This would give those hitters the best chance of maximizing their abilities. Conversely, another manager may say that because their top wRBIe hitters can produce runs in any situation, they won’t slot them in 3/4/5 and will instead give those spots to less efficient players who need more help driving in runs. For example, a player with a wRBIe above 120 (see Table 4 for reference) will be able to drive in runs regardless of how many plate appearances come with men on base, whereas a player with a wRBIe in the range of 70-100 will drive in runs at a higher rate if slotted in the 3/4/5 spots, because a premium batting spot brings more plate appearances with men on base.

The next application has to do with substitutions during a game. In general, there are four reasons for a manager to substitute a non-pitcher during a game: injury, defensive substitution, pinch running or pinch hitting. Typically, when a manager makes a pinch-hitting substitution, it is to have a better chance of driving in a run in a critical spot. Not only does this situation show the power of pinch hitting in general, it also shows how wRBIe can change the roles of benches forever. Throughout the year, teams acquire players in order to further their chances of winning a World Series. Teams are allowed to have up to 13 batters at a time, meaning 5 bench hitters, so it would make sense to fill one’s bench with the players who are most effective at driving in runs, now measured most accurately with wRBIe. Now that there is a way to quantify one’s ability to drive in runs, teams will search for those hitters to stock their benches so they can use their pinch-hitting tools more effectively and efficiently. However, when it comes to pinch hitting, the situation at hand is crucial. Factors such as the pitcher’s handedness, the number of outs, how well the hitter has fared in previous plate appearances against the pitcher and the location of the runners will affect the manager’s decision about whom to use as a pinch hitter. So it is the front office’s job to provide the manager with the batters most efficient at driving in runs to choose from, a job that is most effectively done using wRBIe.

In 2019, Anthony Rendon was the kid in high school who was the favorite student and was given an easy test. He passed the test with flying colors and made the best of his situation. Fans were quick to deem him the best at driving in runs. In reality, though, Nelson Cruz, 16th in RBI, led the league in wRBIe and should have been crowned the best at driving in runs, as he had one of the best seasons ever recorded at 153 wRBIe, while Anthony Rendon was 5th with a 138 wRBIe. There is a consensus that the RBI is going to be obsolete in the near future; now, for the first time ever, wRBIe is a direct advanced replacement that finally answers the question, "Who is the best at driving in runs?"


[2] According to FanGraphs

[3] According to FanGraphs

[4] According to Baseball Prospectus

[6] All wRBIe data gathered from baseballprospectus.com