For regular readers of this site, it is pretty much assumed that the use of analytics can have a positive impact on a team's on-the-field performance. Teams can use analytics to gain a competitive advantage in many different areas, such as front office decision making and in-game strategy. The use of analytics has grown dramatically throughout front offices in recent years, but teams differ in how much they use analytics and how big of a role it has in their decision making process.
If the use of analytics truly has an impact on a team's on-the-field performance, then it seems reasonable to expect analytically-minded teams to win more games than teams that are not as analytically-minded (after controlling for other relevant factors such as payroll). Anecdotally, we can point to many examples of teams doing well because of analytics, including the A's, the Rays, and more recently, the Astros. However, there are also teams such as the Phillies that have had stretches of success despite being openly hostile towards analytics in recent years.
With that in mind, I thought I would take a look at the 2015 season to see if analytical teams were more successful overall than non-analytical teams. To determine how analytical each team truly is, I decided to use The Great Analytics Rankings, which were published by ESPN prior to this season. (Ben Baumer wrote the MLB-specific rankings.) For those of you who may not be familiar with these rankings, the following introduction is provided by ESPN.
"ESPN The Magazine and ESPN.com unleashed our experts and an army of researchers to rate 122 teams on the strength of each franchise's analytics staff, its buy-in from execs and coaches, its investment in biometric data and how much its approach is predicated on analytics. After looking at the stats, reaching out to every team and dozens of informed sources and evaluating each front office, we ranked an overall top 10 and bottom 10 and placed each team in one of five tiers by sport."
While it is probably hard to make these rankings truly objective, I feel comfortable using them for this exercise, since it appears as though they were developed using a thorough approach with many different sources of input. It also helps that these rankings were made before the season started, since this potentially eliminates any bias resulting from a team's results in 2015.
As stated above, each team was placed in one of five tiers: all-in, believers, one foot in, skeptics, and nonbelievers. To start, here's a look at how teams are distributed among those tiers.
While most teams in baseball appear to be involved in analytics to some extent at this point, the 2015 playoff teams skew heavily towards the more analytical side of this distribution. In total, the 2015 playoff contenders consist of five teams in the "All In" tier (Cardinals, Pirates, Cubs, Yankees, and Astros), four teams in the "Believers" tier (Royals, Blue Jays, Dodgers, and Mets), and one team in the "One Foot In" tier (Rangers).
Before we take this as conclusive evidence that a team's use of analytics can improve its winning percentage and odds of reaching the playoffs, I thought it would be helpful to see how often we would get the above distribution of playoff teams by chance. If a team's analytics tier had no effect on whether it made the playoffs, what would we expect the distribution of playoff teams to look like, and how unusual would it be to get the distribution above?
To answer this question, I decided to use a chi-squared test, a form of hypothesis testing which determines how consistent an observed sample of data is with a particular theoretical (i.e. expected) distribution. Since ten out of thirty teams qualify for the postseason, our theoretical distribution would have one-third of teams in each tier making the postseason. While a particular random sample may not have this exact distribution, there are certain random samples that are unlikely to occur, and the chi-squared test attempts to quantify the unlikeliness of a particular distribution like the one above.
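For readers who want to see the mechanics, here is a minimal sketch of one plausible reading of that test: a goodness-of-fit statistic comparing the number of playoff teams in each tier with the one-third we would expect by chance. The playoff counts (5, 4, 1, 0, 0) come from the tiers listed above, but the tier sizes are placeholders, since the distribution chart is not reproduced here. Equal tiers of six teams are assumed purely for illustration, so the result will not match the value reported in this article. The function name and parameters are my own.

```python
def playoff_chi_squared(tier_sizes, playoff_counts, playoff_rate=10 / 30):
    """Goodness-of-fit statistic: sum of (observed - expected)^2 / expected.

    expected = tier size * share of all teams that reach the postseason.
    """
    if len(tier_sizes) != len(playoff_counts):
        raise ValueError("tier_sizes and playoff_counts must have the same length")
    statistic = 0.0
    for size, observed in zip(tier_sizes, playoff_counts):
        expected = size * playoff_rate
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Playoff teams per tier, from "All In" down to "Nonbelievers" (from the article).
playoff_counts = [5, 4, 1, 0, 0]

# ASSUMPTION: placeholder tier sizes (equal tiers), NOT the actual ESPN tiers.
placeholder_tier_sizes = [6, 6, 6, 6, 6]

example_statistic = playoff_chi_squared(placeholder_tier_sizes, playoff_counts)
print(round(example_statistic, 2))
```

Swapping in the actual number of teams in each tier is the only change needed to run this calculation on the real rankings.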
In this particular example, I ended up with a chi-squared value of 5.69, which corresponds to a p-value of 0.22. Unfortunately, this p-value is well above the conventional 0.05 threshold, so we cannot reject the null hypothesis and say definitively that a team's analytics ranking has something to do with whether or not it made the playoffs. Even so, the presence of analytically minded teams in the playoffs is something worth keeping an eye on in the coming years.
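As a quick check on that conversion, the sketch below turns a chi-squared value into a p-value using only the standard library. It assumes four degrees of freedom (five tiers minus one), which the article does not state explicitly but which is consistent with the reported figures. The helper name is my own, and the closed-form series it uses only applies when the degrees of freedom are even.

```python
import math


def chi_squared_p_value(statistic, degrees_of_freedom):
    """Upper-tail probability of a chi-squared distribution (even df only).

    For df = 2m, P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if degrees_of_freedom <= 0 or degrees_of_freedom % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = statistic / 2.0
    term = 1.0   # the i = 0 term of the series
    total = 1.0
    for i in range(1, degrees_of_freedom // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# ASSUMPTION: df = 4, i.e. five tiers minus one.
p_value = chi_squared_p_value(5.69, 4)
print(round(p_value, 2))  # 0.22
```

If SciPy is available, `scipy.stats.chi2.sf(5.69, 4)` should give the same number without the even-df restriction.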
Let's try a different approach. Here is a scatterplot which shows the analytics tier and 2015 winning percentage of all 30 teams. (I placed the analytics tier on a 1-5 scale, with five corresponding to the "all in" tier, four corresponding to the "believers" tier, etc.)
There appears to be a moderately strong positive relationship between a team's winning percentage and analytics tier, at least in 2015. The coefficient of determination is 0.369, which means that nearly 37 percent of the variance in team winning percentage in 2015 can be explained by a team's analytics tier. While it is true that a lot of different factors go into a team's analytics ranking, many of which can have an impact throughout an organization, I find it fascinating that a single factor like this could be associated with an effect of this magnitude.
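To put the coefficient of determination in more familiar terms, it can be converted to a correlation coefficient. The sketch below assumes the fit is a simple one-variable linear regression, in which case R-squared is just the square of the Pearson correlation. The sign is taken as positive because the relationship described above is positive.

```python
import math

# Coefficient of determination reported in the article.
r_squared = 0.369

# ASSUMPTION: simple linear regression with one predictor, so r = +sqrt(R^2)
# (positive because the trend between tier and winning percentage is positive).
correlation = math.sqrt(r_squared)

print(round(correlation, 3))  # 0.607
```

A correlation of roughly 0.61 fits the "moderately strong" description of the scatterplot.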
Ultimately, we probably cannot say a whole lot based on one season's worth of data, but it is nonetheless reassuring to see evidence of something we've assumed to be true for a long time, namely, that an increased use of analytics can be a factor in a team's on-the-field success. While almost all major league teams use analytics to some extent in this day and age, there are still differences between the most analytically minded teams and the rest of the league. It remains to be seen how long these differences will remain and what sort of impact they will have on team success moving forward.
* * *