The methodology behind the Baseball Hall of Fame projections

Yesterday, I released my projections for the Baseball Hall of Fame. Today, I explain the methodology behind them.

National Baseball Hall of Fame Induction Ceremony Photo by Jim McIsaac/Getty Images

I released my Baseball Hall of Fame projections for the Class of 2019 yesterday. As promised within that article, I want to dedicate this piece to explaining my methodology behind the projections. This will allow for more clarity as to where the numbers are coming from, in hopes that I can improve upon my model in future years.

The Sample

Thanks to the collection efforts of Ryan Thibodaux, I was able to get a simple random sample of voters. I wanted a sample large enough to produce accurate results. We learn in AP Statistics that, for categorical (non-numerical) data, (total votes)*(percentage) and (total votes)*(1-percentage) should both be greater than five. In our sample, this would mean, for example, that (50)(.75) needs to be greater than 5, as does (50)(.25). When both conditions hold, the sampling distribution for each individual player is approximately normal.

Every player except Mariano Rivera passed this normality test with our 50-ballot sample. Rivera failed only because no ballot in the sample left him off, so (50)(1-percentage) is zero for him. Because the vast majority of players pass this test, we are able to build confidence intervals that are reasonably accurate and not badly skewed. (Rivera’s confidence interval, for example, goes above 100 percent. We know that this is not possible.)
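The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `passes_large_counts` is hypothetical, and the two cases use figures stated in this article (Barry Bonds with 35 votes on 50 ballots, and Rivera appearing on all 50).

```python
def passes_large_counts(n_ballots: int, votes_for: int, threshold: int = 5) -> bool:
    """Check that n*p and n*(1-p) both exceed the threshold.

    With p = votes_for / n_ballots, n*p is simply the number of "yes"
    ballots and n*(1-p) is the number of "no" ballots, so the check can
    be done on the raw counts.
    """
    votes_against = n_ballots - votes_for
    return votes_for > threshold and votes_against > threshold


# Bonds: 35 yes and 15 no, so both counts clear the threshold.
print("Bonds passes:", passes_large_counts(50, 35))

# Rivera: 50 yes and 0 no, so the second count fails.
print("Rivera passes:", passes_large_counts(50, 50))
```

Rivera fails because the "no" count is zero, which is why his interval misbehaves.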

The second important factor here is the rule of independence. Basically, what this states is that one data point cannot influence another. Generally speaking, in order to satisfy this condition, you want your sample to be smaller than 10 percent of your entire population. In this case, that would be about 41 ballots. My AP Statistics teacher, Mr. Grossman, said that this is just a guideline; a 50-ballot model run would be fine. But, now that Thibodaux has collected upwards of 80 ballots, the odds that the standard deviations could be weirdly skewed are much higher.
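As a rough sketch of the 10 percent guideline, the snippet below checks a few sample sizes against an electorate. The electorate size of 412 is an assumption, picked only because 10 percent of it is about 41 ballots; the function name is hypothetical as well.

```python
# Assumption: an electorate of 412 voters, chosen so that 10 percent
# is "about 41 ballots". The article does not state the exact figure.
ASSUMED_ELECTORATE = 412


def within_ten_percent(sample_size: int, population_size: int) -> bool:
    """True when the sample is smaller than 10 percent of the population."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return 10 * sample_size < population_size


for n in (41, 50, 80):
    print(n, "ballots within guideline:", within_ten_percent(n, ASSUMED_ELECTORATE))
```

Under that assumed electorate, 41 ballots satisfy the guideline while 50 and 80 do not, which matches the concern about running the model on a larger tracked sample.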

The Projection for Returning Players

With my sample clearly established, I thought I could then use basic statistical modeling to determine my projection. I was wrong.

When I initially decided to tackle this project in late November, I built a simple system that treated the ballots that Thibodaux collected as a representative sample of my entire population.

The issue with this, though, is that it’s not representative. Voters who make their ballots public vote differently from those who don’t. There are even differences between voters who make their ballots public before the Hall of Fame announcement and those who make their ballots public after.

I built all of this in, weighting each of these expected percentages based on the general breakdown in previous years. For example, public, pre-announcement ballots generally make up 60 percent of the overall electorate; public, post-announcement ballots generally make up 15 percent of the overall electorate; and private ballots make up the remaining 25 percent.

What I had from Thibodaux’s collection was a simple random sample of public, pre-announcement ballots only. From that data alone, there is no way to determine how the other two groups will vote.

I solved this problem simply, but (I hope) effectively. I realized that private voters don’t generally change their voting trends in droves, to the point that I could even assume that a player would receive the exact same percentage among private ballots in back-to-back years. Of course, these percentages can change, and any small changes are built into the standard deviation. But, when backtesting this model on the Class of 2018, I found that this was an incredibly effective method.

For example, after 50 ballots, Barry Bonds had received 35 votes, or 70 percent of the ballots in the sample. Across private ballots last year, Bonds only received 41.9 percent of the vote. So, in my model, I am assuming that Bonds will receive 41.9 percent of the private vote again this year. While that could increase (or decrease), I found from previous years that this does not change significantly from year to year, so I felt fine using this as the best estimate of how the private voters would vote again this year.
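To make the blending concrete, here is a minimal sketch of how the three groups could be combined for a returning player, assuming the 60-15-25 split described earlier. The function name is hypothetical. Bonds’ post-announcement public share from last year is not given in this piece, so the example only shows the part of his projection that the stated numbers pin down, plus the range the missing term could add.

```python
# Electorate shares stated in the article.
W_PRE_PUBLIC = 0.60   # public ballots revealed before the announcement
W_POST_PUBLIC = 0.15  # public ballots revealed after the announcement
W_PRIVATE = 0.25      # ballots never made public


def project_returning(pre_public_now: float,
                      post_public_prev: float,
                      private_prev: float) -> float:
    """Blend this year's tracked share with last year's shares for the
    two groups that cannot be observed yet (all values as fractions)."""
    return (W_PRE_PUBLIC * pre_public_now
            + W_POST_PUBLIC * post_public_prev
            + W_PRIVATE * private_prev)


# Bonds: 35 of 50 tracked ballots this year, 41.9 percent of private
# ballots last year. The post-announcement share is unknown here, so
# bracket it between 0 and 1 instead of guessing a value.
lowest = project_returning(35 / 50, 0.0, 0.419)
highest = project_returning(35 / 50, 1.0, 0.419)
print(f"Known contribution: {lowest:.1%}")
print(f"Upper bound with the missing term: {highest:.1%}")
```

The gap between the two printed figures is exactly the 15 percent weight on post-announcement public ballots.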

I used previous year vote totals for about 40 percent of each returning player’s overall projection, so this is likely the biggest fault in my system. If any player sees an uncharacteristic spike in their post-announcement or private vote totals, then my model may suffer. This is an assumption that I am making that could prove to be costly.

One player this could significantly impact is Edgar Martinez. Martinez could receive a significantly higher proportion of the private ballot because it is his final year on the ballot. Voters tend to give players a longer look when they are at risk of losing their Hall of Fame eligibility. This could happen for Martinez among private voters especially, as they only voted for him 52.4 percent of the time in 2018. So, it’s particularly encouraging that my model believes that Martinez has a 64 percent chance of being elected with the 2018 private ballot proportion built in.

The Projection for First Ballot Players

Since my methodology rests heavily on the results from the previous year, this system would not work for first ballot players, who had no previous results.

I got around this by looking at the pre-announcement public, post-announcement public and private vote percentages from every previous year that tracking was available. I figured out the average drops and applied them to all of the first ballot players.

This isn’t incredibly complex, I know, but I expect it to work out decently well. Alleged steroid users tend to have the largest disparity between public and private voters, and since we are pretty much past that era in terms of first ballot players on the Hall of Fame ballot, we can reasonably expect that the difference between public and private voters will be marginal.

Consider Chipper Jones, who was a first ballot candidate last year. He received 98.4 percent of the pre-announcement, public votes; 97.1 percent of the post-announcement, public votes; and 94.3 percent of the private votes. Weighting these under the 60-15-25 system that I had previously established, we would expect Jones to finish with 97.18 percent of the vote overall. He finished with 97.2 percent.

I applied this simple method to all of the first ballot players, and all it really did was adjust their pre-announcement, public vote percentage by 0.4 percentage points.
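The drop-based adjustment can be sketched as follows. This is an illustration under stated assumptions, not the actual model: the function names are hypothetical, and the history contains only Chipper Jones’ 2018 splits because those are the only ones quoted here, whereas the real calculation averaged every year with tracking data.

```python
# Electorate shares: pre-announcement public, post-announcement public, private.
WEIGHTS = (0.60, 0.15, 0.25)


def drops_from_history(history):
    """Average drop, in percentage points, from the pre-announcement
    public share to each of the other two groups."""
    n = len(history)
    post_drop = sum(pre - post for pre, post, _ in history) / n
    private_drop = sum(pre - private for pre, _, private in history) / n
    return post_drop, private_drop


def project_first_ballot(pre_public_now, post_drop, private_drop):
    """Apply the average drops to a new player's tracked share and blend."""
    w_pre, w_post, w_private = WEIGHTS
    return (w_pre * pre_public_now
            + w_post * (pre_public_now - post_drop)
            + w_private * (pre_public_now - private_drop))


# One-player history: (pre-announcement, post-announcement, private).
history = [(98.4, 97.1, 94.3)]  # Chipper Jones, 2018
post_drop, private_drop = drops_from_history(history)

# Feeding Jones' own pre-announcement share back in reproduces his blend.
jones_projection = project_first_ballot(98.4, post_drop, private_drop)
print(round(jones_projection, 2))
```

With a longer history, the same two functions would apply the averaged drops to any first ballot player’s pre-announcement share.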

The Confidence Intervals

The last thing that I calculated was an 80 percent confidence interval for each player. I decided on the 80 percent confidence level mainly because this is what FiveThirtyEight uses for all of their election projections.

An 80 percent confidence interval is less likely to contain the final result than a 95 percent confidence interval, but it is also narrower, which makes it more useful. A 95 percent confidence interval would not tell you a lot if it projects a player to have a final vote total between 50 and 90 percent, even though you would be 95 percent confident that their final vote percentage would fall in that range. An 80 percent confidence interval may narrow that range from 50-90 percent down to 60-80 percent, a much better indication of where a player’s vote total may fall, even though you are less confident in it.
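The trade-off between the two confidence levels can be seen with a plain one-sample interval for a proportion. This sketch is not the weighted interval used in the model; it uses the textbook formula on Bonds’ 35 votes from 50 ballots, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(votes_for: int, n_ballots: int, confidence: float):
    """Normal-approximation confidence interval for a sample proportion."""
    p_hat = votes_for / n_ballots
    # Two-sided critical value, e.g. about 1.28 for 80 percent.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n_ballots)
    return p_hat - margin, p_hat + margin


low_80, high_80 = proportion_interval(35, 50, 0.80)
low_95, high_95 = proportion_interval(35, 50, 0.95)

print(f"80 percent interval: {low_80:.1%} to {high_80:.1%}")
print(f"95 percent interval: {low_95:.1%} to {high_95:.1%}")
```

The 80 percent interval comes out narrower than the 95 percent one on the same data, which is the whole reason for choosing it.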

In order to determine the confidence intervals, I weighted my sampling distribution standard deviation, being sure to remove the votes that have already been cast. You can’t have any variance among voters who have already cast their votes, as you know exactly how those votes will pan out.

Final Thoughts

As I stated in my piece from yesterday, when running this model for last year’s players, it predicted a player’s total vote percentage within 2.7 percent accuracy. I would love it if I could have similar results this year, but for now I am staying on the more cautious side and hoping for within 5-7 percent accuracy. We will see how it does, and I’ll be sure to fill all of you in after the announcements are officially made in late January.

Now, I’m going to leave you on this... a visual rendition of my model on a whiteboard in a high school classroom:


Devan Fink is a Featured Writer for Beyond The Box Score. You can follow him on Twitter @DevanFink.