The Hall of Fame election results will be announced Tuesday at 2 p.m. EST. With so many qualified candidates it's almost guaranteed that one or more deserving candidates will be left out, although it doesn't have to stay that way if voters are willing to add more players to their ballots as I discussed last week. This post predicts not just who will be selected tomorrow, but with what percentage of the vote.
Tom Tango placed a link on his site to a ballot tracker from Ryan Thibs, who has been collating ballots that have been publicly announced since 2009, and with his permission I used this data to predict what I think this year's results will be. I used information from 2009-2014 to see how well the vote of the ballots that were made known prior to the announcement (hereafter referred to as sample ballots) squared up with actual results. This is a screen grab of a Tableau data viz:
This shows the vote for a given player on the sample ballots on the horizontal axis and the actual vote on the vertical (hovering over the individual data points on the data viz shows more information). Data points highlighted in red are outliers, occasions where the sample vote and the actual vote differed by more than ten percentage points -- for example, in 2010 Roberto Alomar received 89.1 percent of the vote in the sample ballots but only 73.7 percent of the actual vote (he's the red dot near the upper right). In general, the model works well at the high and low ends with more variability in the middle -- for example, it was no surprise Greg Maddux received the votes he did in 2014, nor were many of the candidates who received low vote totals much of a surprise. The variability was in players like Craig Biggio and Jeff Bagwell, those right on the cusp of enshrinement.
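For readers who want to reproduce the red highlighting, here is a minimal sketch of the outlier rule described above. The function names are my own for illustration, and the only data point used is the Alomar example quoted in the text.

```python
# Minimal sketch of the "red dot" rule: flag a player-year when the sample-ballot
# share and the actual share differ by more than ten percentage points.
# Function names are hypothetical; the Alomar numbers come from the post.

OUTLIER_GAP_POINTS = 10.0  # threshold stated in the post


def vote_gap(sample_pct: float, actual_pct: float) -> float:
    """Return sample minus actual, in percentage points."""
    return sample_pct - actual_pct


def is_outlier(sample_pct: float, actual_pct: float,
               threshold: float = OUTLIER_GAP_POINTS) -> bool:
    """True when the absolute gap exceeds the threshold."""
    return abs(vote_gap(sample_pct, actual_pct)) > threshold


if __name__ == "__main__":
    # Roberto Alomar, 2010: 89.1 percent on sample ballots, 73.7 percent actual.
    gap = vote_gap(89.1, 73.7)
    print(f"Alomar 2010 gap: {gap:.1f} points, outlier: {is_outlier(89.1, 73.7)}")
```

Alomar's sample share ran about 15.4 points ahead of his actual share, which is why he shows up in red.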
The very legitimate question becomes whether these sample ballots are truly representative of the vote as a whole or themselves outliers that can skew the results. For example, assume a voter selects a player with zero chance of enshrinement and that he's the only person listed on his ballot. This year, with the number of highly qualified candidates, this is almost unconscionable, and if I were that voter I'm not so sure I'd make my ballot public. Likewise, a voter can do a "good guy" vote, voting for a player he covered to whom he wants to give an "attaboy" as a final gesture. In a year where every ballot slot is precious, this might be a luxury, and again one I might not necessarily be willing to share.
I suspect these types of ballots are the extreme minority. As the graph shows, the sample ballots correlate well with the final vote, and using the 2009-2014 data yields the following equation:
Predicted voting percent = (.899626 * sample pct) + .0324881
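As a quick illustration of how the equation is applied, here is a small sketch. It assumes the percentages are expressed as fractions between 0 and 1 (the size of the intercept suggests this, but it is my reading), and it uses the BBWAA's 75 percent election threshold, which isn't part of the equation itself. The function names are hypothetical.

```python
# Sketch of applying the fitted line from the post.
# Assumption: vote shares are fractions (0.891 means 89.1 percent).

SLOPE = 0.899626       # coefficient quoted in the post
INTERCEPT = 0.0324881  # intercept quoted in the post
ELECTION_THRESHOLD = 0.75  # BBWAA election requires 75 percent of ballots


def predict_actual_share(sample_share: float) -> float:
    """Project the final vote share from the sample-ballot share."""
    return SLOPE * sample_share + INTERCEPT


def sample_share_needed(target_share: float = ELECTION_THRESHOLD) -> float:
    """Invert the line: the sample share at which the projection hits the target."""
    return (target_share - INTERCEPT) / SLOPE


if __name__ == "__main__":
    # Alomar's 2010 sample share, quoted earlier in the post.
    print(f"Projection for an 89.1% sample share: {predict_actual_share(0.891):.1%}")
    print(f"Sample share needed to project to 75%: {sample_share_needed():.1%}")
```

Under these assumptions a player needs roughly 80 percent of the sample ballots to project to election, and Alomar's 89.1 percent sample share in 2010 would have projected to about 83.4 percent, still well above the 73.7 percent he actually received.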
So what could the 2015 results be using this model?
Bold percentage = elected
As of 5:00 p.m. Sunday CST there were 133 sample ballots, over twenty percent of the total expected ballots. The first column shows the actual vote percentage of the sample ballots and the second the projected vote percentage. Both Randy Johnson and Pedro Martinez have been listed on almost every ballot, and in one case the voter explicitly stated he left them off in order to use his votes on other candidates, with the full expectation that his vote wasn't necessary to guarantee their enshrinement. Were Johnson to maintain his percentage, he would supplant Tom Seaver as the player to receive the highest percentage of votes in Hall of Fame history. I suspect both Martinez and Johnson will have a slightly higher vote percentage than the model predicts.
Craig Biggio and Mike Piazza will quite likely have a sleepless Monday night, because it will come right down to the wire for both of them. On Sunday, as I was waiting until the last possible minute to update this with the most recent information, thirteen additional ballots came in that pushed Piazza out. Even so, if the results hold up as I suggest, the class of four players enshrined by the BBWAA would be the largest since 1955.
As I was writing this post I was having a discussion with Tom Tango that spilled over from Friday evening into Saturday morning in which he made this point (among many others):
"@ScottLindholm For BEST 5-year period, there were probably 12-15 players elected by BBWAA. So, you can't expect them to elect more than 3/yr" -- Tangotiger (@tangotiger), January 3, 2015
He's right -- the most players selected in any recent five-year period is ten, reached in 1999-2003 and in several spans in the 1980s, and one has to go all the way back to the 1960s to find as many as twelve or thirteen. This chart shows the years in which three or more players were enshrined:
|Year|Players|Inductees|
|---|---|---|
|1936|5|Babe Ruth, Christy Mathewson, Honus Wagner, Ty Cobb, Walter Johnson|
|1955|4|Dazzy Vance, Gabby Hartnett, Joe DiMaggio, Ted Lyons|
|1947|4|Carl Hubbell, Frankie Frisch, Lefty Grove, Mickey Cochrane|
|2014|3|Greg Maddux, Tom Glavine, Frank Thomas|
|1999|3|George Brett, Nolan Ryan, Robin Yount|
|1991|3|Fergie Jenkins, Gaylord Perry, Rod Carew|
|1984|3|Don Drysdale, Harmon Killebrew, Luis Aparicio|
|1972|3|Early Wynn, Sandy Koufax, Yogi Berra|
|1954|3|Bill Dickey, Bill Terry, Rabbit Maranville|
|1939|3|Eddie Collins, George Sisler, Willie Keeler|
|1937|3|Cy Young, Nap Lajoie, Tris Speaker|
That's not a very long list.
There are no guarantees for anyone (well, outside of Johnson and Martinez), and as the ballots are tabulated and stories told, we'll hear all sorts of interesting things. Buster Olney and Lynn Henning have chosen to abstain from voting (links go to the stories), both on the premise that since there aren't enough slots to allow them to vote for worthy candidates, they'll abstain and not harm the chances of induction for anyone.
It appears around twenty players will receive votes over the five percent threshold, making for a total of around 200,000 possible ballot combinations. Weighting players with better chances can decrease that number dramatically, but even so, with so many qualified candidates it's unlikely that more than five will be selected -- there are simply too many possibilities for all the votes to coalesce around many more players than that.
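The "around 200,000" figure can be checked with a few lines of arithmetic. This sketch assumes the count refers to full ballots, with each voter choosing ten names (the BBWAA ballot limit, which is not stated above) from a pool of twenty candidates. The function names are hypothetical.

```python
# Sketch: count distinct ballots that can be formed from a pool of candidates.
# Assumptions: 20 viable candidates (the post's rough count) and a ten-name
# ballot limit (the BBWAA rule, not stated in the post).
from math import comb

CANDIDATES = 20
MAX_NAMES = 10


def full_ballots(candidates: int = CANDIDATES, names: int = MAX_NAMES) -> int:
    """Number of ways to fill every slot on the ballot."""
    return comb(candidates, names)


def ballots_up_to_limit(candidates: int = CANDIDATES, limit: int = MAX_NAMES) -> int:
    """Number of distinct ballots listing anywhere from zero names to the limit."""
    return sum(comb(candidates, k) for k in range(limit + 1))


if __name__ == "__main__":
    print(f"Full ten-name ballots: {full_ballots():,}")                    # 184,756
    print(f"Ballots with zero to ten names: {ballots_up_to_limit():,}")    # 616,666
```

Full ballots alone account for 184,756 combinations, which lines up with the rough figure above. Allowing shorter ballots pushes the count past 600,000.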
The 2014 vote already demonstrated that voters would place more players on a ballot, and Biggio fell a couple of votes short of making the Class of 2014 the first in close to sixty years to have four players. Voters are now listing over three more names per ballot than they did as recently as 2012. Tango could be right and voters could revert to the historical norm of electing no more than two or three players a year, or this could be a shift that paves the way for a Hall of Fame more representative of Expansion Era baseball, in which the best players in the world are competing. It's too soon to tell, but the sample ballots so far show a willingness to adapt and change.
I don't often ask for favors, but please send Ryan Thibs a tweet and thank him for compiling this data, because I don't have a post without it. Writing about the Hall of Fame is one of my favorite things to do, and his data allowed me to look at voting patterns in a new way, and that means a lot to me.
. . .
HOF voting data amalgamated from Baseball-Reference. Thanks also to Tom Tango for the spirited discussion on this topic. I was a debater in high school and college (to the eternal joy of my opponents) and truly enjoy a thoughtful give-and-take.
Scott Lindholm lives in Davenport, IA and would have voted for ten players if given the opportunity. Follow him on Twitter @ScottLindholm.