Winston Churchill famously said that the best argument against democracy is a five-minute conversation with the average voter. What most people don't remember is that he immediately followed up his immortal quip by interjecting, "except for the MLB All-Star vote!" (Disclaimer: Story may not be factually accurate.)
Every year millions of stadium-goers and internet surfers set out to prove that they are better than the talking heads on TV at identifying the best players in baseball, and every year they fail miserably. The balloting is still underway for the 2012 Midsummer Classic, but the first round of voting results is an excellent example of the fans' collective incompetence. Prince Fielder over Paul Konerko, Adrian Beltre over Miguel Cabrera, Brian McCann over Carlos Ruiz, and the complete absence of Jason Kipnis and Jed Lowrie from the Top 5 at their respective positions are just a few of the reasons why the All-Star Game is an odd way to determine home field advantage for the World Series.
But while a certain Prime Minister doesn't think you should have the right to vote for the players with the coolest names in the league you don't follow, I'm happy to put up with the fools who think Robert Andino and Chris Getz are more deserving of an All-Star nod than Jason Kipnis. Why? Because the All-Star vote is our only chance to actually enter the mind of the average baseball fan.
Yes, the All-Star electorate almost always gets it wrong. But why do they make dumb choices? What are the bleacher creatures thinking about as they struggle to punch the tiny holes in their ballots between innings?
In order to answer that question, I first filled in all the missing vote totals from the 2011 All-Star Game (only the top eight finishers at each position were released) with my previously calculated estimates of how much support the least popular players received. Next, I identified nine specific tools or skills a position player has that might influence fans' votes: raw power, contact ability, pitch selectiveness, speed, defense (the limitations of my data prevented me from breaking fielding down into its individual facets), hitting it where they ain't, durability, clutchness, and quality of his team.
Finally, I selected the best easily available statistic to measure each trait, took the numbers as of the All-Star Break for each player on the ballot, and performed a series of correlation analyses. I made it a point to use esoteric statistics whenever possible so that this study reflects fans' actual perceptions of different abilities rather than specific numbers. Many people probably consult the home run leaderboards when filling out their ballots, but I doubt anyone does that for Power Factor.
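For anyone who wants to try this at home, here is a minimal sketch of a single correlation analysis of this kind. It assumes a plain Pearson correlation between one statistic and vote totals, which may not be exactly the procedure used for this study; the function names and the sample numbers are made up purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of the variation in ys explained by a straight-line fit on xs."""
    return pearson_r(xs, ys) ** 2


if __name__ == "__main__":
    # Hypothetical inputs, NOT real ballot data: one stat value and one
    # vote total per player, in the same order.
    stat_values = [0.150, 0.210, 0.180, 0.260, 0.120]
    vote_totals = [900_000, 2_100_000, 1_000_000, 1_800_000, 1_200_000]
    print(f"r   = {pearson_r(stat_values, vote_totals):.3f}")
    print(f"R^2 = {r_squared(stat_values, vote_totals):.3f}")
```

Running the same function once per trait, with that trait's statistic in place of `stat_values`, would give one R-squared figure per trait.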
Below is a breakdown of what statistics I used and how strong the relationships were between players' numbers in each category and their vote totals. The R-squared value essentially represents the proportion of the variation in vote totals that each trait explains. For example, an R^2 of 0.100 would imply that the skill in question is responsible for about 10 percent of fans' decisions.
As you can see, the single most important factor in the voting (at least, of the traits I looked at) is playing for a good team. I suspect that there is some confounding in play here, since the better a player plays, the more games his team wins, but that in itself is not enough to take away from the fact that about a quarter of the average fan's decision about who to vote for is based on the other 24 guys on each player's team.
Next on the list is durability, though I would guess that this has more to do with the fact that better players tend to play more, and that more playing time leads to higher totals of home runs, RBI, and steals, the kind of numbers that every fan knows. One would have to adjust for a multitude of factors to get a truly accurate read here, and by that point the numbers would probably be too convoluted anyway.
The only other tool that explains more than 10 percent of fans' votes is power, which makes sense. Clubbing the ball really, really far is both an important part of hitting and exciting to watch. BABIP is also a key factor in run production, but it lacks the flash of light-tower power and its general instability weakens its relationship to overall talent. That the two biggest direct on-field factors are both about batting shows where fans' thoughts are.
Though differences in fielding and speed are generally not as important as differences in hitting ability, I was surprised to see how little relationship there is between them and All-Star votes. I'd posit that the disconnect between defense and voting is not actually as great as it seems, but is weakened by the imperfections of UZR as well as the mistaken impressions fans can get from basing assessments of fielding on casual observation.
Meanwhile, both contact and O-Swing rates have virtually no relationship to All-Star voting; clearly fans don't mind when batters swing at bad pitches and miss. But what I found most fascinating was that clutch has nothing to do with All-Star voting—in fact, the relationship is actually very slightly negative. As with UZR, I assume this is because most fans think of "clutch" more in terms of RBI opportunities than situational leverage. We would need a stat like Junior from Fire Joe Morgan's SCHYCRL-GHW-OH,L (SportsCenter Highlights You Can Remember of Late-Game Homers or Walk-Off Hits, Lately) to get a true measurement.
The table above also lists the coefficients and constant for a linear estimator of All-Star votes based on these nine variables. These tools (or at least the statistics that purport to measure them) explain about 48 percent of the variation in All-Star balloting, with a standard deviation of about one million votes. Here's how they line up:
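To make the idea of a linear estimator concrete, here is a rough sketch of how one could be applied and scored. The trait names, coefficients, and constant below are placeholders invented for illustration, not the values from the table, and the helper names are hypothetical. The estimate is simply the constant plus each coefficient times the matching statistic, and the share of variation explained is one minus the ratio of leftover squared error to total squared variation.

```python
def predict_votes(stats, coefficients, constant):
    """Linear estimate: constant + sum of (coefficient * stat) over all traits."""
    missing = set(coefficients) - set(stats)
    if missing:
        raise KeyError(f"missing stats for: {sorted(missing)}")
    return constant + sum(coefficients[name] * stats[name] for name in coefficients)


def explained_variation(actual, predicted):
    """R-squared of a set of predictions: 1 - (residual SS / total SS)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of the same length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are all identical")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # PLACEHOLDER numbers, made up for this example. The real estimator uses
    # nine variables with the coefficients and constant listed in the table.
    placeholder_coefficients = {"team_win_pct": 3_000_000.0, "power": 2_000_000.0}
    placeholder_constant = -1_000_000.0

    # Hypothetical players: their stats and the votes they actually received.
    players = [
        ({"team_win_pct": 0.600, "power": 0.250}, 1_500_000),
        ({"team_win_pct": 0.450, "power": 0.150}, 400_000),
        ({"team_win_pct": 0.520, "power": 0.300}, 1_300_000),
    ]
    predicted = [
        predict_votes(stats, placeholder_coefficients, placeholder_constant)
        for stats, _ in players
    ]
    actual = [votes for _, votes in players]
    print("estimated votes:", [round(p) for p in predicted])
    print(f"share of variation explained: {explained_variation(actual, predicted):.3f}")
```

With the real coefficients and all nine statistics plugged in, `explained_variation` is the calculation that would produce a figure like the 48 percent quoted above.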