The Most Biased Fans in Baseball (Are Not Who You Think)

Last week, I introduced a new novelty statistic called Star Power as part of an objective framework for the MLB All-Star vote that accounts for empirical, often-illogical contrarianism (surely there are people out there who think Daric Barton is better than Adrian Gonzalez) but not hometown biases or systematic over- or underratedness. As a reminder, the formula for Star Power is (numbers taken at the All-Star Break):

[Star Power formula image]

and a player's "expected Vote Share" is his Star Power divided by the sum of all candidates' SPs in his category. So, for example, the xVoteShare for a non-outfield AL player (Player n) would be:

xVoteShare_n = SP_n / (SP_1 + SP_2 + ... + SP_N)

where the denominator sums the Star Power of all N candidates at Player n's position in his league.
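To make the mechanics concrete, here's a minimal Python sketch of that calculation; the players and Star Power values below are invented for illustration, not the real 2011 numbers.

```python
# A minimal sketch of the xVoteShare calculation described above.
# The Star Power (SP) values here are hypothetical, not real 2011 figures.

def expected_vote_shares(star_power):
    """A player's xVoteShare is his Star Power divided by the
    sum of Star Power across all candidates in his category."""
    total_sp = sum(star_power.values())
    return {player: sp / total_sp for player, sp in star_power.items()}

# Hypothetical AL first-base ballot:
al_first_base = {"Gonzalez": 9.5, "Cabrera": 8.5, "Teixeira": 7.0, "Barton": 1.0}
for player, share in expected_vote_shares(al_first_base).items():
    print(f"{player}: {share:.1%}")  # e.g. Gonzalez: 9.5 / 26.0 = 36.5%
```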

With this in our toolbox, we can compare the actual 2011 All-Star vote totals with the expected results to find the impact of each team's fans' biases on the balloting. To do this, I invented another tchotchke statistic called "BIAS In All-Star voting," or "BIAS" for short (the acronym is part of the name, like a guy looking at a picture of himself looking at a picture). Essentially, it's the number of votes a team's players received (using my estimates for players whose totals were not released) divided by the number of votes my Star Power-based model projected, then scaled like OPS+ so a 100 BIAS indicates a normal amount of homerism, while a 110 BIAS means the team's fans are 10 percent more biased than the league average. A 90 BIAS doesn't mean fans are unbiased, just that their biases are 10 percent weaker than average.
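Stated as code, the calculation might look something like this sketch; the vote totals are invented, and the "scaled like OPS+" step is read here as a simple index of each team's ratio against the league-wide ratio.

```python
# A rough sketch of BIAS: actual votes over projected votes, indexed so that
# 100 = league-average homerism. All vote totals below are made up.

def bias_scores(actual_votes, projected_votes):
    """Return each team's BIAS on an OPS+-style scale (100 = average)."""
    league_ratio = sum(actual_votes.values()) / sum(projected_votes.values())
    return {
        team: round(100 * (actual_votes[team] / projected_votes[team]) / league_ratio)
        for team in actual_votes
    }

# Hypothetical team totals, in millions of votes:
actual = {"Braves": 14.0, "Cubs": 6.0, "Red Sox": 18.0}
projected = {"Braves": 7.0, "Cubs": 10.0, "Red Sox": 19.0}
print(bias_scores(actual, projected))  # Braves well above 100, Cubs well below
```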

Is your curiosity sufficiently piqued? Here are the numbers for each fanbase's BIASes:

[Chart: BIAS in All-Star voting, by team; the Marlins, Padres, and Athletics bars appear in black]

Anything surprise you here? Red Sox fans, notorious for their fervor, are relatively objective with their All-Star ballots according to BIAS. Cubs faithful, famed for their loyalty, are among the least blindly supportive of their players. Even the Yankees—the kings of media domination around whom the TV world revolves—fall well short of having the most biased fanbase. Meanwhile, Braves and Mariners players got more than twice as many votes as they should have? Sure, the Phillies are a big-market club and the Giants and Rangers just made the World Series, but when did Brewers and Reds fans become so passionate?

If you look at some of the individual players and how many votes my framework projected for them, it starts to make some sense. Sure, J.D. Drew (1.86 million votes over his projection) and Carl Crawford (3.13 million over) got too many votes, but at the same time Adrian Gonzalez (4.02 million under his projection) and Jacoby Ellsbury (4.61 million under) didn't run away with starting spots like they were projected to. Cubbies fans didn't exactly stuff the ballot boxes for Starlin Castro (an estimated .78 million votes under) or Aramis Ramirez (2.11 million under)—neither was necessarily deserving of votes, but given Castro's popularity and the dearth of good NL third basemen this year it's safe to say fans underperformed with them—and in the Bronx, Alex Rodriguez (2.48 million under) wasn't supported like the clear best third baseman in the league that he was.

Meanwhile, Atlanta fans helped to give all eight Braves players more votes than they were projected to receive, including, most egregiously, Chipper Jones (2.00 million over) and Jason Heyward (1.44 million over). Ichiro (2.44 million over) was the only Top 8 finisher for the Mariners, but his overratedness combined with the rest of his teammates' poor play was enough to give Seattle fans a high BIAS. Casey McGehee (1.77 million over) and Yuniesky Betancourt (1.65 million over) can tell you all you need to know about Milwaukee fans' passions, while Jay Bruce (2.20 million over) and Paul Janish (1.11 million over) were the biggest beneficiaries of Cincinnatians' fervor.

Another interesting thing to note: the three teams whose bars appear in black above (the Marlins, Padres, and Athletics) have BIAS scores based solely on my replacement-level estimations because none of their players cracked the Top 8 vote-getters at their respective positions. No Fish or Friar or A even made the list for outfielders, where the Top 24 candidates are revealed. But that's not the saddest part—based on my meager 457,546-vote minimum estimate for bottom-finishing candidates, Oakland players still got 28 percent too many votes. It's a rough time to be an A's fan.

So where do these biases come from? Local demographics don't seem to have much to do with it. BIAS scores and the populations of teams' cities and surrounding metropolitan areas actually had very slight negative correlations (R²=.011 and .002, respectively)—right around the significance of alphabetical order (R²=.005 for team names, .010 for team locations). Interestingly (or maybe not), teams' BIAS scores and the vote percentages Barack Obama received in their states in 2008 had an inverse correlation with R²=.021.

It seems that partisanship comes not from external factors but from the teams themselves. The age of a franchise correlated with BIAS scores for an R² of .103. Payroll (R²=.088) appeared to have some small impact, while this year's attendance (R²=.120) didn't do as well as I would have thought. Preseason expectations might have something to do with it: The correlations between BIAS and 2010 wins (.080) and PECOTA's February projections (.082) were at least enough to be interesting.

Overall, though, the most important factor is clearly current team performance. BIAS and wins at the All-Star Break correlated at R=.466, for an R² of .217 (some of the above relationships are probably just functions of this). That's still not enough to explain the bias, though—there is a large intangible je ne sais quoi in play here that I cannot quantify (though given that I think most would put Boston and Chicago fans near the top of a gut-feeling list like this, I'm not sure anecdotal observations would really help much here).
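(For anyone who wants to replicate these figures, they're just Pearson correlations and their squares; a quick sketch with stand-in numbers is below. The BIAS and win totals are placeholders, not my actual dataset.)

```python
# Computing R and R^2 between BIAS scores and wins at the break.
# These five team values are placeholders, not the real data.
import numpy as np

bias = np.array([189, 57, 90, 112, 130])  # hypothetical BIAS scores
wins = np.array([54, 37, 50, 48, 49])     # hypothetical wins at the break

r = np.corrcoef(bias, wins)[0, 1]  # Pearson correlation coefficient
print(f"R = {r:.3f}, R^2 = {r ** 2:.3f}")
```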

It may be that, like UZR and BABIP, BIAS scores need more than one season's worth of data before they stabilize. This year's totals correlated with the results of the (much more simplistic) study I did last year at R=.370, so the leaderboards aren't a model of stability. Then again, if current-season performance is the biggest factor getting fans to the All-Star polls, perhaps the inconsistency is not a failing of the statistic but a reflection of changing real-world conditions. Interestingly, some of this year's surprises placed similarly last year, too—the Braves were near the top, while Red Sox (90) and Cubs (56) fans got almost exactly the same scores.

It's worth noting that some teams' scores are artificially inflated by having overrated, popular players on their rosters whom other teams' fans voted for too: the Mariners wouldn't be at the top if not for Ichiro, and Twins fans would be below-average without Joe Mauer. I considered dropping each team's biggest outlier in each direction to balance it out, but that seemed unfair to the teams that don't have a Mauer or an Ichiro, especially since much of those players' popularity is tied up in their teams—even outside of Minnesota, Mauer probably wouldn't have gotten as many votes if he had left as a free agent last winter.

One final thing to keep in mind: having a biased fanbase isn't necessarily good or bad. If you care about preserving the integrity of the All-Star Game, then no, you really shouldn't be voting for Derek Jeter or Jarrod Saltalamacchia. At the same time, I don't think this speaks well for the fans on the lower end—how depressed does a fanbase have to be not to get a single player on a Top 8 list? A little BIAS is healthy. Just take it in moderation.