Avoiding Fallacies in Baseball Analysis

Well-constructed articles should be free of poor logic. Here are a few common fallacies to keep an eye out for when reading baseball analysis (or anything else).

Arod has been the subject of much emotionally charged (and fallacious) writing.
The Star-Ledger-USA TODAY Sports

This is a meta-article, an article about reading articles. It’s important that we take time to think about how we consume analysis because, sometimes, we do it wrong (or at least badly). The baseball analysis you read in online articles at this and similar sites is persuasive writing. The author has a conclusion that he’s trying to convince the reader to believe (e.g., the importance of a statistic, the greatness of a player, or the folly of a free-agent signing). It is then up to the reader to evaluate whether the argument is persuasive.

One of the tools that has served me best when considering the merits of an article is the lens of fallacy identification. If a proposition is free of major fallacies and its premises are sound, it is worth pondering for significance. If an analysis contains poor logic, it’s not necessarily "wrong", but it is unproven and not trustworthy.

What fallacies should the reader be looking for? Here’s a cheat sheet that I’ve had blown up into a poster that hangs in my office. (By the way, you should all be regular readers of http://www.informationisbeautiful.net. The saber community should strive to do more work like this.) I’ll step through a few examples in a baseball context so you can "get the idea" of the kind of thing you should be looking for. Don’t focus on the details of the examples (many of the ideas are correct); they’re not the point. The point is the way the arguments are constructed and the reasoning tools they use. I’ll present fallacies in caricature and let you look out for living, breathing examples in the wild.

Multiple causes to a single effect

Two different fallacies occur when you assume an isolated cause-effect relationship: denying the antecedent and affirming the consequent. Both stem from there being more than one way to reach an outcome.

Denying the antecedent takes the form: If A, then B. Not A. Therefore, not B. That’s a little abstract, so here’s an example:

A strong bullpen is a recipe for postseason success. Team X has a poor bullpen. Therefore, they will surely fail in the postseason.

Why is this a fallacy? Well, there’s more than one way to succeed in the playoffs. Certainly, some roster constructions are more likely to succeed than others, but there is no single path to success (e.g., a great starting rotation or a lineup full of sluggers). Affirming the consequent is the mirror image; it takes the form: If A, then B. B. Therefore, A. The example:

Player X hit well for a whole season; he must have finally healed after that power-sapping hand injury.

Why is this a fallacy? Because it assumes there’s only one explanation for the player’s improvement. Instead, the player might have altered his approach at the plate, started a different training regimen, or just gotten lucky.
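
If you’d like to check these two forms mechanically, here is a minimal Python sketch (my own illustration, not something taken from the cheat sheet). It treats "If A, then B" as the ordinary material conditional and tries every true/false combination of A and B, looking for cases where all the premises hold but the conclusion fails. The helper names `implies` and `counterexamples` are made up for this example.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material conditional: 'if a, then b' is false only when a is true and b is false."""
    return (not a) or b


def counterexamples(premises, conclusion):
    """Return every (A, B) assignment where all premises hold but the conclusion fails."""
    return [
        (a, b)
        for a, b in product([True, False], repeat=2)
        if all(p(a, b) for p in premises) and not conclusion(a, b)
    ]


# Denying the antecedent: If A, then B. Not A. Therefore, not B.
denying_antecedent = counterexamples(
    premises=[implies, lambda a, b: not a],
    conclusion=lambda a, b: not b,
)

# Affirming the consequent: If A, then B. B. Therefore, A.
affirming_consequent = counterexamples(
    premises=[implies, lambda a, b: b],
    conclusion=lambda a, b: a,
)

# Modus ponens, a valid form, for comparison: If A, then B. A. Therefore, B.
modus_ponens = counterexamples(
    premises=[implies, lambda a, b: a],
    conclusion=lambda a, b: b,
)

print(denying_antecedent)    # [(False, True)]
print(affirming_consequent)  # [(False, True)]
print(modus_ponens)          # []
```

Both fallacious forms share the same counterexample, A false and B true: the outcome happened by some other route. In baseball terms, that is the team that wins in October without a strong bullpen, or the hitter who improves for a reason other than a healed hand. The valid form has no counterexample at all.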

Secundum quid (aka Sweeping Generalizations)

This fallacy occurs when a general rule is applied as an axiom, without regard to specific circumstances that might be an exception to the rule of thumb. We commit the secundum quid fallacy if we fail to recognize the qualifications and caveats under which these generalizations were derived. Here’s an example:

Pitchers are worse on short rest. Outstanding pitcher X should not start the upcoming playoff game.

Why is this a fallacy? Well... this general rule derives from Tango and Lichtman who, in "The Book", showed a 0.017 jump in wOBA for pitchers on three days’ rest (a 113-pitcher sample). The problem with the above example is not the general rule. The problem is that it fails to observe the caveats that accompany the general rule. The sample population is small, so the general rule is unable to control for the specific circumstances of the current situation. For example, the team could have an abnormal roster construction, be facing an unusual platoon advantage, or be in a lopsided (or otherwise distinct) playoff series. In this thread, Tango and Lichtman demonstrate how to avoid the secundum quid fallacy in the specifics of the 2013 playoff race as it applies to Clayton Kershaw.

This doesn’t mean we should disregard our rules of thumb. It means we need to mind the caveats that accompany a general rule and keep an eye on its uncertainty. We should build our expectation on the general rule and then modulate it based on the unique aspects of the situation at hand.

Here’s another one:

The "break even" success rate of stolen bases is 75%. Players with a career success rate lower than 75% should not attempt a steal.

This rule was developed with assumptions about the run environment, the predictive nature of the player’s career success rate, and the readiness of the defense. If the run environment of the day is much different, we must modulate the threshold. Similarly, if the player’s career caught-stealing rate is not predictive of the current situation (e.g., due to an injury, defensive alignment, or personnel), the rule of thumb must be adjusted to avoid the sweeping generalization fallacy.
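
To see why the threshold moves with the run environment, here is a small Python sketch of the break-even arithmetic. It assumes a simplified model in which a successful steal adds a fixed number of expected runs and a caught stealing costs a fixed number, so the break-even success rate p solves p × gain = (1 − p) × cost. The run values below are hypothetical numbers I picked for illustration, not measured values, and the function names are my own. This is not necessarily how the 75% figure was originally derived.

```python
def break_even_rate(run_gain_success: float, run_cost_caught: float) -> float:
    """Success rate p at which p * gain equals (1 - p) * cost."""
    if run_gain_success <= 0 or run_cost_caught <= 0:
        raise ValueError("run values must be positive")
    return run_cost_caught / (run_gain_success + run_cost_caught)


def should_attempt(expected_success: float,
                   run_gain_success: float,
                   run_cost_caught: float) -> bool:
    """True when the expected success rate in THIS situation beats the break-even rate."""
    return expected_success > break_even_rate(run_gain_success, run_cost_caught)


# HYPOTHETICAL run values, chosen only so the first case lands on 75%.
print(f"{break_even_rate(0.15, 0.45):.3f}")  # 0.750
# A different HYPOTHETICAL environment, where the out is relatively cheaper.
print(f"{break_even_rate(0.20, 0.40):.3f}")  # 0.667

# The same 70% runner is below the threshold in one environment and above it in the other.
print(should_attempt(0.70, 0.15, 0.45))  # False
print(should_attempt(0.70, 0.20, 0.40))  # True
```

Note that `expected_success` is meant to be the runner’s chance in the situation at hand, which may differ from his career rate for exactly the reasons listed above.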

Petitio Principii (aka "Begging the Question")

Begging the question is a form of circular reasoning where the conclusion and the premise share the same substance. That’s a little tricky to understand so we’ll start with a simplified form and then add wrinkles until it seems like something we might actually encounter.

Player X is trade bait because player X is likely to be traded.

The conclusion in this simple example, "player X is trade bait", is a restatement of the premise, "player X is likely to be traded". This is so simple that it’s a "non-statement". The problem is less obvious when a second fallacy (an appeal to common practice) is added:

Player X is trade bait because teams tend to trade players like player X.

The premise and conclusion are still the same in substance; the claim that teams tend to trade similar players is debatable and not specific to any team or player profile. Begging the question takes the form: A=B because B=A. These equivalencies may be true (or not); more importantly, they are not proved by the arguments presented. The fallacy can be harder to spot if it takes an alternate form: A=B because A=C and C=B. Here’s what that might look like:

Player X is trade bait because he’s a blocked prospect and blocked prospects tend to be traded.

Both claims, "he’s a blocked prospect" and "blocked prospects tend to be traded", are debatable; neither is self-evident (if they were, the fallacy would be avoided). The fallacy would also be avoided if both A=C and C=B were subsequently supported.

A common description of begging the question is that the fallacy leaves out key information. That can be seen in the above example: an explanation of why "he’s a blocked prospect" is omitted.

Let me know if these examples have equipped you to spot these fallacies as you read baseball analysis (or any other persuasive writing). If so, I’ll give you a few more.

. . .

Inspired by posts at Information is Beautiful. Read more at Wikipedia. Their articles on these subjects are well structured.

Jonathan Luman is a system engineer with a background in aerospace. You can follow him on Twitter at @lumanjonathan. You can contact him at jonathan.r.luman@gmail.com.