It's so easy to take things for granted. Our families, spouses, and children come immediately to mind, but so does the assumption that the wealth of baseball data available was not only always there but was archived with no effort. Think about it for any length of time and the obvious conclusion is that someone had to do the heavy lifting to make that information available. When those brief moments of inspiration and lucidity occur, it's nice to acknowledge those people.
Retrosheet is the spine upon which most modern baseball databases are built: the core box scores and play-by-play descriptions that are the basis of modern measures. More information on their history can be found here, and my point isn't so much to proselytize for the work they do (even though I should) as to acknowledge and thank them for doing it. Without the compendium of information they've compiled, I (and many others) couldn't do the type of analysis I do.
Some time last year I stumbled across a section I had never seen before titled "Umpires". In addition to games and position, I saw something I'd always been interested in but had a hard time finding, let alone quantifying and analyzing: data on ejections and the reasons for them. I've written about this in the past, and as I was updating data from the 2014 season I clicked on the page that listed the data credits:
The newest section of data now available on the website relates to umpires.
I can't define "new" in this context, but it made me feel better at least to think that this data hasn't been sitting around since the mid-1990s just waiting for me to find it.
One of the other things I've discovered in the past year is using Tableau data visualizations, and I use them not to junk up a post but because a picture really can say a thousand words, if used correctly. I use charts and graphs to illuminate and as shorthand -- in my former life as a pharma sales rep, I could discuss a clinical trial with physicians but saw the light go on if I had a chart I could reference that summarized the pertinent data. This is the data viz of all recorded ejections in baseball history, with plenty of explanation to follow:
There are seven tabs, and the first four follow the same format. The Umpires tab shows the number of ejections made by individual umpires along with the number of games umpired. I arbitrarily set the filter to 1990, but it can be moved to show as few or as many years as desired. There is ejection data going back to 1889, and it's intriguing to see how the numbers move over time.
Using Bob Davidson as an example: between 1990 and 2014 he umpired 2,634 games and ejected 124 people, a rate of just under five ejections per 100 games umpired. This is not Davidson's entire career, since he reached the major leagues in 1982 and has umpired a total of 3,681 regular season games (plus 39 postseason and 3 All-Star games, though the data viz shows only regular season games) and has ejected 161 people. He can be compared with other contemporary umpires on games umpired, number of ejections, and percentage of games in which they threw people out.
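For anyone who wants to check the arithmetic, here is a minimal sketch using only the Davidson figures quoted above. The function name `ejection_rate` is my own invention for illustration, not something from Retrosheet or the viz. Note that it computes ejections per game umpired, which will slightly overstate the share of games with an ejection whenever more than one person is tossed in the same game.

```python
def ejection_rate(ejections: int, games: int) -> float:
    """Return ejections per game umpired, expressed as a percentage.

    Hypothetical helper for illustration only. This is ejections divided
    by games, not the share of distinct games that had an ejection.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    return 100.0 * ejections / games


# Figures quoted in the text for Bob Davidson (regular season only).
davidson_1990_2014 = ejection_rate(124, 2634)
davidson_career = ejection_rate(161, 3681)

print(f"1990-2014: {davidson_1990_2014:.1f}%")  # 4.7%
print(f"Career:    {davidson_career:.1f}%")     # 4.4%
```

The same calculation applies to any umpire or manager in the viz once the ejection and game totals for the selected years are known.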
The next three tabs show the same information for Managers, Players, and Coaches, with the small difference being that total games are not shown for players and coaches since their relative paucity of ejections makes that percentage meaningless. Even so, there's some very interesting information lurking in there as the data is filtered and massaged. In fact, have some fun now and try to think of the player with the most ejections since around 1950 (no fair if you saw me tweet this earlier this week) and check the viz to see who it was. It was not the first name that came to my mind, but it certainly made sense.
The next tab, Ejection Reason, shows the reasons why people were ejected and also includes a filter to select the number of years viewed. The primary reason for ejections has remained fairly constant throughout baseball history, although it's probably not the reason itself as much as the language the ejectee used to plead his case to the umpire. Ejection Pct shows the distribution of ejections between players, managers, and coaches, and I'll admit I was very surprised by the results.
I added the last tab, Umpires by Year, as I was writing this. It shows how many ejections each umpire makes in a given year. The filter is intentionally different and shows only one year at a time, to illustrate that it isn't unusual for an umpire to go an entire year (or in some cases several years) without throwing anyone out. The common perception is that modern umpires are overly concerned with showmanship and too quick to toss someone, and while there are certainly individual circumstances where this might be true, on the whole this doesn't appear to be the case.
Ejection information isn't easy to find. For example, on August 14th, 2014, Bob Davidson ejected Padres manager Bud Black for arguing a replay call. Neither the Baseball Reference box score nor the FanGraphs box score mentions the ejection, and even my guy Daren Willman's wonderful Baseball Savant lists the replay (#947 on this page; scroll way down) without mentioning that Black was ejected. I mention this not to slight any of these sites but to show just how difficult it is to find ejection information. With any luck I've created a resource that allows for deeper investigation of ejection trends, and for that I thank the kind folks at Retrosheet who unearthed and disseminated this information. For people like me who are fascinated with baseball history, it's an eye-opening experience to find information never readily available before.
All information from Retrosheet. Any errors in compiling or processing the data are the author's.
Scott Lindholm lives in Davenport, IA. Follow him on Twitter @ScottLindholm.