For those who missed it, as part of our saber-education series, I interviewed Mark Simon of ESPN this past week on his perspective on making sabermetrics and writing / reporting mesh. You can find the first part here, where we discuss his general approach to incorporating statistics into articles. In this section, we delve into more specific areas of discussion.
Steve Slow: Are there any particular sabermetric concepts that you want to impart to new readers? Or in other words, what's the most important concept you'd like new readers to grasp?
Mark Simon: It's more about the idea of being open-minded to the information than a specific stat - encouraging people to learn what a stat is rather than to be intimidated by its name.
There was a time when I was convinced that wOBA was too complicated for me. But then I did the reading on what exactly it was - it's an on-base percentage in which home runs, triples, doubles, singles and walks each have a different worth, though not the values they have in slugging percentage. That's not complicated. It's just different.
A lot of the time people make a prejudgment that what's unfamiliar is automatically labeled complicated. How do you strip them of that belief? You start by explaining it to them in a manner using examples to which they can relate.
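(Editor's aside: for readers who like to see an idea in code, here is a minimal sketch of the "on-base percentage with different worths" description above. The weights are round illustrative numbers chosen for this example, not official values; published wOBA weights are recalculated each season, and the real formula uses a slightly different denominator than raw plate appearances. The function and variable names are made up for this sketch.)

```python
# Illustrative weights only (an assumption for this sketch, not official values).
# Note they are not the 1/2/3/4 values that slugging percentage gives to hits.
ILLUSTRATIVE_WEIGHTS = {
    "walk": 0.70,
    "single": 0.90,
    "double": 1.25,
    "triple": 1.60,
    "home_run": 2.00,
}


def simple_woba(events, plate_appearances, weights=ILLUSTRATIVE_WEIGHTS):
    """Return a simplified wOBA-style rate: weighted times on base per plate appearance.

    `events` maps an event name (e.g. "single") to how many times it happened.
    """
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    weighted_total = sum(weights[name] * count for name, count in events.items())
    return weighted_total / plate_appearances


# A made-up hitter's season line, purely for demonstration.
hitter = {"walk": 60, "single": 100, "double": 30, "triple": 5, "home_run": 20}
print(f"{simple_woba(hitter, 600):.4f}")
```

Swap in a different weights table and the same few lines still work, which is the point: the stat is a weighted average, nothing more exotic than that.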
SS: How receptive are your co-workers to new statistics and concepts? Have they ever asked you about the statistics in your articles, and if so, how do you introduce them to saber-stuff?
MS: There is a much greater open-mindedness to this kind of material than there was a few years ago because of the recognition of its value in the sporting world. We've made it much more a part of our coverage in Stats and Information (my department), so everyone in our department is receptive at different levels.
We've done a lot of internal education too. I've given talks on things like The Fielding Bible's Plus-Minus system and on Win Probability Added, and I usually start them with "Did you see that play last night? We can use that as an example..." Since we all know the top stories each day, that makes it easier.
And we now have a unit of several employees specifically devoted to developing statistics and putting existing "Next-Level" statistics to greater use - our Analytics team. You'll probably be hearing and reading more from them in the near future with regard to all sports.
Among the on-air Baseball Tonight talent, I've talked to all of them about these things at some point. Some like it and some don't. There's a diverse group of opinions there, as you can imagine (we had a pretty good split of opinions on whether Sabathia or Hernandez should win the Cy Young). And that's understandable. Players are going to look at the game differently.
I can tell you that one of my favorite moments this season was when I had a talk with one of our player analysts about the article "Jeter vs Everett" (written 5 years or so ago) and he said "I thought it was the most fair analysis of Jeter's defense I've read."
SS: In your experience, what's the toughest concept / stat for someone to grasp?
MS: This is a two-pronged answer. This will sound a little silly, but when I try to explain a stat to my colleagues, the ones I have trouble getting across are the ones with odd-sounding names. Bill Simmons wrote about this in the spring. I actually find that a lot of people are distracted if the stat has a weird name like VORP or a lengthy acronym.
On an individual note, the one I had the hardest time "selling" to our on-air folks on Baseball Tonight is xFIP. We used it a few times, but I think it's hard to convince players (especially position players) that a pitcher has only a limited amount of control over his hits and runs allowed. I've tried to go real basic with it too ("What is it? It's me and you looking at two pitchers' strikeout, walk, and fly ball rate numbers and us discussing which guy has the better combo") with a little success, but not to the same level as some others. But we continue to work on it.
SS: I definitely agree on both points - I think it says something that despite VORP falling out of fashion in the saber-community, it's still the number one target whenever someone takes sabermetrics to task. It sounds weird, making it an easy target.
You bring up a great point about xFIP: pitching statistics and DIP theory are really tough things for many people to grasp. Have you found that FIP is any easier for people to digest?
MS: I actually have put the emphasis more on xFIP because every article I've read prefers xFIP ...and as a funny aside, I put out a call on Twitter one day that said something like "We've done sabermetrics 4-5 days in a row on Baseball Tonight. What should we do today to keep the streak alive?"
I got 6 or 7 replies back and xFIP was mentioned the most, if I remember right.
I think I'll take another shot at it this year. I really think you could just put up a graphic that has each pitcher's K, BB and HR allowed (or fly ball rate) and just ask "Who would you rather have?" Then say, ok, here's their ERA-equivalents...we call it FIP (or xFIP) and people might get it.
I should mention too that we used Adjusted ERA+ a few times and internally, people seemed to pick up that one quickly.
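(Editor's aside: the "who would you rather have?" graphic maps onto a short calculation. Below is a minimal sketch using the commonly published form of FIP, where home runs, walks plus hit batters, and strikeouts are weighted 13, 3 and -2 per inning pitched, and a league constant puts the result on the ERA scale. The constant of 3.10 and the 10% league home-run-per-fly-ball rate are placeholder assumptions for this example; both change from season to season. The function names and both stat lines are invented for illustration.)

```python
def fip(home_runs, walks, hit_batters, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching on an ERA-like scale.

    `constant` is a placeholder; the real value is set per league and season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    raw = 13 * home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return raw / innings + constant


def xfip(fly_balls, walks, hit_batters, strikeouts, innings,
         league_hr_per_fb=0.10, constant=3.10):
    """xFIP: same as FIP, but actual home runs are replaced by the number
    expected from the pitcher's fly balls at a league-average HR/FB rate.

    `league_hr_per_fb` is a placeholder assumption, not an official figure.
    """
    expected_home_runs = fly_balls * league_hr_per_fb
    return fip(expected_home_runs, walks, hit_batters, strikeouts, innings, constant)


# Two made-up stat lines: (HR, BB, HBP, K, IP). Who would you rather have?
pitcher_a = fip(home_runs=20, walks=50, hit_batters=5, strikeouts=200, innings=200)
pitcher_b = fip(home_runs=25, walks=70, hit_batters=5, strikeouts=140, innings=200)
print(f"Pitcher A FIP: {pitcher_a:.2f}")
print(f"Pitcher B FIP: {pitcher_b:.2f}")
```

Nothing about hits or runs allowed appears in the inputs, which is exactly the part that is hard to sell: the number only looks at the outcomes the pitcher most directly controls.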
SS: Also, there's so much disagreement in the saber-community about how to best evaluate a pitcher, I feel like it's tough to convey that uncertainty to a general audience. People like their hard and fast answers: Player A is better than Player B because this one stat says so. Do you think this uncertainty hurts sabermetrics from a public-relations perspective? How do you convey uncertainty and debate, yet not turn people off towards the statistics you're using?
MS: It can.
As an aside, and this isn't sabermetrics...sometimes we struggle with uncertainty as it relates to pitch type ...Pitch F/X says it was this type of pitch... a "video scout" from a stat service says another ... and then the player says something else. That sort of uncertainty is one we deal with a lot internally and we're trying to figure out how to best deal with it.
Sabermetrically, we've brought it up as it relates to something like small sample sizes ...we write "suggested copy" for our hosts with our graphics ...and I'll throw whatever caveats are needed in ..."Keep in mind it's a small sample" is a common last sentence.
To an extent, as the writer/researcher etc, you're the "watchdog" and you have a responsibility to use the stats responsibly ...so if you put something out there that says Joe Blanton is better than Cliff Lee, you better be making it clear...ok, but we're just talking about for this 6-week stretch, and the larger samples tell us he's not.
SS: Can journalism and education go together, or are the two diametrically opposed? Since reading more baseball writing, I feel that some writers shy away from advanced statistics because they don't want to impose upon their readers; their role as a journalist is to report and inform, but not necessarily to venture into new ground and educate. Personally, I believe that the role of a writer is to inform and educate others, but I also never went to journalism school. If I expect local newspaper writers to educate the populace about their local team (for example, like explaining why certain players may be over- or under-valued), am I expecting too much?
MS: In this age, they can absolutely go together. When you provide information, you're educating them. If there's a new development in baseball statistical analysis that helps explain something better or more clearly, I think the journalist should bring it to the attention of his readership/viewership and challenge them to grasp it. How are we going to grow and develop otherwise?
This qualm was raised in the comments on Section One by JinAZ:
The issue I run into with him is that so much of what he brings to that podcast is the sort of small sample size splits trivia that drives me nuts. You know: "so-and-so hit better this year with runners in scoring position at home during day games than night games!" Maybe it's just the audience, because I know people like that stuff. But that kind of thing is almost always not predictive. And yet it is often presented as if it could be predictive - or, at least, people take it as if it is predictive. I would prefer that those kinds of splits never be mentioned, because they are flat out misleading most of the time. Or if they are mentioned, they be accompanied by a statement that they have no predictive value.
If we're talking about fan education in this column, ultimately, I think most non-saber fans with critical thinking skills can tell that those splits probably have no value. And from that, they may think that sabermetrics is a load of crap.
MS: Hi JinAZ ... I'm glad you brought both points up. First of all, the Stats and Info blog is now free. We originally started under the Insider label, but we are now available publicly for free. So I hope you'll become a regular reader.
Secondly, and I've talked about this a bit with my colleagues and exchanged e-mails on the subject with other sabermetric folks ... there's a difference between something being of predictive value and something simply being a "cool note."
I LOVE "cool notes" and I try to distinguish between those and the ones that are meant to have predictive value. If the Mets are 13-0 in home games on a given night (as they were this season), that has no predictive value whatsoever. But it's the kind of thing that if you're listening, you say "Wow...that's cool!" or makes you laugh.
I do think there is a segment of the audience that likes that sort of stuff, and there's the entertainment factor that comes with it...and I TRY to always caution when the sample size is small. I try to appeal to all types, and hopefully everyone takes something out of what I bring.