Chicago Sun-Times columnist Rick Morrissey recently interviewed new White Sox pitcher Jeff Samardzija for a column in which Samardzija made some interesting statements about the use of data in baseball and his apparent dislike of it:
"Sabermetrics, nyeh. Sounds like a lot of hot air," Samardzija said, smiling. "I think there are definitely positive aspects to it. I think there is some information you can take from it that’s important. But ultimately from a player’s point of view, you want a coach that can relate to you. Can help you with adjustments mid-game."
In the year of our Lord 2015, that's an interesting statement to make. It's difficult to argue that more data hasn't led to a better understanding of the facets of the game in which changes can yield positive results. Samardzija continued:
"I think preparation with numbers and stats and all that’s great, but when the bullets are flying, you need a guy that knows your personality, can relate to you and get you to change or fix what’s going wrong. If you don’t respect the guy that’s telling you that information, you’re not going to listen to him."
This begins to draw the distinction between data and implementation. I'll use an extreme example to make my point -- in his new role with the White Sox, Samardzija will have the privilege of facing Miguel Cabrera and the Tigers 18 times a year. Samardzija has a tendency to get lit up when he pitches in the middle of the strike zone and has more luck when he pitches higher or lower in the strike zone, whereas Cabrera absolutely murders pitches in the zone, and is pretty dangerous on those in the lower part of the zone. Is it important to have this knowledge when facing Cabrera?
There's an argument to be made that no data is necessary to prepare for Cabrera -- he's been one of the best hitters in baseball since he entered the league, and his first 12 years rank among the best starts in baseball history. Does Samardzija really need to clutter his head with ephemera when pitching to him, or is he better off doing his best to give him absolutely nothing to hit and live to see another hitter? Samardzija's statement shows he understands the distinction between preparation and execution, but it's the preparation that helps establish ways to pitch to Cabrera that won't end with Samardzija getting whiplash snapping his head around watching a ball leave US Cellular or Comerica Park.
In every facet of life, preparation occurs before implementation and is practiced so that it becomes second nature. In my former life as a pharma salesperson, I was sent to our headquarters in North Carolina for three weeks of training on our products. I spent that time learning every facet of my products -- dosing, side effects, price, third-party reimbursement, personality profiles of physicians and how to best interact with them, market shares -- and then spent countless hours honing my 30-second to one-minute presentations. When I got back to my territory, if I had paralyzed myself with this type of preparation before every interaction with a physician, I never would have said a word.
That's how data should work in baseball, and Samardzija touches on it when he states the importance of working with a coach with whom he can communicate. One of the most common misunderstandings about the data boom in all sports is that it involves arcane math requiring a Stephen Hawking-like mind to decipher. I won't claim this is never the case, but using PITCHf/x as an example, at its core it's the gathering and measuring of data. This in no way diminishes the work Dan Brooks and Harry Pavlidis do, but the gathering and processing of data is subordinate to the effective use of the insights it uncovers. Samardzija shouldn't take the mound with the express intent of pitching to Zone 25 if the count goes to 3-2 in a tie game so much as understand the value of keeping the ball away from dangerous hitters. It's safe to say he already knows this.
"So much of the game happens so fast that you’ve got to trust yourself and your instincts and trust what you remember before from facing guys," he said. "You go off that. I think a lot of money is wasted in Sabermetrics, in producing information and hiring people to produce information. If it’s not being taken from the paper and processed by the player, it might as well just be a waste."
The second part of this quote is almost totally contradicted by the first. Trust and instincts can be developed through the use of data, and that data can be presented in whatever manner is most conducive to a given player or coach. This is another lesson applicable to every stage of life -- communication only occurs if all parties are on the same page and understand the concepts and the manner in which they're presented.
Morrissey added a quote from White Sox manager Robin Ventura:
"You understand how pitchers are pitching you, you understand where teams are playing you, you understand where other guys are going to hit the ball," Sox manager Robin Ventura said. "It gives you an extra step. I think your anticipation is a little bit better of what might be happening before it happens."
If used correctly, this is what data accomplishes -- it identifies areas in which an advantage can be exploited. A pitcher shouldn't take the mound formulating a plan so much as executing one, having already internalized his approach to the hitters he'll face. He doesn't need to know the guts of the algorithms or the wOBA weights in order to know that throwing a fat pitch down the middle of the plate rarely results in a positive outcome.
Analysis is a tool, and tools are used at the discretion of the user and not the other way around. In all the times I've held a hammer in my hand, I've never heard it tell me how to use it -- that's my job. Data is simply another tool that shows the results of lots and lots of prior occasions, and to not make use of past outcomes to plan for the future is to not use all available tools.
The battle over the use of data ended the first time in baseball antiquity someone had the bright idea to measure outcomes. The New Bill James Historical Baseball Abstract (p. 17) states:
Statistics made available by the National League in 1879 were games played, at bats, hits, runs, batting average, average runs per game, times reached first base (which apparently included reaching by an error or forceout), on-base percentage, putouts, assists, errors, total chances, fielding average, passed balls for catchers, batters facing pitcher, runs allowed, average runs allowed (per game), hits allowed, opposition batting average, walks, average walks per game, and wild pitches.
All that has occurred since then is increased measurement and refinement of these numbers. The best analysis downplays the numbers, sometimes doesn't even refer to them, and instead focuses on what the numbers illustrate and works to factor the analysis into a game plan. Samardzija already incorporates reams of data every day, and always has throughout his careers in two different sports -- he probably just doesn't realize it.
* * *
Scott Lindholm lives in Davenport, IA. Follow him on Twitter @ScottLindholm.