Daily Box Score 7/21: Superior Experimental Design

You've got links, we've got a comments section. Put them together, and you've got a wonderful thing going. Won't you participate?

There's something deeply satisfying about a well-designed study. A solid methodology, implemented rigorously, nearly always leads to useful information. That was what struck me as I read Sky Andrecheck's work on whether hitters and pitchers work the count effectively.

He took the context-neutral data of what percentage of pitches were strikes, balls, foul balls, or balls in play. He then ran a simulation that provided a context-neutral control for three outcomes: walks, strikeouts, and balls in play. For example, his simulation predicted that in a full count, a ball in play would result only 36.6% of the time. He then compared that simulation control against the actual outcomes observed in games. Using the example from above, on full counts, a ball in play in fact was the result 42.9% of the time. What does this tell us? He concludes:

It's interesting to find inefficiencies in how pitchers and hitters handle the count, and it appears that the extreme counts of 3-2, 3-1, and 0-2 expose the most inefficiencies. Unfortunately, though we can see who gains an upper-hand due to knowledge of the count, it's hard to tell whether the reason is that the players are not adjusting properly due to a lack of strategy or simply have an inability to adjust.

I highly recommend the entire article.
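To make the idea of a context-neutral control concrete, here is a minimal sketch of how such a baseline could be computed. The per-pitch probabilities are made-up placeholders, not Andrecheck's data, and the names (`outcome_probs`, `P_BALL`, and so on) are my own. It also swaps his simulation for an exact recursion over counts, which gives the same kind of answer without random sampling.

```python
from functools import lru_cache

# Hypothetical per-pitch outcome probabilities (illustrative only, NOT the
# article's data). They are "context-neutral": the same on every count.
P_BALL, P_STRIKE, P_FOUL, P_IN_PLAY = 0.36, 0.28, 0.17, 0.19


@lru_cache(maxsize=None)
def outcome_probs(balls, strikes):
    """Return (walk, strikeout, ball_in_play) probabilities from a count."""
    if balls == 4:
        return (1.0, 0.0, 0.0)
    if strikes == 3:
        return (0.0, 1.0, 0.0)

    if strikes == 2:
        # A two-strike foul leaves the count unchanged, so renormalize over
        # the pitches that actually move the plate appearance forward.
        scale = 1.0 / (1.0 - P_FOUL)
        p_strike_added = P_STRIKE * scale
    else:
        # With fewer than two strikes, a foul simply counts as a strike.
        scale = 1.0
        p_strike_added = P_STRIKE + P_FOUL

    p_ball = P_BALL * scale
    p_in_play = P_IN_PLAY * scale

    after_ball = outcome_probs(balls + 1, strikes)
    after_strike = outcome_probs(balls, strikes + 1)

    walk = p_ball * after_ball[0] + p_strike_added * after_strike[0]
    strikeout = p_ball * after_ball[1] + p_strike_added * after_strike[1]
    in_play = p_in_play + p_ball * after_ball[2] + p_strike_added * after_strike[2]
    return (walk, strikeout, in_play)


walk, strikeout, in_play = outcome_probs(3, 2)
print(f"Full count: BB {walk:.3f}, K {strikeout:.3f}, in play {in_play:.3f}")
```

The study's comparison is between a baseline like `outcome_probs(3, 2)` and what actually happens on that count. On full counts the gap was 36.6% simulated versus 42.9% observed balls in play.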

Who would win: a team consisting of the members of the 3000 hit club with the fewest home runs (in the Retrosheet era) or a team of sub-.250 hitters who weren't in the Hall of Fame and had high OPSes? Kerry Whisnant is running the simulation, a genius bit of imagined baseball. Batting atop the high-average lineup are Wade Boggs, Rod Carew and Tony Gwynn. Batting first for the low-average team are Jimmy Wynn, Howard Johnson and Gene Tenace. While he won't release the winner until next time, he does offer a hint:

It is not a trick question, i.e., there was a clear winner, although the difference was not large – the better team projected to win about 82.3 games to 79.7 for the loser, if they played a full season’s-worth of games against each other.

Hit the link for the full lineups and each player's statistics. Who do you think won? I'd take the 3000-hit team by a hair.
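For a sense of how small that edge is, here is a quick back-of-the-envelope sketch. It takes the two projected win totals from the quote (they sum to 162) and assumes, purely for illustration, that every game is an independent coin flip with a constant win probability. That is my simplification, not Whisnant's model.

```python
import math

# Projected wins quoted above; together they make a 162-game season.
better_wins, loser_wins = 82.3, 79.7
games = better_wins + loser_wins

# Implied per-game win probability for the better team.
win_prob = better_wins / games

# Binomial standard deviation of a single season's win total, assuming
# independent games with a constant win probability (a simplification).
season_sd = math.sqrt(games * win_prob * (1 - win_prob))

edge = better_wins - loser_wins
print(f"Per-game win probability: {win_prob:.3f}")
print(f"Projected edge: {edge:.1f} wins; one-season SD: {season_sd:.1f} wins")
```

Under that assumption the projected gap is well under half of one season's random spread, which fits the "not large" description.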

I make no secret of my love of Russell Branyan, so I've been personally hurt by those who have been saying that his Cinderella season is coming to an end (.180/.288/.426 in 73 July PA). Though I was dejected, Dave Cameron was there to pick me up and set me right! The culprit? That old rascal, regression to the mean:

His walks and strikeouts are barely up and his power is very slightly down. Over 70 plate appearances, we’re talking about basically no difference at all. And, the extra strikeouts are actually just due to some coin flip calls by the home plate ump – his contact rate (69% in July) is higher than it was April-June (67%). There’s literally nothing to worry about here – Branyan’s slump is just normal BABIP variation. He got some good bounces in April and May and he’s got some bad bounces in July. He’s the exact same player he was, and reacting to the results will simply lead to making a bad assumption about what’s going on.

Phew. I feel so much better now. 

Of course, there are some things that are stable and predictable. Like strikeout rates for pitchers. They're a sort of elementary particle of baseball analysis, aren't they? Indivisible, component statistics that we can use to build up more complex statistics? Wrong. Over at DRaysBay, FanPoster matthan has split the K% atom. Using pitch outcomes, he has figured out a way to predict strikeout rates with an impressive r-squared of .914.

Here is the formula. I rounded it to make it a bit easier:


The numbers hold up well even for pitchers with only 30 IP. On the Rays, JP Howell is king.

Sabermetrics is really quite advanced. It's been around for decades and has evolved a tremendous amount over the years. So much so, in fact, it may require a college course to understand it all. If you're a student at Rice University (go Owls), you have a chance to take just such a class: SABR 101. You can find the reading list here, which is useful even to those who can't enroll. (Tango provides a better version of his Derek Jeter article here.) The class even has Larry Dierker as a guest speaker and a class-wide fantasy league. I'm jealous!

We've had some animated discussion of Social Anxiety Disorder here recently, so you might be interested to read this article describing the disease as it affects baseball players. One doctor, however, argues that placing players like Zack Greinke and Khalil Greene on the disabled list is exactly the wrong thing to do:

Lans and Ferraro agreed that a relationship of trust with a trained psychiatrist or psychologist is crucial for proper diagnosis and treatment. And that anti-anxiety drugs are, at best, a last resort. "A professional psychologist does one thing and one thing only," Ferraro said, "and that is give them time to talk. Listen. On a regular basis. Let them ventilate."

To Lans, putting a player on the disabled list "makes no sense at all. In the general public, when you see a very depressed person, you don't put him in the hospital; that makes it worse, because then he'll think he's crazy. At all costs, the guy should be kept at work."

Incidence of the disorder has certainly spiked recently, and the cause of the spike is as yet unknown.

How does the UZR sausage get made? How about Plus/Minus? Tango summarizes an interview with John Dewan and team conducted at Sons of Sam Horn comparing the two stats. An excerpt:

On offense I believe we’re measuring 80-90 percent of the true ability of players. On defense, I believe we’re at about the 60 percent level. But we’re still at the tip of the iceberg in terms of precision and a ton more can be done, especially defensively. As new forms of data become available, we’ll be able to enhance our defensive systems. One example: BIS has now developed a batted ball timer, which we believe will greatly improve the accuracy of our system.

Sounds exciting. I look forward to the new frontier.

Some records are best left unbroken. Take, for example, the ignominious milestone Jamie Moyer approaches:

The seventh hit he allows Wednesday night (or from this point forward) will be the 4000th hit allowed of his career. He'll be the 39th pitcher in baseball history to reach that mark and the first since Tom Glavine in 2006. Behind Moyer, there's no one even close to 4000.

You certainly have to log a lot of innings to get that one.

The Washington Nationals are 26-66, which puts them 40 games below .500 for the first time this season. But they're not the fastest to dig a 40-game hole:

Question: what is the [earliest] a team has fallen so far under .500? 

Answer: a tie between the 1876 Cincinnati Reds (not to be confused with the current franchise - this one collapsed in 1880), and the 1884 Kansas City Cowboys of the ill-fated Union Association (which was only nominally a major league anyway). Both squads began the year 7-47.

That's a .130 winning percentage. Woof.
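If you want to check the arithmetic, a couple of one-liners will do it. The helper names below (`games_below_500`, `winning_pct`) are my own, and the records are the ones quoted above.

```python
def games_below_500(wins, losses):
    """How many more losses than wins a team has."""
    return losses - wins


def winning_pct(wins, losses):
    """Wins divided by decisions (ties ignored)."""
    return wins / (wins + losses)


records = {
    "2009 Nationals": (26, 66),
    "7-47 start (1876 Reds, 1884 Cowboys)": (7, 47),
}

for team, (wins, losses) in records.items():
    below = games_below_500(wins, losses)
    pct = winning_pct(wins, losses)
    print(f"{team}: {below} games below .500, {pct:.3f} winning percentage")
```

Both records come out to 40 games under, but the 7-47 clubs got there in 54 games to the Nationals' 92.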

Finally, for some comic relief, I offer up a guy who takes Wiffle Ball extremely seriously:

"No," he replies somberly. "It was a war, man. It was a war."

The league commish even keeps detailed stats:

But in the end nothing is as essential to the league as the statistics Gallaway carefully tabulates each week -- everything from batting average to ERA to slugging percentage. This is the thread that holds everything together, the lure that keeps players coming back season after season regardless of ability, even as jobs and time restrictions make it harder and harder to come out for Sunday morning Wiffle ball. Everybody loves to see their name on a list of statistics, even if the numbers say they aren't very good.

The league ace has a 0.58 ERA, but he is not content to stop there:

He has been working on a screwball to go with his slider and knuckleball, he says. A new pitch that he described as "filthy," one he is sure will be ready by the start of the fall season.

"I will be flat-out unhittable," the most competitive player in the Potomac Wiffleball League says most assuredly.