Hey folks, right now this is mainly a placeholder for a series preview post that I'll finish up after my little one goes to sleep tonight. I'd like to start a thread like this each week where we preview one series from the upcoming week using some sabermetric principles and a simple simulation tool I've been developing. This week I picked the Yanks-Sox series, but in coming weeks I think I'll let you, the reader, vote on the series you want to see previewed. Ok, enough introductions. To hold you over until I get the full preview up, I present a graphic courtesy of our resident graphic guru Justin Bopp.
The graph compares the lineups of the two teams using my home-brewed version of an updated CHONE projection. Sorry to have to cut this short right now, but feel free to use the comments as an open discussion thread for all things baseball, and I'll get back with the rest of the preview in a bit.
Ok, the rest of the article will describe what the simulation tool does, and then look at the inputs and outputs for the Yanks-Sox series. The first goal of the process is to take a team's lineup and compute its expected runs scored per game. To do this I calculate Runs Created from each hitter's wOBA (the values Justin graphed above) and scale it to a single game using plate appearances per lineup slot. You could also use a lineup simulator to get runs scored per game. The next part of the equation is runs allowed per game. Runs allowed has two components, pitching and defense. For defense I used my own defensive projections where they were available for the player in question. Where they weren't, I used CHONE, since most of the players missing from my data set are catchers or young players. The following two tables summarize both the offensive projections used in Justin's chart and the defensive projections used for the two teams:
and the Sox
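For anyone who wants to play along at home, here is a rough sketch of the offensive step described above (wOBA converted to runs, then scaled by plate appearances per lineup slot). This is not my actual tool. The league constants, the plate appearances per slot, and the function names are all placeholder assumptions, so swap in whatever values fit the season you care about.

```python
# Hypothetical league constants (assumed for illustration, not from the post).
LEAGUE_WOBA = 0.330          # league-average wOBA
WOBA_SCALE = 1.15            # converts wOBA points into runs per PA
LEAGUE_RUNS_PER_PA = 0.12    # league-average runs per plate appearance

# Assumed plate appearances per game for lineup slots 1 through 9.
PA_PER_SLOT = [4.8, 4.7, 4.6, 4.5, 4.4, 4.3, 4.2, 4.1, 4.0]


def runs_per_pa(woba):
    """Runs a hitter creates per PA: league rate plus wOBA-based runs above average."""
    return LEAGUE_RUNS_PER_PA + (woba - LEAGUE_WOBA) / WOBA_SCALE


def lineup_runs_per_game(wobas):
    """Expected runs per game for a nine-man lineup, given wOBAs in batting order."""
    if len(wobas) != len(PA_PER_SLOT):
        raise ValueError("expected one wOBA per lineup slot")
    return sum(runs_per_pa(w) * pa for w, pa in zip(wobas, PA_PER_SLOT))
```

Calling `lineup_runs_per_game` with nine projected wOBAs in batting order gives the team's runs scored per game. Note that this treats each hitter independently, which is exactly the simplification a full lineup simulator would avoid.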
For the pitching side I take my home-brewed version of an updated CHONE ERA and prorate it by innings pitched per game started for the starter, with the remainder of the game going to the bullpen (aggregated over the top 5 relievers). Once I have the game ERA I back that out to a "theoretical" total runs allowed without defense. After subtracting the defense I'm at actual runs allowed per game. The pitching values look like the following for this series:
and for the Sox
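The runs allowed calculation described above can be sketched roughly as follows. Treat it as one reading of the process under stated assumptions, not the exact tool: the ERA-to-total-runs conversion (`earned_share`), the 162-game spread of the defensive projection, and all of the names here are my placeholders.

```python
def game_era(starter_era, starter_ip_per_start, bullpen_era, game_innings=9.0):
    """Innings-weighted ERA for one game: starter's share plus the bullpen's share."""
    bullpen_ip = game_innings - starter_ip_per_start
    return (starter_era * starter_ip_per_start + bullpen_era * bullpen_ip) / game_innings


def runs_allowed_per_game(starter_era, starter_ip_per_start, bullpen_era,
                          defense_runs_saved_per_season,
                          earned_share=0.92, games_per_season=162):
    """Runs allowed per game after crediting the team's projected defense.

    earned_share is an assumed fraction of all runs that are earned; dividing
    ERA by it backs out total runs allowed behind a neutral defense.
    """
    era = game_era(starter_era, starter_ip_per_start, bullpen_era)
    total_runs = era / earned_share
    defense_per_game = defense_runs_saved_per_season / games_per_season
    return total_runs - defense_per_game
```

The bullpen ERA passed in would be the aggregate over the top five relievers, and a positive `defense_runs_saved_per_season` lowers runs allowed while a negative one raises it.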
Now I have runs scored and runs allowed for each team in a vacuum. That means I can run each through Pythagenpat and get theoretical winning percentages (again in a vacuum). To get actual game win percentages I plug the two teams' winning percentages into the log5 formula. Once I have those, a simple Monte Carlo simulation can provide a distribution of series results (as could plain old statistics using the binomial distribution, but I love me a Monte Carlo for some reason). The results for this series (given the above projections of true talent combined with HFA for the Red Sox) are:
So according to the above projections the Red Sox have a slight edge (some of which is due to the home field advantage adjustment), but it looks like a tight series in that either team has a decent chance of winning it. Next time I do one of these I'll go more in depth on how I'm doing my home-brew updating of projections (unless I do the smart thing and just use the updated ZiPS so I don't have to do the updating myself; I just had CHONE handy when I did the first one of these).
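If you want to try the win-probability steps described earlier (Pythagenpat, log5, then a Monte Carlo over the series), here is a minimal sketch. The 0.287 exponent constant is a commonly used Pythagenpat value rather than anything specific to my tool, the function names are made up for illustration, and the home field adjustment is left to the caller since it can be applied to the per-game probabilities in more than one way.

```python
import random
from collections import Counter


def pythagenpat(runs_scored, runs_allowed, k=0.287):
    """Winning percentage from per-game runs scored and allowed."""
    exponent = (runs_scored + runs_allowed) ** k
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def log5(p_a, p_b):
    """Probability that team A beats team B, given each team's winning percentage."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


def simulate_series(game_probs, n_sims=100_000, seed=0):
    """Monte Carlo distribution of team A's win total over a series.

    game_probs holds team A's win probability for each game, so the caller
    can vary it by pitching matchup or apply a home field adjustment first.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sims):
        wins = sum(rng.random() < p for p in game_probs)
        counts[wins] += 1
    return {w: counts[w] / n_sims for w in range(len(game_probs) + 1)}
```

For a three-game set you would compute one `log5` value per game from that day's matchup and pass the three values to `simulate_series`. The probability of winning the series is then the sum of the entries for two and three wins.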