The climax of the college baseball season begins this weekend as 64 teams compete in the first round of the NCAA Division I College World Series. Because college baseball is fun, and because the best part of any big tournament is the chaotic first weekend with wall-to-wall games, I'm writing this primer to get you, the sabermetrically inclined pro baseball fan, off your computer and in front of your TV for this weekend's action.
How does the tournament work?
Like the NCAA basketball tournament, the College World Series starts off with 64 teams and winnows them down to a final champion. That's just about where the similarities end, though. The basketball bracket is simple: a big single-elimination tournament; win six times and you're the champion. But a single-elimination baseball tournament would be lunacy, so there are three stages to the baseball tournament:
- Regionals: Groups of four teams play double-elimination tournaments at 16 sites across the country. The tournaments are hosted by the top seeds (usually at their campus stadium). The tournaments start Friday at noon, and the last games are held Monday evening. After this round, 16 teams remain.
- Super Regionals: The 16 remaining teams are paired off, with the higher seed hosting a best-of-three series. If two number one seeds meet in a super regional, the "national seed" (one of the top eight teams as designated by the selection committee) gets to host. The eight teams left after this round meet in the...
- College World Series: Eight teams travel to Omaha (it's always in Omaha) to compete for the championship. The teams are split into two groups of four, each of which plays another double-elimination tournament. The two winners of those tournaments meet in a final best-of-three series to decide the championship.
If it sounds confusing, this bracket might help make things clearer.
Doesn't the basketball tournament start with, like, 68 teams now?
Oh, stop. Those first four games don't count. Come on, focus!
How do I follow the tournament?
All of the games are available on the WatchESPN app, with a few making it onto ESPN2. ESPNU will be showing a "Bases Loaded" platform that, like the NFL Red Zone channel, will bounce around to the most interesting action. If you can't watch the games, the NCAA has a pretty good live scoreboard. And D1baseball.com (probably my favorite college baseball site) has detailed information about all the participants.
Do the underdogs ever win?
Absolutely! College baseball is dominated by the SEC and other regional powerhouses, but that doesn't mean small schools don't have a chance. Fresno State won the 2008 championship as a 4-seed (the last seed in its regional, equivalent to a 13-seed or worse in the basketball tournament). The year before that, Oregon State* won its second consecutive title. And Cal State Fullerton has as many trips to Omaha as LSU (16) and as many titles as Miami (four).
*Oregon State isn't exactly a small school by any stretch. But say "baseball powerhouse" and most people probably won't think "Oregon."
It helps, of course, if you know what your team's probability of winning is. That's why I made this win expectancy chart based on the observed win percentages from the 2014 season for every base/out state, lead, and inning combination.
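For the curious, a chart like this boils down to counting. The sketch below is a minimal Python illustration, not the code behind the chart: it assumes each play-by-play event has already been reduced to a game state and a flag for whether the batting team went on to win. The function name `win_expectancy`, the state layout (inning, half, outs, bases, lead), and the toy events are all assumptions made for the example.

```python
from collections import defaultdict

def win_expectancy(events):
    """Observed win percentage for each game state.

    events: iterable of (state, batting_team_won) pairs, where state is any
    hashable description of the situation, e.g. (inning, half, outs, bases, lead).
    """
    wins = defaultdict(int)
    seen = defaultdict(int)
    for state, won in events:
        seen[state] += 1
        if won:
            wins[state] += 1
    # Fraction of times the batting team eventually won from each state
    return {state: wins[state] / seen[state] for state in seen}

# Toy events (made up for illustration, not real 2014 data)
tied_9th = (9, "bottom", 2, "1--", 0)    # two outs, runner on first, tie game
down_3_1st = (1, "top", 0, "---", -3)    # bases empty, trailing by three
events = [
    (tied_9th, True),
    (tied_9th, False),
    (tied_9th, True),
    (down_3_1st, False),
]

for state, pct in win_expectancy(events).items():
    print(state, round(pct, 3))
```

With real data, sparsely observed states would need some smoothing or pooling before the percentages mean much.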
Isn't scoring way down since they got rid of the crazy aluminum bats?
Yes, in the sense that there aren't any more 21-14 games. But the NCAA actually made a change to their game balls before this season, flattening the ball's seams to make them more like MLB baseballs. A recent study by Washington State's Sports Science Laboratory showed that flat-seamed baseballs traveled significantly farther than the raised-seam balls the NCAA had been using when launched at the same velocity and angle.
The NCAA was quick to trumpet the apparent power surge that followed. Despite a cold February, home runs per game jumped from 0.33 in the first month of 2014 to 0.47 in February 2015. Earlier this season, we confirmed that offense was up, as measured by runs per game, home run rate, and ISO. But we found no evidence that pitchers were having a harder time controlling the new ball, or that comebacks were more prevalent than in years past, contrary to what others claimed.
If there's so much offense, why do teams bunt and steal all the time?
I know, I know. You're used to the pros, where teams steal at an optimal rate and bunting is verboten. But just because Manny Machado knows what to do when the batter lays one down doesn't mean bunting is stupid across all space and time. If you're at a lower level, where the fielders aren't as skilled, a bunt could be a good call. There's a better chance the infielders throw the ball away or someone forgets to cover a base, and so you end up advancing the runners without actually giving up an out.
We actually looked at stolen base success rates and run expectancies following a bunt earlier this spring. And we did find that teams were caught stealing far too often: with no outs, the break-even success rate for stealing second base was around 70 percent, but the actual success rate in 2014 was just above 60 percent.
But we also found that maybe bunting in college baseball isn't a crime against humanity. It's true that the number of expected runs in an inning doesn't improve after a bunt, but the probability of scoring at least one run does go up, at least in the most common cases. And the percentage of times a hitter reached base on a bunt (think BABIP, but with errors included) was much higher in college than in MLB for every base/out state.
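If you want to see where a break-even stolen base rate like the one above comes from, it falls out of three run expectancy values: the one you have if the runner stays put, the one after a successful steal, and the one after a caught stealing. Here is a rough Python sketch. The function name `break_even_rate` is my own, and the run expectancies plugged in at the bottom are placeholders chosen for illustration, not the values from the 2014 college data.

```python
def break_even_rate(re_stay, re_success, re_caught):
    """Success rate at which a steal attempt neither adds nor costs expected runs.

    Solves p * re_success + (1 - p) * re_caught = re_stay for p.
    """
    gain = re_success - re_stay   # runs added by a successful steal
    loss = re_stay - re_caught    # runs lost when the runner is thrown out
    return loss / (gain + loss)

# Placeholder run expectancies (assumed for illustration only):
runner_on_first_no_outs = 0.95
runner_on_second_no_outs = 1.20
bases_empty_one_out = 0.35

rate = break_even_rate(
    runner_on_first_no_outs,
    runner_on_second_no_outs,
    bases_empty_one_out,
)
print(f"Break-even success rate: {rate:.1%}")
```

The same comparison works for bunts if you swap in the states before and after the bunt, although for bunts the probability of scoring at least one run is often the more relevant yardstick than expected runs.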
Your analysis is lame. Where can I get the data you used to do my own?
Lucky for you, I've put all the data I have (from 2012-2015) on a GitHub page. You can do whatever you want with it, but I don't make any claims about the accuracy of every individual event. You should also check out Chris Long's GitHub; his database includes a number of other collegiate levels (NJCAA and NAIA, for example), and some good sample code like team power rankings and catcher framing statistics.
What if I want to watch only draft prospects?
Sure, you can do that too. But since I don't know much about prospects, I asked for help from prospect aficionado, Cal League junkie, and BtBS alum Jen Mac Ramos, who will be scouting the Lake Elsinore regional. The combination of a Cal League stadium and metal bats seems like a recipe for lots of offense, but Ramos is still most excited for young arms like UC Santa Barbara sophomores Dillon Tate and Justin Jacome.
"I'm going to be focusing on the pitcher's arsenal mostly, their command, and their velocity," Ramos said. "I'm looking for someone who can throw a fastball with life and decent control, can develop a third or fourth pitch in the low levels of the minors."
But don't get carried away by that lumbering junior who went 3-for-4 with two homers in the first game. Prospect status is relatively stable: a good weekend in March can do a lot for an NBA prospect's draft status, but Ramos couldn't remember an equivalent on the diamond.
So who's going to win?
This is a surprisingly difficult question to answer! And not just because baseball is random enough to make a double-elimination tournament a total crapshoot, or because the bracket is a bit convoluted. It's not that there's anything special about this year's tournament, or that I'm copping out, either. There are no bracket contests like there are for the basketball tournament; no computer companies are sponsoring big machine learning efforts to predict the winners; Barack Obama isn't going on SportsCenter to put Illinois through to Omaha.
So if you want expert predictions, I'm sorry to say there really aren't any. Feel free to make your own, of course, using the GitHub databases or Ken Massey's rankings comparison page. I've also made a spreadsheet to calculate win probabilities for a regional from individual game WPs. And hey, since it's so hard to find other people's predictions, congratulations: You're probably already in the top 25 nationally.
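If spreadsheets aren't your thing, the same regional calculation can be sketched in a few lines of Python. This is not a port of the spreadsheet, just one way to do it, and it assumes the usual four-team format: 1-seed vs. 4-seed and 2-seed vs. 3-seed to open, then a losers' game, a winners' game, another elimination game, and a final in which the unbeaten team has to lose twice. The function name `regional_win_probs` and the toy team strengths are assumptions for the example.

```python
from collections import defaultdict

def regional_win_probs(p):
    """Chance each team wins a four-team double-elimination regional.

    p[a][b] is the probability that team a beats team b in one game,
    with teams indexed 0-3 by seed (0 is the 1-seed).
    """
    title = defaultdict(float)

    def game(a, b):
        # Both possible outcomes of a vs. b: (winner, loser, probability)
        yield a, b, p[a][b]
        yield b, a, 1.0 - p[a][b]

    for w1, l1, q1 in game(0, 3):                    # Game 1: 1-seed vs. 4-seed
        for w2, l2, q2 in game(1, 2):                # Game 2: 2-seed vs. 3-seed
            for w3, _, q3 in game(l1, l2):           # Game 3: losers' game
                for w4, l4, q4 in game(w1, w2):      # Game 4: winners' game
                    for w5, _, q5 in game(w3, l4):   # Game 5: elimination game
                        path = q1 * q2 * q3 * q4 * q5
                        # Final: unbeaten w4 needs one win in up to two tries
                        one_game = p[w4][w5]
                        unbeaten_wins = one_game + (1.0 - one_game) * one_game
                        title[w4] += path * unbeaten_wins
                        title[w5] += path * (1.0 - unbeaten_wins)
    return dict(title)

# Toy single-game probabilities built from made-up team strengths
strengths = [4.0, 3.0, 2.0, 1.0]
p = [[a / (a + b) for b in strengths] for a in strengths]

for seed, prob in sorted(regional_win_probs(p).items()):
    print(f"{seed + 1}-seed: {prob:.3f}")
```

Treating every game between two teams as having the same win probability ignores pitching matchups and tired bullpens, which is probably the biggest simplification here.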
. . .
Statistics are courtesy of the NCAA and are freely available in MySQL-compatible format through Bryan's GitHub page. Special thanks to Christopher D. Long and Meredith Wills for their web scraping code, and Jeff Wiser for his feedback.
Bryan Cole is a featured writer for Beyond the Box Score who will be at the Baton Rouge regional in spirit this weekend. You can follow him on Twitter at @Doctor_Bryan.