Hype Fails to Live up to Stephen Strasburg
Last night, I had the privilege of attending what was, without a doubt, the most important game in the young history of the Washington Nationals' franchise. Before I get into the greatness that was Stephen Strasburg's major league (or perhaps more like AAAA) debut, let me paint a picture for those who were unable to witness history in person.
For the first time since the Nats opened their new park, the rafters were packed with fans--not Red Sox fans, not Cubs fans, not Phillies fans, not Mets fans, but Nationals fans. For the first time I can remember since the inaugural season at RFK Stadium, Nats fans rose to their feet on two-strike counts. For the first time since Livan Hernandez won the team's first home opener on April 14, 2005, Nationals fans such as myself experienced the unparalleled joy of exceeded expectations.
This is exactly what the national pastime is supposed to feel like in the nation's capital.
No doubt you've already read/heard about the former San Diego State star's performance, but let's put it into perspective (the following according to Baseball-Reference.com's Game Index):
- Strasburg's debut tied Max Scherzer's improbable 5.2 IP outing for best strikeout performance of the 2010 season. Both Tim Lincecum and Ubaldo Jimenez topped out at 13 Ks earlier this season, while eight others have whiffed 12 in a single game.
- The Nats phenom broke John Patterson's single-game franchise strikeout record. Patterson twice struck out a baker's dozen, once in 2005 and again in 2006.
- Strasburg's performance against the Pittsburgh Pirates was easily the strongest, K-wise, for a pitcher facing the Bucs in 2010. Jonathan Sanchez and Yovani Gallardo struck out 11 and 10 Pirates back in April, respectively.
- Only two pitchers in history have started their careers with more than 14 strikeouts: J.R. Richard and Karl Spooner (15 each). They were both allowed to pitch complete games, and Spooner threw 143 pitches. However, both of those pitchers allowed 3 walks, while Strasburg allowed none. No pitcher had ever recorded 14 Ks and allowed 0 walks in a debut, until now.
The kid's line:
| IP | Pit | Bll | Str | H | R | ER | BB | K | HR | WHIP | ERA | LI | WPA | WPA/LI |
| 7 | 94 | 29 | 65 | 4 | 2 | 2 | 0 | 14 | 1 | 0.57 | 2.57 | 0.91 | 0.171 | 0.188 |
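For anyone who wants to check the rate stats in that line, here is a minimal sketch that recomputes them from the counting stats. The helper name `innings_from_ip` is my own, and it assumes the usual box-score convention where the digit after the decimal point counts outs (so "5.2" means five and two-thirds innings).

```python
def innings_from_ip(ip):
    """Convert box-score IP notation ("5.2" = 5 innings + 2 outs) to innings."""
    whole, _, outs = ip.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3

# Counting stats copied from the line above.
innings = innings_from_ip("7")
pitches, balls, strikes = 94, 29, 65
hits, walks, earned_runs, strikeouts = 4, 0, 2, 14

whip = (hits + walks) / innings        # walks plus hits per inning pitched
era = 9 * earned_runs / innings        # earned runs per nine innings
strike_pct = strikes / pitches         # share of pitches that were strikes
k_per_9 = 9 * strikeouts / innings     # strikeouts per nine innings

print(f"WHIP {whip:.2f}  ERA {era:.2f}  Strike% {strike_pct:.1%}  K/9 {k_per_9:.1f}")
```

The WHIP and ERA come out to the 0.57 and 2.57 shown in the table; the K/9 figure is not in the table and is included only for scale.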
Of course, if you're reading this blog, you're probably interested in detail that you can't see in a box score, or even necessarily a record book. As always, we can satisfy our curiosity by digging deeper, using PITCHf/x data to get a better idea of what Strasburg's performance "looked" like. Oh sure, I can tell you that it was a sight to behold, or I can show you exactly how #37 took 15 mph off his fastball to make a hitter look silly (data from Brooks Baseball's Pitch FX Tool).
The chart above plots Strasburg's pitch velocity as he progressed through the evening (Lowess curve superimposed upon the data points), separating pitches by pitch type as identified by MLB's algorithm.* As you can see, the phenom consistently kept his four-seamer's velocity around 98 mph, topping out at 100.1. He frequently complemented his near-triple-digit heater with an 82-83 mph curve that carried enormous break. On top of that, he occasionally proffered a changeup that left his hand at speeds higher than many pitchers' four-seamers.

*Note: Several commenters and analysts (such as Tim Kurkjian) have noted that Strasburg throws both a four-seamer and a two-seamer (or what Strasburg calls a 'one-seamer'). This makes sense considering the break on his fastballs. However, MLBAM doesn't yet have enough data (I assume) to separately classify these two pitches, so they both came through as four-seamers. I'm going to rely on MLBAM's estimation for now, since that's where the data came from, but feel free to read everything that is labeled "four-seamer" as just plain "fastball."
But for all the break on his curveball, Strasburg's four-seamer maintains a phenomenal amount of movement as well. See the chart and table below:
| Pitch | # | Mean Velo (mph) | Mean H-Break (in) | Mean V-Break (in) | Ball % | Strike % | Whiff % | In Play % | FF% | CU% | CH% |
| All | 94 | 92.7 | 7.4 | 6.9 | 30.9% | 69.1% | 17.0% | 10.6% | 63.8% | 26.6% | 9.6% |
| FF | 60 | 97.5 | 7.2 | 7.2 | 33.3% | 66.7% | 13.3% | 11.7% | | | |
| CU | 25 | 82.2 | 7.4 | 8.1 | 28.0% | 72.0% | 20.0% | 8.0% | | | |
| CH | 9 | 90.2 | 8.3 | 0.8 | 22.2% | 77.8% | 33.3% | 11.1% | | | |
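The "All" row is just the per-pitch rows rolled up, which is easy to verify. A quick sketch, using only the counts and mean velocities from the table (the variable names are mine, and because the per-pitch means are already rounded the overall mean is approximate):

```python
# (count, mean velocity in mph) per pitch type, copied from the table above.
pitch_types = {"FF": (60, 97.5), "CU": (25, 82.2), "CH": (9, 90.2)}

total = sum(count for count, _ in pitch_types.values())

# Usage share of each pitch type.
usage = {name: count / total for name, (count, _) in pitch_types.items()}

# Count-weighted mean velocity across all pitches.
mean_velo = sum(count * velo for count, velo in pitch_types.values()) / total

for name, share in usage.items():
    print(f"{name}: {share:.1%}")
print(f"All: {total} pitches, {mean_velo:.1f} mph")
```

The weighting matters: a simple average of the three mean velocities would land near 90 mph, while the count-weighted figure reflects how fastball-heavy the outing was.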
As you can see, Strasburg's fastball has almost as much negative break as his curveball has positive break, and it darts as much to the first base side as the curveball does to the third. Over the course of the evening, the rookie relied heavily on the four-seamer to start off an at-bat, and then began to mix in the curve to keep hitters off balance. Strasburg's pitch selection vs. the count (rows are strikes, columns are balls):
| | Curveball | | | | | Four-Seamer | | | |
| s\b | 0 | 1 | 2 | 3 | s\b | 0 | 1 | 2 | 3 |
| 0 | 16.7% | 12.5% | 0.0% | 0.0% | 0 | 79.2% | 62.5% | 100.0% | 100.0% |
| 1 | 46.7% | 37.5% | 0.0% | 0.0% | 1 | 46.7% | 37.5% | 100.0% | 100.0% |
| 2 | 40.0% | 21.4% | 75.0% | 0.0% | 2 | 60.0% | 57.1% | 25.0% | 100.0% |
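A table like this can be built from a pitch-by-pitch log by tallying pitch types within each ball-strike count. The sketch below shows one way to do it; the `pitch_log` rows are made up for illustration and are not Strasburg's actual sequence, and `selection_by_count` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Hypothetical pitch log: (balls, strikes, pitch_type). Made-up rows for
# illustration only, not the real pitch sequence from the game.
pitch_log = [
    (0, 0, "FF"), (0, 1, "CU"), (0, 2, "FF"),
    (0, 0, "FF"), (1, 0, "FF"), (1, 1, "CU"), (1, 2, "CH"),
    (0, 0, "CU"), (0, 1, "FF"), (0, 2, "CU"),
]

def selection_by_count(log):
    """Return {(balls, strikes): {pitch_type: share}} for each count seen."""
    tallies = defaultdict(Counter)
    for balls, strikes, pitch_type in log:
        tallies[(balls, strikes)][pitch_type] += 1
    return {
        ball_strike: {p: n / sum(c.values()) for p, n in c.items()}
        for ball_strike, c in tallies.items()
    }

shares = selection_by_count(pitch_log)
for (balls, strikes), mix in sorted(shares.items()):
    row = ", ".join(f"{p} {s:.1%}" for p, s in sorted(mix.items()))
    print(f"{balls}-{strikes}: {row}")
```

Within any one count the shares across pitch types sum to 100%, which is why the curveball and four-seamer cells in the table fall short of 100% wherever a changeup was thrown.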
Both pitches were effective last night, with high strike and whiff rates. Of course, this might have something to do with the Bucs' lackluster offense. The chart below shows that the Pirates hitters were swinging at just about everything. Hard to say so far if this is because Strasburg is so good or because the 2010 Pirates aren't exactly slamming the ball around. As usual, it's probably a bit of both.
Lackluster opposition aside, Strasburg demonstrated a significant amount of control for an inexperienced pitcher. The location chart below indicates where the Nats' rookie hurler was locating his pitches. The chart on the left represents only fastballs, while the chart on the right represents his curveball and changeup. As you can see, he managed to keep the fastball low and near the corner, while he rarely left a breaking ball up in the zone:
There's also a notable difference between how Strasburg approached lefties and righties. When it came to left-handed hitting, Strasburg kept the ball on the outside corner or outside altogether. Against righties, he simply pounded the strike zone with fiendish accuracy. Note the distribution of pitches below (left-handed batters on the left):
This strategy worked particularly well against right-handed hitting, as Strasburg relied heavily on his curve breaking over the plate. The lefties had similar trouble against the fastball on the outside corner.
| vs. L | # | Mean Velo (mph) | Mean H-Break (in) | Mean V-Break (in) | Ball % | Strike % | Whiff % | In Play % | FF% | CU% | CH% |
| All | 48 | 93.4 | 7.4 | 6.2 | 35.4% | 64.6% | 14.6% | 10.4% | 68.8% | 20.8% | 10.4% |
| FF | 33 | 97.3 | 7.3 | 6.6 | 39.4% | 60.6% | 12.1% | 6.1% | | | |
| CU | 10 | 82.2 | 7.0 | 7.9 | 30.0% | 70.0% | 20.0% | 20.0% | | | |
| CH | 5 | 90.0 | 8.5 | 0.5 | 20.0% | 80.0% | 40.0% | 20.0% | | | |
| vs. R | # | Mean Velo (mph) | Mean H-Break (in) | Mean V-Break (in) | Ball % | Strike % | Whiff % | In Play % | FF% | CU% | CH% |
| All | 46 | 92.0 | 7.3 | 7.5 | 26.1% | 73.9% | 19.6% | 10.9% | 58.7% | 32.6% | 8.7% |
| FF | 27 | 97.7 | 7.1 | 8.1 | 25.9% | 74.1% | 14.8% | 18.5% | | | |
| CU | 15 | 82.2 | 7.6 | 8.2 | 26.7% | 73.3% | 20.0% | 0.0% | | | |
| CH | 4 | 90.4 | 8.0 | 1.1 | 25.0% | 75.0% | 50.0% | 0.0% | | | |
Finally, let's exit the realm of generalizability for a moment and look at one particular pitch: the 21st of Stephen Strasburg's career. In the grand scheme of things, it didn't mean much. It was a ball that missed rather badly inside and did not draw a swing. But it was a fastball, and it broke over 8" upward and nearly 8.5" towards the plate...
And it left his hand at 100.1 MPH.
That 21st pitch of his evening and career was a snapshot of everything that this kid is capable of: overwhelming speed with mind-blowing movement. It will be interesting to see how Strasburg fares against a more patient team. He might force more deliberate batters to swing in pitchers' counts by employing the break of his heater and his curve. But the young man also showed a tendency to go to his breaking stuff a bit too often against weak bats such as Delwyn Young's and Jeff Karstens'.
If Pudge Rodriguez can keep him from turning to the off-speed stuff too often, and if the newest star in the District sky can manage to maintain his break and velocity, then Stephen Strasburg should pull the Nationals into the ranks of respectability, likely making history along the way.
Comments
My breakdown
I did a very similar piece last night and thought it might serve as an interesting companion since there are a couple of different pieces of content. Go check it out after reading J-Doug’s: http://sabometrics.com/?p=711
My oh my.
J-Doug, this is some SERIOUSLY high quality analysis.
I really like it and it’s incredible to come here and read it.
Well thanks, that's rather high praise, and I appreciate it.
Blogger and Editor, Rational Pastime Blog
One small error
In your pitching line image he should have 65 strikes instead of 55. Just a heads up.
Thanks
BTW, what tools do you use to put together the charts on your site? I just use Excel, Photoshop and the rather anemic graphing software in Stata 10.
Blogger and Editor, Rational Pastime Blog
Stata 9's graphs were bad.
I hope Stata 10’s are better.
I really quite miss working with the rest of Stata though. Can’t truly appreciate Stata until people start asking you to do that level of regression stuff in Excel and it’s like, “What? Excuse me? No thanks.”
R?
I haven’t really used R very much (more of a Matlab person) but I’ve heard that it has very nice graphical capabilities. Not sure if that’s worth learning a whole new program for though.
It’s also free, which is nice.
R is incredible
I believe they use it on Baseball Analysts. A bit more than I can learn right now though.
Blogger and Editor, Rational Pastime Blog
R is freaking hard to deal with
I just use Excel (or Open Office now). I do the hardcore algorithms in R, but I’m not yet good enough to do graphing like Dave Allen and co.
I looked at R in some depth about 2 years ago.
It’s really powerful, but it’s not easy to just pick up and run with.
As a result, we picked up a license of Matlab which has just amazing help functions and is so easy to jump into and use.
There are also some nice community toolboxes available (Econometrics toolbox) for free.
I'm lucky enough to be an academic
So everywhere I go I have access to these tools for free. Others aren’t so lucky, and really should learn R, since it’s free.
Blogger and Editor, Rational Pastime Blog
Have you ever used XL Stat?
I had a lot of success with that, but I didn’t really want to spend 500 to get the license renewed after the free trial.
I’d say for now some combination of Excel and R is all I really need to do the stuff I do. I agree that learning R is pretty useful.
There's an open source analogous program
http://www.gnu.org/software/octave/index.html
I’ve never used it so I can’t vouch for it
by stevesommer05 on Jun 9, 2010 11:02 PM EDT
Yeah, and don't bother using Excel's built-in regression calculations
Its multivariate regression estimates are usually just plain screwed up beyond recognition.
Blogger and Editor, Rational Pastime Blog
Excel isn't a good calculation engine in general.
So I rely on Matlab. Our firm has been moving towards Matlab as the back-end to an Excel front end with a C#.
That way, we can write special functions for Excel (regression functions, iterative estimations, etc) and still put it into Excel for the average business person to enter data, recalculate, and print.
Hmmm, what about the data analysis toolpak for Excel?
I use that for most of my regressions (R is pretty easy to use though, so I should just stick with that).
For a lot of simple regression analysis...
…I stick with gretl. You can also install the Rcmdr package for R to get a convenient GUI front end for R, which helps for discoverability (it’s not even necessarily that R is hard to use, it’s that R is such a blank slate that it’s hard to figure out what is possible at times).
You can also
get RExcel which takes Rcmdr and lets you basically control R from Excel. Pushes data to and from, run regressions etc.
by stevesommer05 on Jun 9, 2010 11:04 PM EDT
If you're going to use Excel
The data analysis extension is great. I just warn everyone against using the basic, off the shelf regression tools in general. People don’t know just how bad Excel is when it comes to this stuff, so I like to get the word out as much as possible. Really, they should just stop including those basic commands. They’ve been broken since Excel 97, if not earlier.
Blogger and Editor, Rational Pastime Blog
I didn't even know Excel had built in regression tools
The only thing they appear to have is the trend line stuff, and I can’t imagine they would screw up something as simple as that.
There are built in formulas for calculating simple things like coefficients, p values, standard error, R, yadda yadda. The coefficients it calculates in a multivariate analysis haven’t been accurate since they introduced the formulas.
A more in-depth analysis of their failures is available in McCullough and Heiser 2008, “On the accuracy of statistical procedures in Microsoft Excel 2007” in Computational Statistics & Data Analysis. Here’s the Google Scholar link if you’re somewhere you can access it: http://linkinghub.elsevier.com/retrieve/pii/S0167947308001606
Blogger and Editor, Rational Pastime Blog
I was curious to see how Strasburg's fastball compared with Ubaldo's.
So I looked at last Monday’s shutout of the Giants. Perhaps somebody else could do a better analysis, but it looks to me like Strasburg has slightly higher average velocity, but Ubaldo gets a little bit more horizontal movement. Their vertical movement is pretty similar. Ubaldo’s pitches had more variation — whether that’s good or bad I don’t know.
I'll be doing a Ubaldo piece in a week or so...
…so I can throw in a Strasburg comparo then.
Blogger and Editor, Rational Pastime Blog
What about the two-seam fastball?
It looked like there were two distinct movements to his fastballs, although both sat around the same speed. Did Strasburg break MLB’s pitch recognition algorithm or is it truly just a four-seam fastball?
Pitch recognition algorithm
The algorithm is tailored to each pitcher and it needs some data to train itself on before it can really be trusted. By his next start it will probably have two- and four-seam fastball classifications.
MLBAM didn't pick up his "one-seamer"
Which is what Strasburg calls it. I’m going to update the post to clarify this.
Blogger and Editor, Rational Pastime Blog
I’m relatively new to pitch f/x, but I watched him pitch a few innings last night and a lot of the fastballs looked more like the two-seam variety, which would be unique since those velocities are typical of four-seamers. The pitches I saw had a noticeable horizontal break, which says two-seamer to me. Four-seamers are usually pretty straight; not that there isn’t any movement (if I am understanding movement as the x and y distance from the ball’s straight-line projected path), but there isn’t really any break. Could be that I just saw innings where he threw a lot of two-seamers or what looked like two-seamers.
MLBAM didn't differentiate between his fastballs
I’ll make a note of this.
Blogger and Editor, Rational Pastime Blog
Two Seamers
The Hardball Times article on Strasburg does break down the pitches into two and four-seam fastballs.
http://www.hardballtimes.com/main/blog_article/holy-mother-of-strasburg-with-pitch-f-x/
Some of them seem very borderline to me but I was looking in the same place when trying to determine which were which. I thought the cluster of 8-10 pitches with some separation from the four-seam cluster was probably the two-seam cluster but Nick included a few more.
Yeah, nice job by Nick. (Everyone should feel free to keep the good Strasburg links coming, too.)
He plotted H vs. V break, but I think he also used speed to classify the two-seamers (they were a bit slower.) I’d love to see the same graph, but color-coded by velocity. That would help us pick them out.
And then view the location/swings graphs with the two fastballs separated. I wonder if he used them differently.
Thanks, Sky
I took a look at the h-break and v-break scatter and wasn’t confident going ahead and classifying his two-seamer on my own. I’ll probably come up with a more consistent system in future analyses of Strasburg.
Blogger and Editor, Rational Pastime Blog
Yeah, he did.
I did my own classification of Strasburg’s pitches – I think Nick and I came to the same results:
http://www.baseballprospectus.com/article.php?articleid=11128
If you look at the breakouts J-Doug did for speed and break versus LH and RH hitters, you’ll see the LH hitters saw fastballs with more sink and less speed. That’s because he threw a lot more two-seamers to LH than RH.
Yeah I thought it was strange that he threw more two-seamers to lefties as that's supposed to be a platoon pitch
But I’m pretty sure he’s throwing two different fastballs. The cluster differences are small, but actually very distinct. I ran a K-Means to be sure and it gives me the same results. And Strasburg said in an interview that he throws a “one-seamer” (which moves exactly like a two-seamer but whatever).
There are a handful of pitches where...
…you could call them either two or four seamers. (Or you could be the autoclassifier and just call them all four-seamers.) But there are definitely two distinct pitches there, and as you note Strasburg uses two different grips to throw them.
I'm guessing...
that the auto-classifier will start auto-classifying more correctly for his next start. Or maybe hoping is the more correct term.
Agreed...
I don’t know if MLBAM’s algorithm changes from start to start. It should when it comes to pitchers with hardly any data. Anyone know the answer to this one?
Blogger and Editor, Rational Pastime Blog
I think it all depends when the customized neural nets are rolled out for that pitcher
But they need a bunch of data in order to back-propagate and train the neural net, so it may take a lot more data to get that to work.
by Dan Turkenkopf on Jun 9, 2010 1:54 PM EDT
Could you elaborate on what types of K-Means and testing you use?
Viva: I was wondering what dissimilarity measures you use, which variables you tend to focus on, if you rely on a particular type of method to determine initial group centers, etc…
Blogger and Editor, Rational Pastime Blog
I used k-means as well - one of the ones with R, I forget which.
I gave him four centers because I knew he had four pitches. I think that’s a more robust method than estimating centers based upon silhouettes/elbowing/etc., if you have that sort of information available to you.
My input variables were just start_speed, pfx_x and pfx_z. One thing you can do with k-means is “whiten” the variables (divide by the SD) before you run the clustering, so that everything is expressed in the same units. I didn’t do that this time – Strasburg’s pitches were pretty distinct for the most part, with the exception of a few borderline pitches from the two-seam/four-seam clusters.
It makes sense if you think about what k-means is doing.
Essentially you’re trying to minimize the squared Euclidean distance between each point and the center of its cluster – when I tell k-means to use four centers, it finds the four centers that minimize that n-dimensional distance summed over all the pitches.
So if you have variables with vastly differing scales, it means that some variables are more or less important when it comes to figuring distance between points. If you standardize the variables, all of them are then equally considered when figuring distance.
(Now, it may be that some variables are more important to determining pitch type than others, but there’s no guarantee that those variables are the ones that end up seeming more important when you don’t standardize.)
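To make the standardize-then-cluster idea concrete, here is a rough sketch in plain Python. It is not the R call used above: it is a bare-bones version of Lloyd's algorithm, the pitch values are made up for illustration (the columns only mimic start_speed, pfx_x and pfx_z), and the starting centers are hand-picked, one pitch from each presumed type, so the run is repeatable. A real analysis would want several random restarts.

```python
import statistics

def standardize(rows):
    """Divide each column by its standard deviation ("whitening")."""
    sds = [statistics.pstdev(col) or 1.0 for col in zip(*rows)]
    return [tuple(v / sd for v, sd in zip(row, sds)) for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, seeds, max_iter=100):
    """Plain Lloyd's algorithm: assign each point to its nearest center,
    move each center to the mean of its members, repeat until stable."""
    centers = list(seeds)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Made-up (start_speed, pfx_x, pfx_z) rows, three per presumed pitch type.
pitches = [
    (98.5, -5.0, 10.0), (98.0, -5.5, 9.5), (99.0, -4.5, 10.5),   # four-seam
    (96.5, -9.5, 7.0), (97.0, -10.0, 6.5), (96.0, -9.0, 7.5),    # two-seam
    (82.0, 7.0, -8.0), (83.0, 7.5, -8.5), (81.5, 6.5, -7.5),     # curve
    (90.0, -8.5, 1.0), (90.5, -8.0, 0.5), (89.5, -9.0, 1.5),     # changeup
]

scaled = standardize(pitches)
seeds = [scaled[i] for i in (0, 3, 6, 9)]  # hand-picked starting centers
labels, centers = kmeans(scaled, seeds)
print(labels)
```

Skipping the `standardize` step leaves the distances in raw units, so whichever column has the widest spread carries the most weight when pitches are assigned to clusters.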
Mmm!
That makes perfect sense. It seems almost like a mandatory step in getting any useful info from k-means.
Otherwise, the off-axis motion components are measured in mostly single-digit inches, and the velocity is from 70 to 90 mph… so it will be far more important.
And you plotted spin angle vs. velocity, nice.
I do remember hearing folks (you? Mike? Harry?) say that was the easiest/best way to tell pitches apart.
Spin deflection and velocity work most of the time
But for some pitches and some pitchers I find that you need to group based on the vx0 (initial horizontal velocity) and vz0 (initial vertical velocity). For example, Mariano’s cutter and four-seamer are indistinguishable when you graph spin vs. velocity or h-break vs. v-break, but they’re very distinct when it comes to initial drop and break velocities.
Blogger and Editor, Rational Pastime Blog
Which is why
I’d imagine Mo is able to keep both pitches so effective.
Blogger and Editor, Rational Pastime Blog
vx0 is not a good way to tell Mo's cutter and other fastball apart
Because he’s so good at hitting the edges and avoiding the middle of the plate, vx0 (the initial horizontal velocity) will tell you which side of the plate the pitch is headed toward. However, he uses both pitch types to both sides of the plate.
You really have to look at one or more of the spin-related parameters. Graph spin (axis or horizontal deflection) vs. game/date or graph it on a game-by-game basis if necessary and you should see two distinct groups.
If you are going back that far in the data, old Yankee Stadium had a major issue with its PITCHf/x camera setup in the first half of 2008 that will obliterate any ability to look at that data together with the rest of the data in one lump.
Winner, Beyond the Box Score 32 Predictions Contest, 2009
I'd feel worse for you if your Pens didn't take down my Caps last year.
Blogger and Editor, Rational Pastime Blog
Haha, well I guess we can’t have it all.
Such a shame we both got screwed this year by Montreal. That team rode a streak of luck like no one’s business for two series and then collapsed.
Pittsburgh sports all the way
Seriously...
I was pleased when the Habs took down your Pens after they took down the Caps—made me feel a lot less embarrassed. But now I’m stuck with the Flyers, who I’d normally root against, and the Hawks. Unfortunately, the family of one of my best friends has been talking crap about the Caps all year, and they’re from Chicago, so I can’t root for them either.
That said, I’ve always been a hockey expat. I adopted the Nats when I moved down here and Ovechkin started making waves, but I grew up a Rangers and (tear) Whalers fan. GO WHALE!
Blogger and Editor, Rational Pastime Blog
I can appreciate the love for the Whalers. That’s sort of your “I’m a real hockey fan” card. Are you into advanced hockey stats at all?
Pittsburgh sports all the way
It's becoming more of an "I'm from Connecticut and it's the one thing that makes me unique from Boston or New York" card
And I might be more susceptible to that. I’m guilty of leaving hockey after the Whale departed until Ovechkin came, and I really wouldn’t consider myself as big a hockey fan as some of my friends. That said, I am interested in some of the developments I’ve seen in hockey analysis. Tom Tango does some good stuff over on his blog. If I could get my hands on the data I might play around with it a bit on my own blog.
Blogger and Editor, Rational Pastime Blog
Where in CT are you from J-Doug?
I’m the same way with the Whale.
by Dan Turkenkopf on Jun 9, 2010 3:12 PM EDT
I'm from Brookfield
So yes, definitely.
by Dan Turkenkopf on Jun 9, 2010 3:28 PM EDT
No way!
I actually didn’t move to New Milford until 1999. Before then I lived in Brewster, NY. Went to high school at the Wooster School in Danbury.
Was just up in CT this weekend—at Mohegan Sun to see Conan O’Brien.
Blogger and Editor, Rational Pastime Blog
Nice
Went to high school in Brookfield – graduated in ’97.
Haven’t spent much time there since, but I still head back every so often to visit my parents.
by Dan Turkenkopf on Jun 9, 2010 3:34 PM EDT
Random Whalers story
I was visiting family I’d never met before in Raleigh, NC a few years back. It happened to be the day after Game 1 of the Stanley Cup the year the Zombie Whalers (Hurricanes) made it. My cousins seemed rather excited about it, and having very little to talk about I tried to have a conversation about it.
That was until they told me that they don’t know much about the game. A direct quote, “We just wait until the horn goes off and then we throw our arms up and cheer.”
Sigh…
Blogger and Editor, Rational Pastime Blog
On the bright side though, hockey fans tend to be the most knowledgeable about their sport compared to other major sports fans.
Pittsburgh sports all the way
The aesthetic complement
For those of you who are interested, a few of the pictures from my Strasburg album at Rational Pastime: http://www.rationalpastime.com/2010/06/stephen-strasburgs-major-league-debut.html
Blogger and Editor, Rational Pastime Blog
