Most fans know that there are lots of stats for evaluating hitters. Some, like batting average, are a little misleading. Batting average does not tell us how much power the batter has or how often he actually reaches base. RBIs sound good, but they depend on how many opportunities a batter gets. Two stats that correlate very highly with run scoring are on-base percentage (OBP) and slugging percentage (SLG). But are they of equal value? Or is one worth more than the other? Maybe their value in scoring runs depends on a hitter's batting order position. To see if this is the case, I used some research I posted earlier this year, and I think the importance attached to OBP and SLG is affected by a player's lineup slot. In this article I estimate how much that might be.
Last February, I posted an article called Value of OBP and SLG by Lineup Position, Part 2. I found that the run value for both OBP and SLG varied quite a bit with a player's lineup slot. For a team as a whole, OBP usually has a 50%-80% greater impact on scoring than SLG. For the leadoff man, OBP was about 3 times as important as SLG. For the number 3 hitter, OBP was 40% more important. This suggests that when you evaluate a player based on his hitting stats and want to get an idea of how many runs he will help his team score, you need to consider where he bats.
I also looked at other stats because OBP and SLG may be highly correlated with each other. Each lineup slot had 3 variables: walk percentage, hit percentage and extra-base percentage. For walks, hits, and extra bases, the denominator was plate appearances (PAs). This is a little different from comparing OBP and SLG, since OBP has PAs as the denominator and SLG has ABs. Also, using extra bases is a little like using isolated power. SLG is not always a good measure of power because a guy who hits a single drives up his SLG. Isolated power is SLG - AVG, or extra bases divided by ABs. Of course, here, I am using PAs. The article mentioned above found run values for walk percentage (W%, written as BB% in equation (1) below), hit percentage (H%) and extra-base percentage (XBH%).
To see how much difference there is using values based on lineup slots, first I had to find run values at the team level (how this allows us to find an individual hitter's contribution is explained below). I used linear regression to estimate the run values of W%, H%, and XBH% on a runs per game basis.
(1) R/G = 21.4*BB% + 26.25*H% + 11.78*XBH% - 5.14
With these values in hand, I supposed that a given hitter is added to an average hitting team made up of 9 identical hitters, replacing one of them. Since the W%, H%, and XBH% will most likely be different for the player in question, those stats will change for the team. The new team values are then plugged into equation (1) to get runs per game. The years covered by this study were 1959-2004. The average values for this period were:
BB/PA = .088
H/PA = .236
XBH/PA = .123
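As a quick check, equation (1) can be applied directly to these league-average rates. The Python sketch below is only an illustration: the function name `runs_per_game` is my own label, not something from the earlier article, and the three averages are the rounded values shown above, so the result will be close to, but not exactly, what unrounded inputs would give.

```python
# Coefficients and intercept copied from equation (1).
BB_COEF, H_COEF, XBH_COEF, INTERCEPT = 21.4, 26.25, 11.78, -5.14

def runs_per_game(bb_rate, h_rate, xbh_rate):
    """Team runs per game from per-PA walk, hit, and extra-base rates."""
    return BB_COEF * bb_rate + H_COEF * h_rate + XBH_COEF * xbh_rate + INTERCEPT

# League-average rates for 1959-2004, as rounded in the text.
average_rpg = runs_per_game(0.088, 0.236, 0.123)
print(round(average_rpg, 2))
```

With the rounded inputs this comes out to roughly 4.39 runs per game for the average team.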
If I add Rickey Henderson to this team, these values change, since for him they were .167, .232, and .117, respectively. He was a leadoff batter, and leadoff men normally get about 12.2% of a team's PAs, so once he replaces a hitter on this team, the new team stats will be
BB/PA = .097
H/PA = .236
XBH/PA = .123
There actually was a slight drop-off in the last two stats, but they still rounded off the same way. Remember, this is based on a weighted average, and Henderson accounts for slightly more than one-ninth of his team's PAs. Plugging those values into equation (1) gives us 4.58 runs per game (an average team would score 4.40 runs per game, so Henderson would add about .18 more a game or about 30 more per season).
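The substitution step can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the exact calculation I ran: the helper names `blend` and `runs_per_game` are hypothetical, the inputs are the rounded rates quoted above, and the 162-game season length is my assumption for converting runs per game to runs per season.

```python
def runs_per_game(bb, h, xbh):
    """Equation (1): team runs per game from per-PA rates."""
    return 21.4 * bb + 26.25 * h + 11.78 * xbh - 5.14

def blend(team_rate, player_rate, pa_share):
    """PA-weighted average of the team's rate and the new player's rate."""
    return (1 - pa_share) * team_rate + pa_share * player_rate

LEADOFF_PA_SHARE = 0.122   # leadoff share of team PAs, from the text
GAMES_PER_SEASON = 162     # assumption: a 162-game schedule

average = {"bb": 0.088, "h": 0.236, "xbh": 0.123}
henderson = {"bb": 0.167, "h": 0.232, "xbh": 0.117}

# Replace one average hitter with Henderson in the leadoff share of PAs.
new_team = {k: blend(average[k], henderson[k], LEADOFF_PA_SHARE) for k in average}

base_rpg = runs_per_game(**average)
new_rpg = runs_per_game(**new_team)
extra_runs = (new_rpg - base_rpg) * GAMES_PER_SEASON

print({k: round(v, 4) for k, v in new_team.items()})
print(round(new_rpg, 2), round(extra_runs, 1))
```

With the rounded inputs, the sketch gives about 4.57 runs per game and about 30 extra runs per season. The small gaps from the 4.58 and 4.40 figures quoted above come from using rates rounded to three decimals.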
But Henderson's main strength was getting on base (notice his very high W%). This is extra important for the leadoff man. In the article mentioned above, I have run values for W%, H%, and XBH% for each lineup slot. W% has its highest value for the leadoff man. XBH% has a fairly low value for the leadoff man. H% is very high at leadoff, but Henderson was about average at that, so this does not affect him too much. To see how many runs an average team would score, I used the values from the article mentioned. First I had to find the average values of W%, H%, and XBH% for each lineup slot over the 1959-2004 period. Since each one of these has a run value (including the leadoff slot), I plugged in the average values for W%, H%, and XBH% for lineup slots 2-9 and then Henderson's stats into lineup slot 1 (imagine an equation like equation (1), but with W%, H%, and XBH% appearing 9 times each, once for each lineup slot). Then I found how many runs per game this team would score. It came out to 4.62. Now that is only .04 higher than the 4.58 found with the team-wide method and it may seem like a small difference, but over the course of a season it is 5.81 more runs. That means about half a win more. If that happened for the whole lineup, you would get 4-5 more wins, and that is significant.
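The shape of the lineup-slot calculation can be sketched as follows. The slot-by-slot run values from the earlier article are not reproduced here, so the coefficients in this Python sketch are PLACEHOLDERS: each slot simply gets the team-level coefficient from equation (1) scaled by an assumed share of PAs (only the 12.2% leadoff share comes from this article; the even split of the rest is my assumption). The overall league averages also stand in for the slot-specific averages. The function name `lineup_runs_per_game` is hypothetical.

```python
def lineup_runs_per_game(slot_coefs, slot_rates, intercept):
    """Sum coefficient * rate over all nine slots, then add the intercept.

    slot_coefs and slot_rates are lists of nine (walk, hit, extra-base) tuples.
    """
    total = intercept
    for coefs, rates in zip(slot_coefs, slot_rates):
        total += sum(c * r for c, r in zip(coefs, rates))
    return total

# PLACEHOLDER coefficients, NOT the run values from the earlier article:
# team-level coefficients from equation (1) scaled by an assumed PA share.
TEAM_COEFS = (21.4, 26.25, 11.78)
INTERCEPT = -5.14
pa_shares = [0.122] + [(1 - 0.122) / 8] * 8  # only the leadoff share is from the text
slot_coefs = [tuple(c * share for c in TEAM_COEFS) for share in pa_shares]

average = (0.088, 0.236, 0.123)    # overall averages, standing in for slot averages
henderson = (0.167, 0.232, 0.117)

average_lineup = [average] * 9
henderson_lineup = [henderson] + [average] * 8  # Henderson in slot 1

avg_rpg = lineup_runs_per_game(slot_coefs, average_lineup, INTERCEPT)
hend_rpg = lineup_runs_per_game(slot_coefs, henderson_lineup, INTERCEPT)
print(round(avg_rpg, 2), round(hend_rpg, 2))
```

With these placeholder coefficients the lineup method just reproduces the team-wide numbers, because scaling by PA share is the same thing as the weighted average. A different answer, like the 4.62 above, only appears once the slot-specific run values are used, since those give walks more weight at leadoff than a simple PA-share scaling would.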
Now it may not be a surprise that Henderson does better by this method. He had the kind of stats you want in a leadoff man, so evaluating him that way will make him look better. If I evaluated every player this way, they may all look better than they would with the team-wide method since we can expect that managers will bat players in the lineup based on their relative strengths. So I decided to look only at leadoff men. I got a list of the top 30 in games as a leadoff man over the 1959-2004 period from the expert, Herm Krabbenhoft. That way, I evaluate players who have the same job. If they all do 5.81 runs better, then this approach does not help us better evaluate leadoff men.
In the table below I show how many runs an average team would score based on equation (1), or the normal method, and by breaking things down by batting order spot (the lineup method, the one that showed Henderson's team scoring 4.62 runs per game). The third column shows the difference. The players are listed in order of how much they were helped by the second method. Castillo came out 6.22 runs better. Each player was evaluated the same two ways Henderson was.
Notice that the range of values for the last column is pretty large. In some cases, a player's team is projected to score even fewer runs if we evaluate things by lineup slot. And the biggest swing in value is over 7 runs a season (between Castillo and Damon; this does not mean that Castillo is a better leadoff man or hitter in general than Damon, just that Castillo is helped by this analysis). I have not incorporated base stealing into this. Maybe I will do that for next week.
So evaluating players based on their lineup slot probably matters. I am not sure how this would be done with more sophisticated stats like batting runs (the linear weights method by Pete Palmer), runs created (the Bill James method), or the Baseruns method (by David Smyth). But if you are going to add a player to your team, and you want to know how many more (or fewer) runs you will score, it might pay to take where they are in the batting order into account.