What wins in baseball

What does it take to win in major league baseball? I showed in another article that if you out-OPS your opponents, you will win in the long run (link at end of article). But that study only went back to 1989. I would like to cover all of baseball history, but OBP and SLG allowed are not available for all years. So I tried something similar. Using the Sean Lahman database, I used each team's differentials in HRs, non-HR hits, and walks (all per game) to predict winning pct.
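As a rough illustration of what "differential per game" means here, the sketch below computes the three inputs from one team-season of totals. The keys (G, H, HR, BB, HA, HRA, BBA) are assumed to follow the column names of the Lahman Teams table, the function name is hypothetical, and the sample totals are made up for illustration, not taken from any real team.

```python
def per_game_differentials(season):
    """Return (HR, non-HR hit, BB) differentials per game for one team-season.

    `season` is a dict of season totals. Keys ending in "A" are totals
    allowed to opponents (an assumption based on Lahman-style naming).
    """
    games = season["G"]
    hr_diff = (season["HR"] - season["HRA"]) / games
    # Non-HR hits are total hits minus home runs, for and against.
    nonhr_for = season["H"] - season["HR"]
    nonhr_against = season["HA"] - season["HRA"]
    nonhr_diff = (nonhr_for - nonhr_against) / games
    bb_diff = (season["BB"] - season["BBA"]) / games
    return hr_diff, nonhr_diff, bb_diff


# Made-up totals for illustration only.
example = {"G": 162, "H": 1500, "HR": 200, "BB": 600,
           "HA": 1400, "HRA": 160, "BBA": 500}
hr_diff, nonhr_diff, bb_diff = per_game_differentials(example)
print(round(hr_diff, 3), round(nonhr_diff, 3), round(bb_diff, 3))
```

A positive value means the team out-produced its opponents in that category; a negative value means the opposite.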

Team winning percentage was the dependent variable in a linear regression, and the three differentials were the independent variables. For all teams from 1901 to 1998, the equation was

PCT = .497 + .156*HR + .036*NONHR + .024*BB

Those variables are all in terms of differentials (the team's per-game total minus its opponents'). The r-squared was .797, meaning that 79.7% of the variation in team winning pct is explained by the equation. The standard error was .038. For a 162-game season, that is 6.16 wins. That is a little high and leaves a fair amount of room for other factors to affect winning.
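To make the equation concrete, here is a small sketch that plugs a set of differentials into the 1901-1998 equation and converts both the prediction and the standard error into wins over 162 games. The function names and the sample differentials (+0.2 HR, +0.5 non-HR hits, +0.3 BB per game) are hypothetical, chosen only to show the arithmetic.

```python
GAMES = 162  # length of a modern season


def predicted_pct(hr_diff, nonhr_diff, bb_diff):
    """Predicted winning pct from the 1901-1998 regression equation."""
    return 0.497 + 0.156 * hr_diff + 0.036 * nonhr_diff + 0.024 * bb_diff


def pct_to_wins(pct, games=GAMES):
    """Convert a winning pct (or a standard error in pct) to wins."""
    return pct * games


# Hypothetical team: +0.2 HR, +0.5 non-HR hits, +0.3 BB per game.
pct = predicted_pct(0.2, 0.5, 0.3)
print(round(pct, 4))               # 0.5534
print(round(pct_to_wins(pct), 1))  # 89.7

# The standard error of .038 expressed in wins.
se_wins = pct_to_wins(0.038)
print(round(se_wins, 2))           # 6.16
```

A team with all three differentials at zero comes out at the intercept, .497, or about 80.5 wins.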

But things change over time, and what leads to victory in one era could be different from what works in others. The table below shows the intercept and the coefficient for each variable in each decade.

Period     Intercept   HR      NONHR   BB      Standard Error (wins per 162 games)
1901-09    0.496       0.257   0.036   0.033   7.94
1910-19    0.494       0.269   0.047   0.023   6.93
1920-29    0.496       0.159   0.035   0.024   5.92
1930-39    0.496       0.145   0.035   0.025   6.82
1940-49    0.496       0.151   0.036   0.024   5.95
1950-59    0.497       0.173   0.037   0.022   5.60
1960-69    0.496       0.170   0.041   0.022   5.91
1970-79    0.498       0.169   0.035   0.022   5.65
1980-89    0.500       0.144   0.030   0.027   5.64
1990-98    0.499       0.136   0.029   0.022   5.25

The last column is the standard error of the regression over 162 games. There is some variation over time. The biggest anomaly is the value of the HR differential in the first two decades, which is much higher than in any other decade. This may be because it was the deadball era, when HRs were so rare that each extra one may have had more value.
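One way to see the size of that anomaly is to ask how many wins a fixed HR edge is worth under each decade's coefficient. The sketch below uses the HR column from the table and a hypothetical edge of +0.1 HR per game (about 16 extra HRs over 162 games); the function name and the choice of 0.1 are assumptions for illustration only.

```python
# (period, HR coefficient) pairs copied from the table above.
HR_COEF = [
    ("1901-09", 0.257), ("1910-19", 0.269), ("1920-29", 0.159),
    ("1930-39", 0.145), ("1940-49", 0.151), ("1950-59", 0.173),
    ("1960-69", 0.170), ("1970-79", 0.169), ("1980-89", 0.144),
    ("1990-98", 0.136),
]


def wins_from_hr_edge(coef, hr_diff_per_game=0.1, games=162):
    """Wins per season implied by a per-game HR differential."""
    return coef * hr_diff_per_game * games


for period, coef in HR_COEF:
    print(period, round(wins_from_hr_edge(coef), 2))
```

Under these assumptions the same edge is worth a little over 4 wins in the first two decades and between roughly 2.2 and 2.8 wins in every decade after that.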

So I decided to look just at the post-1920 period, since that is more relevant to what currently happens in baseball. I also looked at 5-year periods for each team instead of season-by-season data. This helps remove the role that luck can play in any one season. The regression equation was

PCT = .498 + .166*HR + .039*NONHR + .023*BB

The r-squared was .908 and the standard error was .0186. Over 162 games, that is 3.02 wins, about half of the 6.16 wins from the season-by-season regression.
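The 5-year grouping described above could be done in several ways. The sketch below is one guess: it assigns each season to a non-overlapping 5-year window starting in 1920 and pools wins and losses within each team and window. The window boundaries, the pooling of totals rather than averaging of percentages, the function names, and the sample records are all assumptions for illustration; the keys teamID, yearID, W, and L are assumed to follow Lahman-style naming.

```python
from collections import defaultdict


def five_year_window(year, start=1920, width=5):
    """Return the first year of the window that contains `year`."""
    return start + ((year - start) // width) * width


def pooled_pct(seasons):
    """Pool W and L per (team, window) and return winning pct per group."""
    totals = defaultdict(lambda: {"W": 0, "L": 0})
    for s in seasons:
        key = (s["teamID"], five_year_window(s["yearID"]))
        totals[key]["W"] += s["W"]
        totals[key]["L"] += s["L"]
    return {key: t["W"] / (t["W"] + t["L"]) for key, t in totals.items()}


# Made-up records for a fictional team "AAA".
sample = [
    {"teamID": "AAA", "yearID": 1920, "W": 90, "L": 64},
    {"teamID": "AAA", "yearID": 1921, "W": 70, "L": 84},
    {"teamID": "AAA", "yearID": 1925, "W": 77, "L": 77},
]
result = pooled_pct(sample)
for key in sorted(result):
    print(key, round(result[key], 3))
```

The same pooling would apply to the HR, non-HR hit, and walk totals before computing per-game differentials for each window.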

In general, if you can have a large differential in each of these three variables, you are very likely to out-OPS your opponents. So this partly confirms the other study.

But how do you beat the other team in OPS or in these three variables? You can do it with better hitting, better pitching, better fielding, or some combination of the three. Notice that there are no variables for stealing, sacrificing, or productive outs. Those "small ball" things may matter, but not much. What wins is just being able to get hits, hit for power, and get runners on base (and stop the other team from doing so). Baseball is a basic, meat-and-potatoes game. Throw good pitches, hit the ball, and catch the ball.