JAVIER: Analyzing hitting prospects by age and level

The second in a three-part series on prospect analysis. This contains an interactive visualization to find what a hitter's statistics in the minor leagues tell us about his probability of success in the majors.



This is the second post in my reveal of the JAVIER system, this one covering analysis for a prospect based on a single year and/or level. Read the introduction post for a full description of the method and an application to some of the current prospects. This particular visualization will be effective for finding how much a prospect is hurting or helping his status based on his production at a particular age and level.

The following are the definitions for Productive, Average, and Busted hitters. A player must meet a few requirements to qualify for any of the categories: he must be a non-pitcher, have played the majority of his career after 1978, and either have made his MLB debut prior to 2010 or have been in the minor leagues in 2008 and 25 years old in 2013. The final criterion captures those older players who never made the major leagues.

Productive: At least 1,000 PA in the majors and at least .0275 VORP per PA. The range of these 508 hitters goes from Matt Murton (29.3 VORP in 1,058 PA) to Barry Bonds (1,592.7 VORP in 12,606 PA).

Average: At least 1,000 PA in the majors and between -.025 and .0275 VORP per PA. The range of these 649 hitters goes from Rafael Belliard (-63.3 VORP in 2,524 PA) to Omar Vizquel (296 VORP in 12,013 PA).

Bust: Fewer than 1,000 PA in the majors or less than -.025 VORP per PA. There are 12,627 busts, 11,038 of which never made the major leagues (yeah). The hitter with the most VORP who is labeled a "Bust" is Troy Neel (42.9 VORP in 861 PA), a former first baseman for the Athletics (and child support payment evader).
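
The three definitions above can be sketched as a small function. This is only an illustration of the thresholds as printed, not the code behind JAVIER: the function name is hypothetical, and the handling of a hitter who lands exactly on a cutoff is an assumption, since the text does not say which side the boundaries fall on.

```python
# Thresholds as printed in the definitions above.
MIN_MLB_PA = 1000          # fewer PA than this is a Bust regardless of VORP
PRODUCTIVE_RATE = 0.0275   # VORP per PA needed to be Productive
BUST_RATE = -0.025         # VORP per PA below this is a Bust


def classify_hitter(mlb_pa: int, mlb_vorp: float) -> str:
    """Label a career as Productive, Average, or Bust (hypothetical helper)."""
    if mlb_pa < MIN_MLB_PA:
        return "Bust"
    vorp_per_pa = mlb_vorp / mlb_pa
    if vorp_per_pa >= PRODUCTIVE_RATE:
        return "Productive"
    if vorp_per_pa < BUST_RATE:  # assumption: exactly -.025 counts as Average
        return "Bust"
    return "Average"


# Career lines quoted in the definitions: (MLB PA, MLB VORP).
examples = {
    "Matt Murton": (1058, 29.3),
    "Barry Bonds": (12606, 1592.7),
    "Omar Vizquel": (12013, 296.0),
    "Troy Neel": (861, 42.9),
}

for name, (pa, vorp) in examples.items():
    print(f"{name}: {classify_hitter(pa, vorp)}")
```

Troy Neel shows why the plate appearance floor matters: his VORP per PA is well above the Productive cutoff, but with only 861 PA he is still labeled a Bust.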

Percentages by level

This table gives the average productive, average, and bust percentages by level. This will be useful to compare to the results in the visualization.


The percentage of player seasons by players who eventually succeed in the major leagues slowly climbs through the lower levels of the minors, making a big jump at AA and an even larger one at AAA. If a player repeated a level, he is counted twice. This table shows the total percentage of player seasons by the various hitter results.


Compared with the career numbers I will look into later, where the bust percentage is approximately 92%, a larger share of eligible player seasons is completed by non-busted hitters. This makes sense, since good hitters are less likely to meet the maximum career minor league plate appearance criterion I set.
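
As a quick check, the career bust percentage of roughly 92% follows from the hitter counts given in the category definitions. The arithmetic below uses only those counts; the variable names are just for illustration.

```python
# Hitter counts quoted in the Productive, Average, and Bust definitions.
productive = 508
average = 649
bust = 12627
never_reached_mlb = 11038  # subset of the busts

total = productive + average + bust  # every hitter who qualified

for label, count in [("Productive", productive), ("Average", average), ("Bust", bust)]:
    print(f"{label}: {100 * count / total:.1f}% of {total} hitters")

# Share of busts who never appeared in the majors at all.
print(f"Busts with no MLB time: {100 * never_reached_mlb / bust:.1f}%")
```

The bust share comes out to about 91.6%, which rounds to the 92% figure.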

However, I did have a problem where not all players had an age associated with them. Because of this, they cannot be included in the age-based filters. Each of the players without an age was considered a bust, leading to a distribution more like this:


Finding the z-scores

The visualization requires the input of z-scores. These can be estimated, but I have provided a document for more accurate calculations. Input the year, level, league, and appropriate statistics for the player year in which you are interested.

The second sheet gives a listing of all appropriate league and level designations. Using a different abbreviation (e.g. CLF instead of CAL for the California League) will lead to an error. You may also use multiple years in the same league; you'll just have to choose one year from which to calculate the z-scores.

Next to zBB, zK, and zISO are inputs for the range you would like to use on each z-score. The default is plus or minus one, but this can be changed if you need a smaller or larger range. The right width depends on the number of similar seasons you find in the visualization.
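
For readers who want to see roughly what such a calculation involves, here is a minimal sketch. It rests on assumptions, because this post does not spell out the formulas: it takes zBB and zK to be based on walks and strikeouts per plate appearance, zISO on isolated power (extra bases per at-bat), and each z-score to be the player's rate minus the league mean, divided by the league standard deviation. The league means and standard deviations below are made-up placeholders, not real league data; the provided document remains the accurate source.

```python
def rate_stats(pa, ab, doubles, triples, hr, bb, k):
    """Turn counting stats into rates (assumed definitions, see lead-in)."""
    return {
        "BB": bb / pa,                                # walks per PA
        "K": k / pa,                                  # strikeouts per PA
        "ISO": (doubles + 2 * triples + 3 * hr) / ab,  # extra bases per AB
    }


def z_score(value, league_mean, league_sd):
    """Standard z-score: distance from the league mean in standard deviations."""
    return (value - league_mean) / league_sd


def z_range(z, half_width=1.0):
    """Filter range around a z-score; plus or minus one is the default."""
    return (z - half_width, z + half_width)


# Hypothetical player season and hypothetical league context.
player = rate_stats(pa=500, ab=440, doubles=30, triples=4, hr=15, bb=50, k=100)
league = {
    "BB": (0.08, 0.03),   # (mean, standard deviation), placeholder values
    "K": (0.18, 0.05),
    "ISO": (0.13, 0.05),
}

for stat, value in player.items():
    mean, sd = league[stat]
    z = z_score(value, mean, sd)
    low, high = z_range(z)
    print(f"z{stat}: {z:+.2f}  filter range: {low:+.2f} to {high:+.2f}")
```

Passing a different `half_width` to `z_range` corresponds to narrowing or widening the range inputs described above.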


Finally, the fun stuff. Here you will be able to turn a player's PA, AB, 2B, 3B, HR, BB, and K totals into a percentage of success or failure.

Instructions for the visualization

Use the z-scores calculated above to set a range on the zBB, zK, and zISO filters. Then filter on age, where a range of plus or minus one year is a good place to start. Finally, filter on level. The upper left-hand corner of the viz gives the outputs you desire, in both total and percent form. These boxes tell you what percentage of players who had a similar season went on to be a productive, average, or busted player. If the total number of similar seasons is fewer than 20 to 30, increase the slider ranges a bit.
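
The filtering steps above can be sketched in a few lines. This is a rough illustration of the logic, not the visualization's actual implementation: the `Season` record, the function names, and the seven sample seasons are all hypothetical, and widening only the z-score ranges (rather than the age range) when the sample is small is one possible choice among several.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Season:
    age: int
    level: str
    z_bb: float
    z_k: float
    z_iso: float
    outcome: str  # "Productive", "Average", or "Bust"


def similar_seasons(seasons, age, level, z_bb, z_k, z_iso, z_half=1.0, age_half=1):
    """Keep seasons at the same level within the age and z-score ranges."""
    return [
        s for s in seasons
        if s.level == level
        and abs(s.age - age) <= age_half
        and abs(s.z_bb - z_bb) <= z_half
        and abs(s.z_k - z_k) <= z_half
        and abs(s.z_iso - z_iso) <= z_half
    ]


def outcome_percentages(matches):
    """Percentage of comparable seasons ending in each career outcome."""
    counts = Counter(s.outcome for s in matches)
    return {o: 100 * counts[o] / len(matches) for o in ("Productive", "Average", "Bust")}


def widen_until_enough(seasons, min_n, step=0.5, max_half=3.0, **target):
    """Grow the z-score range until at least min_n comparable seasons are found."""
    half = 1.0
    while True:
        matches = similar_seasons(seasons, z_half=half, **target)
        if len(matches) >= min_n or half >= max_half:
            return half, matches
        half += step


# Made-up seasons purely for demonstration.
seasons = [
    Season(21, "AA", 0.4, -0.2, 0.9, "Productive"),
    Season(22, "AA", 0.1, 0.3, 0.5, "Average"),
    Season(22, "AA", -0.3, 0.6, 0.2, "Bust"),
    Season(23, "AA", 0.8, 0.1, 1.4, "Bust"),
    Season(22, "AAA", 0.2, 0.0, 0.6, "Productive"),  # different level
    Season(25, "AA", 0.3, 0.1, 0.7, "Bust"),         # outside the age range
    Season(22, "AA", 2.1, 0.0, 0.6, "Average"),      # zBB outside plus or minus one
]

target = dict(age=22, level="AA", z_bb=0.3, z_k=0.1, z_iso=0.7)
matches = similar_seasons(seasons, **target)
print(len(matches), outcome_percentages(matches))

# A real search would use min_n of 20 to 30; 5 keeps the toy data meaningful.
half, wider = widen_until_enough(seasons, min_n=5, **target)
print(f"range widened to plus or minus {half}: {len(wider)} comparable seasons")
```

With the default ranges, four of the seven sample seasons qualify; the fifth only appears once the z-score range is widened, which mirrors what moving the sliders does in the visualization.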

The chart at the bottom of the visualization shows where those comparable seasons lie in terms of their zBB-zK and zISO. The size of the bubble gives the number of MLB PA and the color gives the total MLB VORP. Hover over each point to find information about that player.

If you are simply interested in finding percentages for a large group of players and don't need to use the visualization, I have included a spreadsheet to help facilitate that. Make sure you fill in all light blue cells, and for each line where you have data, copy and paste all of its cells using the keyboard shortcuts Ctrl+C/Cmd+C and Ctrl+V/Cmd+V.

. . .

Statistics courtesy of Baseball Prospectus.

Chris St. John is a writer at Beyond The Box Score. You can follow him on Twitter at @stealofhome.