FanPost

Advanced Graphing Techniques Part 1 - Installing a Pitch FX Database on a Computer

There was discussion in another FanPost on showing people how to do some more advanced graphing techniques.  I have wanted to create some heat maps myself, so I will start a series on how to do more advanced graphing and calculations.

I will step through the following 3 Parts for now and then go into other subjects.

  1. Setting up a pitch FX dataset and making a query.
  2. Installing R, linking it to the dataset and plotting some points
  3. Creating a heatmap using R

I will keep a general theme going and we can see how using advanced features gives us better results.  We will follow how and where Joe Mauer hits home runs off right handed pitchers in 2009.  Once everyone following the process is done, we will move to the next step.  I am running on a PC, so there maybe different steps for those with Macs.  Vivaelpujols will be following the series and I am pretty sure he has a Mac to help along.

Steps start after the jump

Steps:

  • Goto:  http://www.hardballtimes.com/main/article/databases-for-sabermetricians-part-one/ and complete everything until you get to the "Putting the "Data" in the Database" section.
  • We will use most of the steps in the  "Putting the "Data" in the Database" but instead of downloading the Databank data, download a Pitch F/X database found here.   This dataset is quite large, so I would make sure you have ~10GB free to be safe in the installation process.
  • Once the dataset is installed, queries can than be made.  Try to get the list of pitch locations (x,y data) for HRs hit by Joe Mauer off RHP in 2009.  See if you can use some of Colin's sample codes to get the data.   Here is a sample query I used to get a list of HRs hit off of Randy Johnson in the strike zone:

select a.ab_id, p.px, p.pz,a.des 
,a.pitcher ,a.batter
from atbats a 
join pitches p on p.ab_id = a.ab_id
join games g on g.game_id = a.game_id
where a.pitcher = 116615
and g.date like '2008%'   and p.px >-1.2 and p.px < 1.2
and p.pz >= 1.5 and p.pz <=3.5 
and p.pitch_type ='FA' 
and a.des like '%home%'
group by ab_id

I was thinking of just putting up the Mauer query, but I add data as people as it.  I think it is best to work through this part individually and when you get stuck ask for help.

 

  • Others or myself will help with the query, but each person needs to look around the Pitch FX dataset to see where the data (Pitch location, speed, batter, pitcher, umpire, etc) is located and how to get it with a query.  Creating the Joe Mauer query isn't the only one that we can help people with at this time.  Let us know the query you want to do and your attempt and we will help you through it.  Writing queries will be useful for all datasets, so it is best to figure it out initially.
  • Once everyone has a database setup and can make some basic queries, we will move on to installing R on the computer and linking it to the Pitch FX dataset.

There are no stupid questions here, except "What is Dayton Moore thinking?", so feel to ask anything.

 

select p.px, p.pz 

from atbats a 

join pitches p on p.ab_id = a.ab_id 

join games g on g.game_id = a.game_id 

join players pl on pl.eliasid = a.batter

where pl.last = 'Mauer' and pl.first = 'Joe'

and g.date like '2009%'

and a.event = 'Home Run'

and p.type = 'X'


Trending Discussions

X
Log In Sign Up

forgot?
Log In Sign Up

Please choose a new SB Nation username and password

As part of the new SB Nation launch, prior users will need to choose a permanent username, along with a new password.

Your username will be used to login to SB Nation going forward.

I already have a Vox Media account!

Verify Vox Media account

Please login to your Vox Media account. This account will be linked to your previously existing Eater account.

Please choose a new SB Nation username and password

As part of the new SB Nation launch, prior MT authors will need to choose a new username and password.

Your username will be used to login to SB Nation going forward.

Forgot password?

We'll email you a reset link.

If you signed up using a 3rd party account like Facebook or Twitter, please login with it instead.

Forgot password?

Try another email?

Almost done,

By becoming a registered user, you are also agreeing to our Terms and confirming that you have read our Privacy Policy.

Join Beyond the Box Score

You must be a member of Beyond the Box Score to participate.

We have our own Community Guidelines at Beyond the Box Score. You should read them.

Join Beyond the Box Score

You must be a member of Beyond the Box Score to participate.

We have our own Community Guidelines at Beyond the Box Score. You should read them.

Spinner.vc97ec6e

Authenticating

Great!

Choose an available username to complete sign up.

In order to provide our users with a better overall experience, we ask for more information from Facebook when using it to login so that we can learn more about our audience and provide you with the best possible experience. We do not store specific user data and the sharing of it is not required to login with Facebook.

tracking_pixel_9351_tracker