
Advanced Graphing Techniques Part 3 - Creating a Heat Map in R


It's been a long week, so I am going to post what I have figured out so far about creating heat maps in R, which is mainly just enough to completely confuse and frustrate myself.  I would love it if some people from the community would chime in and help with the spots I am having issues with.  I am hoping to create a graphic close to the following one that Jeremy Greenhouse did at BaseballAnalysts.com:

[Image: example contour heat map by Jeremy Greenhouse]

via baseballanalysts.com


Here are the only two steps so far.

  1. Follow steps 2 and 3 from Part 2 of the series to install R if you have not already.
  2. Here is the code that I am currently pasting into the console to create a heat map.

                                                                 
mysurface <- read.csv('c:/delete/tmp.csv', header=FALSE)

This is a reference to a file on your computer.  The file can be put anywhere, as long as the path in the code points to it.  I have set up a sample one to download that contains some values I just made up.  It needs to be in .csv format for read.csv to load it.
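If the sample download is not at hand, a stand-in file with the same shape can be generated in R. This is only a sketch: the values are random, the temporary path is an assumption, and the real sample file may be laid out differently. It shows the layout the rest of the code expects, which is plain comma-separated numbers with no header row.

```r
# Hypothetical stand-in for the sample file: a 15 x 15 grid of made-up values.
set.seed(1)
sample_values <- matrix(round(runif(15 * 15), 3), nrow = 15, ncol = 15)

# Write the grid without row or column names, so the file holds only numbers.
csv_path <- tempfile(fileext = ".csv")
write.table(sample_values, csv_path, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Same call as in the post, pointed at the temporary file instead.
mysurface <- read.csv(csv_path, header = FALSE)
dim(mysurface)   # number of rows, then number of columns
```

read.csv returns a data frame, which is why the next line of the post converts it with as.matrix before plotting.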


mysurface <- as.matrix(mysurface)


filled.contour(x=seq(from=-1.5,to=1.5,length=15),
y=seq(from=1,to=4,length=15),
z=mysurface,
nlevels=25,
axes=TRUE,
#col=hsv(h=seq(from=5/6,to=0,length=25),s=1,v=1),
color.palette=topo.colors
)

These lines are the heart of the map creation.   Here is a reference page for what each of the lines means.  The two length values need to match the dimensions of the .csv on your computer: x needs one value per row and y needs one value per column (15 each for the sample file).
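One way to avoid typing the two length values by hand is to take them from the matrix itself. The sketch below uses a made-up surface in place of the .csv data, and the names x_vals and y_vals are introduced here only for illustration.

```r
# Stand-in for the matrix read from the .csv (made-up values, not real data).
grid <- seq(-1, 1, length.out = 15)
mysurface <- outer(grid, grid, function(a, b) exp(-(a^2 + b^2)))

# x needs one value per ROW of z, and y needs one value per COLUMN of z.
x_vals <- seq(from = -1.5, to = 1.5, length.out = nrow(mysurface))
y_vals <- seq(from = 1, to = 4, length.out = ncol(mysurface))

filled.contour(x = x_vals, y = y_vals, z = mysurface,
               nlevels = 25,
               color.palette = topo.colors)
```

Because rows of z run along the x axis, the first row of the .csv ends up at the left edge of the plot rather than at the top, which is worth keeping in mind when comparing the picture with the spreadsheet.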

lines(c(.25,.25,-1.25,-1.25,.25),c(1.5,3.5,3.5,1.5,1.5), col=c("red"))

I have no idea why the box is offset this way.  One of several mysteries.  Here is what the final output should look like using the sample data provided:
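A likely explanation for the offset comes from the filled.contour documentation: the function draws two plots, the contour and the legend, and the coordinate systems it sets up for them are lost once it returns. A lines() call made afterwards is therefore drawn in a different coordinate system. The documented way to annotate the contour is the plot.axes argument. The sketch below uses a made-up surface, and the names box_x and box_y are introduced here only for illustration.

```r
# Stand-in for the matrix read from the .csv (made-up values, not real data).
grid <- seq(-1, 1, length.out = 15)
mysurface <- outer(grid, grid, function(a, b) exp(-(a^2 + b^2)))

x_vals <- seq(from = -1.5, to = 1.5, length.out = nrow(mysurface))
y_vals <- seq(from = 1, to = 4, length.out = ncol(mysurface))

# Corners of the box, copied from the lines() call in the post.
box_x <- c(.25, .25, -1.25, -1.25, .25)
box_y <- c(1.5, 3.5, 3.5, 1.5, 1.5)

filled.contour(x = x_vals, y = y_vals, z = mysurface,
               nlevels = 25,
               color.palette = topo.colors,
               plot.axes = {
                 axis(1)   # supplying plot.axes replaces the default axes,
                 axis(2)   # so they have to be drawn again here
                 lines(box_x, box_y, col = "red")
               })
```

Drawn this way, the box uses the same x and y scales as the contour itself. Whether those corner values describe the intended strike zone is a separate question this sketch does not settle.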

[Image: filled contour plot produced from the sample data]

-----------------------------------------------------------------

That is pretty simple, but it is not the best analysis, and here is where I need HELP.

  1. So far it has been impossible for me to create the map without using the .csv.  My main problem is creating a SQL output that is 15 rows by 15 columns.  If someone knows how to write a query that looks at one area and then moves to the next, that would be great.  I have spent way too many hours on this and right now I need to take a break.
  2. I am trying to limit the programs people need to know to R and SQL.  People may have seen some outputs like the following; my brother created them, but he used PHP to do the stepping through the strike zone.
  3. I have also not been able to set the range values on the right part of the image.  If a new high or low value is introduced into the data, the data value scale changes.  I can't get it to be consistent from player to player.
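For item 1, one possible route that skips the 15 by 15 SQL output is to pull one row per pitch and do the binning in R. This is a minimal sketch under several assumptions: the data frame pitches and its columns px, pz and value are hypothetical names filled with made-up numbers, and empty cells are set to 0 purely for convenience.

```r
set.seed(42)
# Hypothetical raw data: one row per pitch, with made-up locations and values.
pitches <- data.frame(
  px    = runif(500, min = -1.5, max = 1.5),  # horizontal location (assumed name)
  pz    = runif(500, min = 1, max = 4),       # vertical location (assumed name)
  value = rnorm(500)                          # whatever is being averaged
)

n_bins   <- 15
x_breaks <- seq(-1.5, 1.5, length.out = n_bins + 1)
y_breaks <- seq(1, 4, length.out = n_bins + 1)

# Assign every pitch to a horizontal bin and a vertical bin.
x_bin <- cut(pitches$px, breaks = x_breaks, include.lowest = TRUE)
y_bin <- cut(pitches$pz, breaks = y_breaks, include.lowest = TRUE)

# Mean value per cell; rows follow the x bins, columns follow the y bins.
mysurface <- tapply(pitches$value, list(x_bin, y_bin), mean)

# Cells with no pitches come back as NA; 0 is an arbitrary placeholder here.
mysurface[is.na(mysurface)] <- 0

dim(mysurface)
```

The result has rows along x and columns along y, which is the orientation filled.contour expects for z, so it can stand in for the matrix read from the .csv.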

Sorry for not having a 100% complete process, but I have to step away for a bit.  Hopefully one of our great readers can step in and help.  Thanks in advance.

 

Here is the code again to just cut and paste into R:

                                                                 
mysurface <- read.csv('c:/delete/tmp.csv', header=FALSE)
mysurface <- as.matrix(mysurface)


filled.contour(x=seq(from=-1.5,to=1.5,length=15),
y=seq(from=1,to=4,length=15),
z=mysurface,
nlevels=25,
axes=TRUE,
#col=hsv(h=seq(from=5/6,to=0,length=25),s=1,v=1),
color.palette=topo.colors
)
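# contourLines() only computes the contour coordinates and returns them as a list;
# it draws nothing, so these three lines do not change the plot.
x <- 10*1:nrow(mysurface)
y <- 10*1:ncol(mysurface)
contourLines(x, y, mysurface)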
lines(c(.25,.25,-1.25,-1.25,.25),c(1.5,3.5,3.5,1.5,1.5), col=c("red"))


Comments


Good post Jeff

I’ll be able to comment further later tonight.

by vivaelpujols on Feb 25, 2010 9:03 PM EST

We had good momentum with the other 2.

I was just getting stuck and figured others might be able to help out.


by Jeff Zimmerman on Feb 25, 2010 9:23 PM EST

Re:

Jeff,

I took a look, but I didn’t want to mess around with the .csv file if I didn’t have to, so I was waiting to see if there would be an update. If there isn’t, then I’ll start getting my hands dirty.

I’m very grateful for this series and look forward to more of it. I’ve been able to manipulate a lot of the code I’ve already written in SQL with some of the stuff you’ve taught me with R to make some cool graphs.

For pitchers, the most common one I’ve been playing with is the velocity and horizontal movement graphs. Pitch flight charts have been interesting too. I’ve taken the output from code I’ve written and copied and pasted the pitch flights into Harry Pav’s template (Excel, not R, but I’d love to change that), and that has worked really well.

For hitters, I’ve tried to mess around with spray charts, but I haven’t yet been satisfied with what I’m seeing. Also, I have some very basic ‘contact’ charts that show where the player is making contact or where they are getting their hits from, but I’d much rather do that in heat map form.

Right now I’m trying to figure out what I can learn from http://code.google.com/p/r-pitchfx/ which Harry sent me over Twitter. The biggest thing I am trying to do (with both the hit charts and the pitch charts) is make a legend for each pitch type (right now they are all the same color) and color code them.

Anyway, I figured you’d be interested to know that I’m loving the series and I look forward to more!

Follow me at http://twitter.com/JDSussman
Remember: baseball guys... baseball...

by JD Sussman on Feb 27, 2010 3:46 PM EST

I am playing around with the functions in R

and I see the “box” offset. It comes from the way you made the data, which looks like a diamond even when you view it in the spreadsheet.

As far as setting up the data in MySQL, the only way I can think of is to make a whole bunch of IF functions and then an ORDER BY to set up a matrix that lines up with home plate or whatever. But my method would produce an extra column in the query.

rzar.wordpress.com
draysbay.com
raysprospects.com

by RZ on Feb 27, 2010 5:49 PM EST

Thanks, let me see what I can find.


by Jeff Zimmerman on Feb 27, 2010 9:33 PM EST

His code for the filled contour plot is essentially the same as yours

except he doesn’t have a col= and the plot produces the more familiar color scheme in that first pic in your post.

rzar.wordpress.com
draysbay.com
raysprospects.com

by RZ on Feb 27, 2010 10:57 PM EST

I'm really enjoying this series

Couple of things…..
1) How would you go about determining league averages?
2) How do you string values, so you’re looking at multiple pitch types (important for the various flavors of fastballs)?

Thanks again

by marc w on Feb 28, 2010 1:46 AM EST

Answer to problem 3

Use the zlim option in filled.contour to set the range. Likewise, you can use xlim and ylim for their respective dimensions.

For problems 1 and 2, you might be able to use the hist2d function in the ‘gplots’ library.

by jaepho on Mar 2, 2010 5:24 PM EST
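A minimal sketch of the zlim suggestion above, assuming two made-up "players" whose values cover different ranges. The names player_a, player_b and shared_zlim are hypothetical, and the range of 0 to 0.5 is an arbitrary choice that happens to cover both surfaces.

```r
# Two made-up surfaces with different peaks, standing in for two players.
grid <- seq(-1, 1, length.out = 15)
player_a <- outer(grid, grid, function(a, b) 0.30 * exp(-(a^2 + b^2)))
player_b <- outer(grid, grid, function(a, b) 0.45 * exp(-(a^2 + b^2)))

x_vals <- seq(from = -1.5, to = 1.5, length.out = 15)
y_vals <- seq(from = 1, to = 4, length.out = 15)

# One range for every plot; it has to cover the values of all players.
shared_zlim <- c(0, 0.5)

# With zlim fixed, the contour levels and the legend stay the same for each plot.
for (z in list(player_a, player_b)) {
  filled.contour(x = x_vals, y = y_vals, z = z,
                 zlim = shared_zlim,
                 nlevels = 25,
                 color.palette = topo.colors)
}
```

By default filled.contour derives its levels from zlim, so fixing zlim is enough to keep the color scale on the right identical from player to player.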

Well maybe not

It uses heatmaps for data that is somewhat different. However, if you want to see other uses for heatmaps, take a look.

by OsandRoyals on Mar 23, 2010 7:53 PM EDT
