Here is a sample of my data: http://pastebin.com/pCbUnmqQ

I am attempting to make a hexbin plot where each bin is discretely colored by the most common name in that box. Something like the answer here.

I am using the ggplot2 package.

I tried to use the code from the above question

ggplot(player, aes(x=X2, y=Y2, z=Name)) + stat_summary_hex(fun = function(z) {
    tab <- table(z)
    names(tab)[which.max(tab)]
})

but the result was a mess: every player got their own set of hexbins, instead of a single set of hexbins with each bin colored by its most common name.

The plot I want it to resemble most is produced by:

ggplot(player, aes(x=X2, y=Y2)) + stat_binhex()

I want that, but with each bin colored according to the player who appears there most often.

Hines

1 Answer

Is this what you mean, or is this what you don't want?

ggplot(player, aes(x=X2, y=Y2, z=Name)) + 
  stat_binhex(aes(fill = Name))

Here is the same plot, but with the number of bins set explicitly in stat_binhex():

ggplot(player, aes(x=X2, y=Y2)) + 
  stat_binhex(aes(fill = Name), bins = 8)
Nancy
  • That is what I do not want. It is just 3 people, but it already looks like the bins are on top of each other: the last player's bins to be plotted cover up the earlier players' bins. It should look something like this http://i.imgur.com/QGelrTg.png but with colors based on the most common player in that bin, so orange if it's Diekmeier, blue if Adler, etc. – Hines Dec 18 '15 at 22:52
  • Hm. Are you sure the bins are right? I looked at the number of unique bins and there are 545 unique X2s, 482 Y2s, and 998 unique combinations of X2 and Y2. There are 999 rows in the data. Given that, how would the bins be big enough to have more than one person? – Nancy Dec 18 '15 at 22:56
  • I didn't post all of the data because it was too large. There are actually 200,000 rows of data and 400 players, not 3. – Hines Dec 18 '15 at 23:00
  • Sure, I get that. I'm just trying to understand how many patients you expect to be per bin and the number of bins you expect to see on your ideal plot. – Nancy Dec 18 '15 at 23:04
  • I want the number of bins to be like the link in my comment, maybe a little less depending on how it looks. there will be lots of patients inside each bin, even if I have a subset of just 15 total patients (which will be the most common usage). but I want the color of each bin to reflect which patient has the most appearances in that bin – Hines Dec 18 '15 at 23:09
  • Right. It seems like this is primarily a data management and aggregation issue rather than a plotting issue. I'm still unclear on how bins are defined (if not by unique X2, Y2 pairs) though. – Nancy Dec 18 '15 at 23:10
  • So how would you suggest the data be managed or aggregated? This post: http://stackoverflow.com/questions/17371591/using-stat-summary-hex-to-show-most-frequent-value-with-discrete-color-scale seemed to have very similar data, and they were able to do what I want. – Hines Dec 18 '15 at 23:15
  • it still has the problem that the layers plotted on afterwards (Adler) are laid on top of the earlier ones (Diekmeier) – Hines Dec 18 '15 at 23:54
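
Following up on the thread: one likely reason the stat_summary_hex() attempt drew a separate set of hexbins per player is that ggplot2 groups the data by every discrete variable mapped in aes(), and z = Name counts as one, so the summary function runs once per player. A minimal sketch of a workaround, assuming a reasonably recent ggplot2 with the hexbin package installed, is to override the grouping with group = 1 so that all players are binned together. The player data frame below is a made-up stand-in for the real data, and most_common is a hypothetical helper name.

```r
library(ggplot2)  # stat_summary_hex() also needs the hexbin package installed

# Made-up stand-in for the real `player` data (assumption: same column names)
set.seed(1)
player <- data.frame(
  X2   = runif(3000, 0, 100),
  Y2   = runif(3000, 0, 100),
  Name = sample(c("Adler", "Diekmeier", "Other"), 3000, replace = TRUE)
)

# Hypothetical helper: return the most frequent value in z as a string
most_common <- function(z) {
  tab <- table(z)
  names(tab)[which.max(tab)]
}

# group = 1 stops ggplot2 from splitting the data by Name before binning,
# so each hexagon is summarised over all players at once
p <- ggplot(player, aes(x = X2, y = Y2, z = Name, group = 1)) +
  stat_summary_hex(fun = most_common, bins = 8)
print(p)
```

Because the summary returns a name rather than a number, the fill should come out as a discrete scale with one color per player. Ties within a bin go to whichever name table() lists first, which may or may not be acceptable for the real data.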