
Hi everyone! I am new to R and would like to create a heatmap. I have a data set with these columns:

  • X: x coordinate
  • Y: y coordinate
  • Pet_type: type of pet (cat, dog, hamster, etc)
  • Owner_type: type of owner (adult male, adult female, kid)

Small dataset:

Owner   Pet         X        Y
Male    Dog         27.793   88.2128
Male    Hamster     37.7177  87.9776
Female  Cat         24.4547  87.3016
Kid     Cat         36.464   84.9169
Kid     Dog         29.4175  84.5433
Female  Lizard      37.9588  83.9029
Male    Guinea pig  44.8986  82.7822
Kid     Dog         26.6216  82.0757
Male    Hamster     46.2332  81.9817
Male    Cat         31.9716  81.7507
Female  Cat         22.8606  80.9761
Kid     Dog         29.744   80.7988
Kid     Lizard      32.2393  80.35
Female  Guinea pig  38.92    78.8604
Male    Dog         39.42    78.3604
Kid     Hamster     32.2632  87.8267

What would be the steps to create a heatmap that shows the ratio of one specific pet to all pets in each bin? For example, for a heatmap of cat density: if a bin contains 20 pets and 10 of them are cats, the bin's value is 0.5 (50%), and so on.

I am using ggplot, and so far I can show the total count of pets in each bin. What manipulations should I apply to the table before feeding it to ggplot?

df %>% 
  ggplot(aes(X, Y))+
  geom_bin_2d(bins=15)

This is how far I got

I am struggling to work out how to express that I want the ratio of cats to all pets in every bin.

I would really appreciate it if someone could help me with this problem (probably an easy one).

divibisan
matissb
  • Welcome to SO! Please consider posting a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – medium-dimensional Nov 09 '22 at 13:40
  • In addition to trying to provide a reproducible example, I think that as a general rule the best way to start is to first calculate the values you want to plot (the ratio of pets in bins), get the results into tidy format, and only then worry about plotting. – yoland Nov 09 '22 at 13:56
  • @yoland I added a small dataset sample. What steps would be needed before ggplot? – matissb Nov 09 '22 at 17:29

1 Answer

There are a number of ways to do 2-D binning. One option is to get {ggplot2} to make the bins for you, then normalize each count to the bin's total and re-plot. Here you first build the plot using the raw counts, then pull out the calculated bins with ggplot2::ggplot_build() and apply a standard group_by() %>% mutate(fract = x/sum(x)). Then you can re-plot the fractions.

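library(tidyverse)

# simulated data: n pets at random coordinates
n <- 5000

d <- tibble(x = rnorm(n),
            y = rnorm(n),
            # fct_infreq() orders the levels by frequency, which sets the facet order
            pet = fct_infreq(sample(
              c("cat", "dog", "fish", "bird"), n, TRUE, prob = c(4, 3, 2, 1)
            )))
p <- d %>%
  ggplot(aes(x, y)) +
  geom_bin_2d(aes(fill = after_stat(count))) +
  facet_wrap( ~ pet)

# original plot of count per animal per bin
p

# get underlying data (one row per non-empty bin per panel)
e <- ggplot_build(p)$data[[1]]

# normalize and then re-plot
e %>% 
  # PANEL numbers follow the level order of pet (assumed here: cat, dog, fish, bird)
  mutate(pet = fct_recode(PANEL, cat = "1", dog = "2", fish = "3", bird = "4")) %>% 
  # within each bin, divide each pet's count by the total over all pets
  group_by(xbin, ybin) %>% 
  mutate(fract = count/sum(count)) %>% 
  ungroup() %>% 
  # use the bin centres (x, y) so the tiles line up with the original bins
  ggplot(aes(x, y)) +
  geom_tile(aes(fill = fract)) +
  facet_wrap(~pet)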

Created on 2022-11-09 with reprex v2.0.2
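
For the cats-versus-all-pets ratio in the question specifically, a shorter route may be to let ggplot2 average a 0/1 indicator per bin with stat_summary_2d(): the mean of that indicator is the fraction of cats in the bin. This is only a sketch and not part of the reprex above. The data frame df is typed in from the question's sample rows (the Owner column is left out because it is not needed), p_cat is just an illustrative name, and bins = 15 is carried over from the question's own attempt.

```r
library(ggplot2)

# Sample rows from the question; column names follow its "Small dataset" header
df <- data.frame(
  Pet = c("Dog", "Hamster", "Cat", "Cat", "Dog", "Lizard", "Guinea pig", "Dog",
          "Hamster", "Cat", "Cat", "Dog", "Lizard", "Guinea pig", "Dog", "Hamster"),
  X = c(27.793, 37.7177, 24.4547, 36.464, 29.4175, 37.9588, 44.8986, 26.6216,
        46.2332, 31.9716, 22.8606, 29.744, 32.2393, 38.92, 39.42, 32.2632),
  Y = c(88.2128, 87.9776, 87.3016, 84.9169, 84.5433, 83.9029, 82.7822, 82.0757,
        81.9817, 81.7507, 80.9761, 80.7988, 80.35, 78.8604, 78.3604, 87.8267)
)

# z is 1 for a cat and 0 for any other pet, so its mean within a bin
# is the share of cats among all pets in that bin
p_cat <- ggplot(df, aes(X, Y, z = as.numeric(Pet == "Cat"))) +
  stat_summary_2d(fun = mean, bins = 15) +
  labs(fill = "Cat share")

p_cat
```

Bins that contain no pets at all are left blank rather than drawn as 0. With only 16 rows and 15 x 15 bins most tiles will hold a single pet and show 0 or 1, so the ratio only becomes informative on the full data set or with fewer bins.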

Dan Adams