
I am new to R and came across a dataset in which I want to compare two categorical variables using a heat map. Each variable has so many distinct values that the tiles and axis labels are too crowded for me to read anything from the plot. In the code below I am trying to plot a heat map of occupation against nature of injury.

Both the occupation column and the nature of injury column have more than 30 distinct values. How should I handle variables like these in order to plot a readable heat map?

For example:

OCCUPATION <- c("Warehouse Bagger", "Maintanence Man", "Utility Man", "Errand Boy")  # 20 more

NATURE_INJURY <- c("Cut", "Sprain", "Bruise", "Burn", "Damaged Lung")  # 30 more

My code:

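library(dplyr)
library(ggplot2)  # needed for ggplot() and geom_tile()

# One row per (OCCUPATION, NATURE_INJURY) pair, with the pair's frequency in column n
counting <- count(new_data4, OCCUPATION, NATURE_INJURY)

# One tile per pair, filled by its count
ggplot(data = counting, mapping = aes(x = OCCUPATION, y = NATURE_INJURY)) +
    geom_tile(mapping = aes(fill = n))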
  • Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including a small representative dataset in a plain text format - for example the output from `dput(new_data4)`, if that is not too large. – neilfws Sep 01 '22 at 01:03
  • @neilfws it is huge not being able to copy – HU EW Sep 01 '22 at 01:24
  • Something smaller and representative then. It is difficult to help without seeing some data. – neilfws Sep 01 '22 at 01:26
  • here I added a sample concatenation about what the columns look like – HU EW Sep 01 '22 at 01:51
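
One possible way to tame the axes, offered as a rough sketch rather than a definitive answer: keep only the most frequent levels of each variable, lump everything else into an "Other" level with `forcats::fct_lump_n()`, and rotate the x-axis labels. Since the real `new_data4` was not posted, the data frame below is a made-up stand-in with hypothetical labels ("Occupation 1", "Injury 1", ...), and the cutoff of 10 levels is an arbitrary assumption that would need tuning on the actual data.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Hypothetical stand-in for new_data4 (the real data was not posted):
# random draws from made-up labels, skewed so that some levels are rare.
set.seed(1)
occ_levels <- paste("Occupation", 1:30)
inj_levels <- paste("Injury", 1:35)
new_data4 <- data.frame(
  OCCUPATION    = sample(occ_levels, 2000, replace = TRUE, prob = (30:1)^2),
  NATURE_INJURY = sample(inj_levels, 2000, replace = TRUE, prob = (35:1)^2)
)

# Keep the most frequent levels of each variable (n = 10 is an arbitrary
# choice) and lump the remaining levels into "Other", then count the pairs.
counting <- new_data4 %>%
  mutate(
    OCCUPATION    = fct_lump_n(OCCUPATION, n = 10, other_level = "Other"),
    NATURE_INJURY = fct_lump_n(NATURE_INJURY, n = 10, other_level = "Other")
  ) %>%
  count(OCCUPATION, NATURE_INJURY)

# Same heat map as in the question, with slanted x-axis labels
# so that long occupation names do not overlap.
p <- ggplot(counting, aes(x = OCCUPATION, y = NATURE_INJURY, fill = n)) +
  geom_tile() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

p
```

The trade-off is that rare categories lose their identity inside "Other", so this only suits cases where the infrequent occupations and injuries are not the ones of interest. If they are, grouping the labels by hand into broader classes may be a better fit.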
