I am new to R and came across a dataset in which I want to compare two categorical variables using a heat map. Both variables have so many distinct values that the plot and its axes are too cluttered for me to draw any conclusions from it. In the code below I am trying to plot a heat map of occupation against nature of injury.
Both the occupation column and the nature of injury column have more than 30 distinct values. How should I deal with variables like these in order to plot a readable heat map?
For example:
OCCUPATION <- c("Warehouse Bagger", "Maintanence Man", "Utility Man", "Errand Boy")  # 20 more
NATURE_INJURY <- c("Cut", "Sprain", "Bruise", "Burn", "Damaged Lung")  # 30 more
My code:
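library(dplyr)
library(ggplot2)  # needed for ggplot() and geom_tile()

# Count how often each OCCUPATION / NATURE_INJURY pair occurs
counting <- count(new_data4, OCCUPATION, NATURE_INJURY)

# Draw one tile per pair, filled by its count
ggplot(data = counting, mapping = aes(x = OCCUPATION, y = NATURE_INJURY)) +
  geom_tile(mapping = aes(fill = n))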
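
One idea I have been considering, as a rough sketch only and not something I have checked against the real data, is to keep the most frequent values of each variable and lump the rest into "Other" before counting. The data frame below is a made-up stand-in for new_data4, lump_top_n is a hypothetical helper I wrote for illustration, and the cut-offs (2 and 3) are arbitrary assumptions.

```r
library(dplyr)
library(ggplot2)

# Made-up toy data standing in for new_data4 (illustration only)
set.seed(1)
new_data4 <- data.frame(
  OCCUPATION = sample(
    c("Warehouse Bagger", "Maintanence Man", "Utility Man", "Errand Boy"),
    200, replace = TRUE, prob = c(0.5, 0.3, 0.15, 0.05)
  ),
  NATURE_INJURY = sample(
    c("Cut", "Sprain", "Bruise", "Burn", "Damaged Lung"),
    200, replace = TRUE, prob = c(0.4, 0.3, 0.15, 0.1, 0.05)
  )
)

# Hypothetical helper: keep the n most frequent values, relabel the rest "Other"
lump_top_n <- function(x, n) {
  x <- as.character(x)
  top <- head(names(sort(table(x), decreasing = TRUE)), n)
  ifelse(x %in% top, x, "Other")
}

# Lump rare levels in both columns (cut-offs are arbitrary assumptions)
lumped <- new_data4 %>%
  mutate(
    OCCUPATION = lump_top_n(OCCUPATION, 2),
    NATURE_INJURY = lump_top_n(NATURE_INJURY, 3)
  )

# Count the remaining pairs and plot them, rotating the x labels for legibility
counting_lumped <- count(lumped, OCCUPATION, NATURE_INJURY)

p <- ggplot(counting_lumped, aes(x = OCCUPATION, y = NATURE_INJURY)) +
  geom_tile(aes(fill = n)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p)
```

With the real data the cut-offs would presumably be larger (say the top 10 of each), but I am not sure whether lumping like this is a sensible way to handle it or whether it hides too much.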