I have a data frame called melted_cormat that I am using to plot a heatmap in ggplot. I also have a vector, new_names = c("a","b","c"), of names that I would like to show in bold on the heatmap. I managed to plot the heatmap, but I have not been able to make only those names bold.

Because the names in new_names appear more than once in the Var1 and Var2 columns I am plotting, I used new_names to create two additional columns in melted_cormat, "names_to_consider" and "names_to_color", to mark the names that I want to bold or color. This is not working: the best I managed was to make everything bold, or to bold one line and skip the next, but not only the three names that I want (e.g. this heatmap2, where a, b and c appear in bold).

My code is below. I have looked at other pages on marking specific names in bold (e.g. Mark certain axis text in bold); the only difference here is that the names appear in a matrix. Thank you in advance for all your help.

new_names = c("a","b","c")

#I tried to narrow down the axis labels to consider, and defined them as those points in the matrix where the Var1 label is the same as the Var2 label

melted_cormat = melted_cormat %>% 
  mutate(names_to_consider=case_when(Var1==Var2 ~ "consider", Var1!=Var2 ~ "no"))

#among those that I narrowed down, I then used my list of names to create a column "names_to_color", and I managed to enter yes on the three names that I wanted. 

#among consider, change the new names to 'yes', and everything else to 'no'

melted_cormat = melted_cormat %>% 
  mutate(names_to_color=case_when(names_to_consider == "consider" & Var1 %in% new_names ~ "yes", 
                                  names_to_consider == "consider" & !(Var1 %in% new_names) ~ "no",
                                  names_to_consider == "no" ~ "no"))

melted_cormat$names_to_color = as.factor(melted_cormat$names_to_color)
#vec_fontface <- ifelse(levels(melted_cormat$names_to_color[melted_cormat$Var2])=="yes","bold","plain")

#plot
ggplot(data = melted_cormat, aes(x=Var1, y=Var2, fill = value))+
  geom_tile(color = "white")+
  geom_text(data=subset(melted_cormat,value < 200), aes(label = value), size=2) +
  scale_fill_gradient2(low = "white", high = "violet", limit = c(0,maximum), space = "Lab", name="key") +
  theme_classic()+ 
  theme(plot.title = element_text(color="black", size=10, face="bold.italic"),axis.title.x=element_blank(), axis.title.y=element_blank(), axis.text.x = element_text(angle = 45, vjust = 1, size = 8, hjust = 1)) +
  coord_fixed() + 
  ggtitle("myplot") +
  scale_y_discrete(labels=c("a"=expression(bold(`a`)), "b"=expression(bold(`b`)),
                            "c"=expression(bold(`c`))))

Macdonald
  • I think this will be helpful: https://stackoverflow.com/a/39694603/4029270 – Dan Slone Dec 14 '20 at 21:49
  • Within a `dplyr` pipe on `melted_cormat`, you should never reference `melted_cormat$` again unless you are trying to reset, lose, or somehow otherwise defeat the purpose of the pipe. Just use the variable names, as in `case_when(names_to_consider ...)`. – r2evans Dec 15 '20 at 02:25
  • Thanks @DanSlone the solution you suggested works and I managed to make the labels bold. However this works great if you have one dataset and you put the names to bold directly in the code. I wanted to refer to a file (new_names) as this file will change from time to time and I won't have to go back into the code and change the names to bold. – Macdonald Dec 15 '20 at 14:47
  • Thanks @r2evans, I won't be using `melted_cormat$` again in the dplyr pipe. – Macdonald Dec 15 '20 at 14:48
  • That doesn't change the fact that using `melted_cormat` within the pipe (like this) is typically wrong. Other cases where it will fail: if you have `filter` anywhere earlier in the pipe; if you do any grouping; if you change any of the variables you are referencing using `melted_cormat$`; if the order of the data changes, whether directly with `arrange` or otherwise. If *any* of those are true, then best-case you will get an error, and worst-case (e.g., same number of rows but in a different order) you will have incorrect results and won't even suspect it. – r2evans Dec 15 '20 at 14:52
  • @Macdonald Sure, you can always make the element_text() refer to a vector that you create based on the data. Then the ggplot call can stay the same while your data changes. Ideally it would be a column in the data frame, created from the data condition of interest using case_when() or something like that. – Dan Slone Dec 15 '20 at 16:39
  • hi @DanSlone thank you for the response. I did not understand how to fix that in the element_text(). I have modified the original code on the post with the changes we discussed – Macdonald Dec 15 '20 at 18:13
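
The suggestion above, making element_text() refer to a vector built from the data, could look roughly like the sketch below. This is an illustration and not an answer from the thread. The small melted_cormat built here is a hypothetical stand-in for the real data, and it assumes Var1 and Var2 are factors with the same levels, so one vector serves both axes. Passing a vector to element_text() is not officially supported by ggplot2 and may produce a warning, so its behaviour may depend on the installed version.

```r
library(ggplot2)

# Hypothetical stand-in for the real melted_cormat (assumption: Var1 and Var2
# are factors that share the same levels, as with a melted square matrix).
vars <- c("a", "b", "c", "d", "e")
melted_cormat <- expand.grid(Var1 = vars, Var2 = vars, stringsAsFactors = TRUE)
melted_cormat$value <- seq_len(nrow(melted_cormat))

# Names to bold; in practice this vector could be read from a file.
new_names <- c("a", "b", "c")

# One font face per axis level, in the same order as the axis levels.
axis_levels <- levels(melted_cormat$Var2)
vec_fontface <- ifelse(axis_levels %in% new_names, "bold", "plain")

p <- ggplot(melted_cormat, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile(color = "white") +
  theme_classic() +
  theme(
    # Vectorized input to element_text() is unofficial and may warn.
    axis.text.y = element_text(face = vec_fontface),
    axis.text.x = element_text(face = vec_fontface, angle = 45,
                               vjust = 1, hjust = 1)
  ) +
  coord_fixed()
```

Because vec_fontface is derived from the axis levels and new_names, the ggplot call itself would not need to change when new_names changes. If Var1 and Var2 are character columns rather than factors, the axis order would be sort(unique(...)) instead of levels(...).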

0 Answers