
I have a big data frame with a p-values column that looks like the snippet below:

myPvalues<-data.frame(pvalues=c(0.00431279265850473,NA,0.00067818958352233,NA,NA,NA,0.00826354450511943,0.00605467431746949,0.00518801869607421,0.00896893103806155))

I would like to map those values to colors, using a color gradient that is either predefined or that I create myself. I understand how to make a plot that colors my data properly, but I don't know how to add a column to my data frame that stores the color values, so that the data frame also has a column like

myPvalues$Colors<-c("#75F4A1","#FFFFFFFF","#547CB8","#FFFFFFFF","#FFFFFFFF","#FFFFFFFF","#F9A13A","#D6F667","#92FC79","#F58046")

where white ("#FFFFFFFF") is for NA values and the other values are color codes (here I put arbitrary codes, but they would need to come from a color gradient with extremes defined by min(myPvalues$pvalues) and max(myPvalues$pvalues)).

I have been looking around but found no solution that quite addresses this problem. Would anybody give me a pointer?

M--
lucabo
  • Does this solve your problem: https://stackoverflow.com/questions/13353213/gradient-of-n-colors-ranging-from-color-1-and-color-2 – Harrison Jones Jun 10 '22 at 14:32

1 Answer


First define your gradient colors. I picked red and green at random.

library(tidyverse)

colfunc <- colorRampPalette(c('red', 'green'))
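As a quick illustration of what colfunc gives you (a minimal sketch, not part of the original answer): colorRampPalette() returns a function, and calling that function with an integer n yields n hex color codes running from the first color to the last. The variable name three is only for illustration.

```r
# colorRampPalette() is in base R (grDevices), so no extra packages are needed.
colfunc <- colorRampPalette(c("red", "green"))

# Ask for 3 colors: the first is pure red, the last is pure green,
# and the one in between is interpolated.
three <- colfunc(3)
print(three)
```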

Then calculate where each p-value ranks. Note that the function returned by colorRampPalette() takes an integer number of colors, so colors are assigned by rank rather than in proportion to the value, and the color difference between two p-values won't always reflect their numeric difference.

pvalue_forColors <- myPvalues %>%
  drop_na(pvalues) %>% 
  arrange(pvalues) %>%
  mutate(rank = row_number())

myPvalues <- left_join(myPvalues, pvalue_forColors, by = 'pvalues')

Generate your gradient

n_ranks <- max(myPvalues$rank, na.rm = TRUE)
the_colors <- tibble(rank = seq_len(n_ranks), color_code = colfunc(n_ranks))

The final column, color_code, holds the gradient colors, with white for missing values.

myPvalues %>%
  left_join(the_colors, by = 'rank') %>%
  mutate(color_code = if_else(is.na(pvalues), '#FFFFFFFF', color_code))
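If the colors should follow the numeric value rather than the rank, one possible alternative is base R's colorRamp(), which maps numbers in [0, 1] to RGB values. The following is a sketch under assumptions, not the method used above: it uses a shortened, rounded copy of the question's data, the names ramp, rng, scaled and ok are illustrative, and it assumes at least two distinct non-missing p-values (otherwise the rescaling divides by zero).

```r
# Shortened, rounded sample of the question's data (assumption for illustration).
myPvalues <- data.frame(pvalues = c(0.0043, NA, 0.00068, NA, 0.0083, 0.0061))

# colorRamp() returns a function mapping values in [0, 1] to rows of RGB values.
ramp <- colorRamp(c("red", "green"))

# Rescale so that the smallest p-value maps to 0 and the largest to 1.
rng <- range(myPvalues$pvalues, na.rm = TRUE)
scaled <- (myPvalues$pvalues - rng[1]) / (rng[2] - rng[1])

# Start with white everywhere, then overwrite the rows that have a p-value.
myPvalues$Colors <- "#FFFFFFFF"
ok <- !is.na(scaled)
myPvalues$Colors[ok] <- rgb(ramp(scaled[ok]), maxColorValue = 255)

print(myPvalues)
```

With this approach, two p-values that are close together get similar colors, whereas the rank-based version spaces the colors evenly regardless of how far apart the values are.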
Harrison Jones
  • Harrison: thanks a lot, this does what I want; I could adapt it to my case (I use it to set the background color of several plots to be composed by ggarrange). Perhaps in the last code snippet, instead of 'myPvalues%>%' you meant 'myPvalues<-myPvalues%>%'. Many thanks – lucabo Jun 11 '22 at 14:38
  • using `myPvalues <- myPvalues %>%` as opposed to just `myPvalues %>%` definitely works to assign it back to the same object name. I just didn't bother assigning so you could see the output. If the solution works for you, feel free to Accept the answer. – Harrison Jones Jun 13 '22 at 12:28