
I have a character vector containing genes and their associated colors:

gene_colors<-c("protein_coding"="#1F78B4", "lncRNA"="#de08a0") 

I'm trying to go through another list of genes and add the gene with a random color if it's not already in the vector:

library(tidyverse)
library(randomcoloR)

for(gene in other_genes){ 
  if(!(gene %in% names(gene_colors))){
    temp<-paste0(gene, '=', randomColor(1))
  }
}

This is what's in other_genes:

 [1] "IG_C_gene"                          "IG_C_pseudogene"                   
 [3] "IG_J_gene"                          "IG_V_gene"                         
 [5] "IG_V_pseudogene"                    "lncRNA"                            
 [7] "miRNA"                              "misc_RNA"                          
 [9] "Mt_rRNA"                            "polymorphic_pseudogene"            
[11] "processed_pseudogene"               "protein_coding"                    

As you can see, I tried paste0(), and previously str_c(), but both give me a single string like "IG_C_gene=#ffd4bf". I want to use the gene_colors vector in a heatmap function, so I need the equals sign to be separate (i.e., not inside the quotes, as it would be if it were a character in a string), like the entries in gene_colors. Is there any way to do this?

mfeldbauer
  • what are `other_genes` – akrun Jun 13 '22 at 16:28
  • @akrun the `other_genes` vector contains other gene names. I listed the genes it contains below my code snippet – mfeldbauer Jun 13 '22 at 16:30
  • @user438383 definitely! I don't think I'm trying to merge things though...I'm trying to give the genes that aren't already in my colors vector a random color. Is there another way to go through them and assign a color if they're not present? – mfeldbauer Jun 13 '22 at 16:32

3 Answers


We may use ifelse instead of a loop

ifelse(!(other_genes %in% names(gene_colors)),
       paste0('"', other_genes, '"', '="', randomColor(length(other_genes)), '"'),
       other_genes)

Or just by assignment after creating a logical vector

i1 <- !(other_genes %in% names(gene_colors))
other_genes[i1] <- paste0('"', other_genes[i1], '"="', randomColor(sum(i1)), '"')

Or with sprintf

other_genes[i1] <- sprintf('"%s"="%s"', other_genes[i1], randomColor(sum(i1)))
akrun
  • Both of those still give me something like `"IG_C_gene=#ef5c07"` and I'm not sure that that will be recognized by the heatmap function. I was hoping to get entries in the vector that looked like this `"IG_C_gene"="#ef5c07"` – mfeldbauer Jun 13 '22 at 16:40
  • @mfeldbauer try adding the quotes as in the update – akrun Jun 13 '22 at 16:42
  • Thanks! Yes, that does look like what I was asking for but unfortunately that formatting doesn't work with the function I'm using – mfeldbauer Jun 13 '22 at 17:01
  • @mfeldbauer please consider providing a small reproducible example so that we can test it. Thanks – akrun Jun 13 '22 at 17:02

I realized that the function I'm trying to use requires a named vector. Thanks to the accepted answer here, I found a solution that works: add each color to the gene_colors vector, using the gene name as the element's name:

gene_colors<-c("protein_coding"="#1F78B4", "lncRNA"="#de08a0")

for(gene in other_genes){ 
  if(!(gene %in% names(gene_colors))){
    gene_colors[gene]<-randomColor(1)
  }
}
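
As a rough illustration of why pasting could not work, the sketch below contrasts a pasted string with a named element. It uses base R only; the gene "miRNA" and the color "#ef5c07" are taken from the thread purely as example values.

```r
# Inside c(), "=" is syntax that attaches a name; it is not part of the value.
gene_colors <- c("protein_coding" = "#1F78B4", "lncRNA" = "#de08a0")

# Pasting builds one unnamed string instead of a name/value pair.
pasted <- paste0("miRNA", "=", "#ef5c07")
names(pasted)  # NULL: no name was attached

# Assigning by name adds a genuinely named element to the vector.
gene_colors["miRNA"] <- "#ef5c07"
names(gene_colors)

# setNames() builds the same kind of pair without an existing vector.
pair <- setNames("#ef5c07", "miRNA")
pair
```

Functions that expect a named vector generally look values up through names(), which is presumably why the pasted strings were not recognized.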
mfeldbauer

This can be solved as shown below:

index <-  other_genes[!other_genes %in% names(gene_colors)]
gene_colors[index] <- randomColor(length(index))
gene_colors
        protein_coding                 lncRNA              IG_C_gene        IG_C_pseudogene
             "#1F78B4"              "#de08a0"              "#adc3ea"              "#6962c1"
             IG_J_gene              IG_V_gene        IG_V_pseudogene                  miRNA
             "#f2ab96"              "#86a3e8"              "#2fe07b"              "#b6f5f9"
              misc_RNA                Mt_rRNA polymorphic_pseudogene   processed_pseudogene
             "#215b82"              "#356ca3"              "#8098ce"              "#44c942"

Data:

other_genes <- c("IG_C_gene", "IG_C_pseudogene", "IG_J_gene", "IG_V_gene", "IG_V_pseudogene", 
"lncRNA", "miRNA", "misc_RNA", "Mt_rRNA", "polymorphic_pseudogene", 
"processed_pseudogene", "protein_coding")
Onyambu
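
For anyone who wants to try the vectorized approach without installing randomcoloR, here is a minimal sketch under a few assumptions: `random_hex` is a hypothetical base-R stand-in for `randomColor()`, the `other_genes` vector is shortened for brevity, and `set.seed()` is added only so the colors repeat between runs.

```r
# Hypothetical helper (not part of randomcoloR): n random hex colors via base R.
random_hex <- function(n) {
  rgb(runif(n), runif(n), runif(n))
}

gene_colors <- c("protein_coding" = "#1F78B4", "lncRNA" = "#de08a0")
other_genes <- c("IG_C_gene", "lncRNA", "miRNA", "protein_coding")  # shortened

set.seed(1)  # assumption: fixed seed so the random colors are repeatable

# setdiff() keeps only the genes without a color yet (and drops duplicates).
missing_genes <- setdiff(other_genes, names(gene_colors))
gene_colors[missing_genes] <- random_hex(length(missing_genes))

# Every gene should now have a color; the two original entries are untouched.
all(other_genes %in% names(gene_colors))
gene_colors
```

Using setdiff() instead of a logical index also guards against duplicated gene names in `other_genes`, which would otherwise be assigned a color more than once.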