Substitute list of strings with second list, according to first two numeric digits in each

Question

I have a named list of strings:

str(nr.genes)
 Named chr [1:37] "0" "0" "Pipas_c034_0013" "Pipas_chr1-4_0040" ...
 - attr(*, "names")= chr [1:37] "Nr11" "Nr12" "Nr131" "Nr132" ...

I want to replace the names with those in the following object, but not one-to-one:

str(nr.genes.names)
 chr [1:16] "up.p33-dw.p33" "up.p33-dw.p38" "up.p33-dw.p52" ...

I would like to have something like this:

 str(nr.genes)
 Named chr [1:37] "0" "0" "Pipas_c034_0013" "Pipas_chr1-4_0040" ...
 - attr(*, "names")= chr [1:37] "up.p33-dw.p33" "up.p33-dw.p38" "up.p33-dw.p52" "up.p33-dw.p52" ...

where I have manually changed:

"Nr11"<-"up.p33-dw.p33"
"Nr12"<-"up.p33-dw.p38"
"Nr131"<-"up.p33-dw.p52"
"Nr132"<-"up.p33-dw.p52" #Note that the third digit doesn't matter
       .
       . 
       .

I need to replace each string's name, depending on its first two digits, with the corresponding element of the second list. I tried lapply without success. I also tried writing a function that uses sub with a replacement, but I couldn't see any change. Here is the code:

lfun <- function(x,y) {
  for(o in 1:16){ # Number of names to substitute
    for (p in 1:4){ 
      for (q in 1:4){
       x[sub(paste("Nr", p, q, sep=""), x[o], replacement=y[o])]
      }
    }
  }
}

With this function I try to take each name of the first argument and substitute it with the corresponding element of the second. But it returns the same thing I had before, without any warning or error message. I'm sure there must be an easy way to do it, but I can't find the correct one.

Edit: nr.genes can be created as follows:

nr.genes <- c("0", "0", "Pipas_c034_0013", "Pipas_chr1-4_0040")
attr(nr.genes, "names") <-c("Nr11", "Nr12", "Nr131", "Nr132")

and nr.genes.names by:

nr.genes.names <- c("up.p33-dw.p33", "up.p33-dw.p38", "up.p33-dw.p52")

About the matching: I need Nr11 <-"up.p33-dw.p33", but Nr21<-"<nr.genes.names[5]>", then Nr22<-"<nr.genes.names[6]>", then Nr23<-"<nr.genes.names[7]>", then Nr24<-"", then Nr31 <-"<nr.genes.names[9]>"

Hi and welcome to stackoverflow! You are much more likely to receive a helpful answer if you provide a [minimal, reproducible data set](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). Can you also please clarify the 'rules' for matching `nr.genes` with `nr.genes.names`. Thanks! — Henrik, Oct 25 '13 at 10:24
You haven't answered Henrik's comment here! Your matching rule is not clear. Nr11 <-"up.p33-dw.p33", but Nr21<-"" ? I don't see any relation here! — agstudy, Oct 25 '13 at 11:55
These numbers come from a comparison between two sets of objects, each with 4 data (that's why the Nr goes 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34, 41, 42, 43, 44), which were compared against each other. To do this comparison I ran a loop storing each result in an object named Nr++, but getting nr.genes with `unlist(mget(nr.list))` (nr.list is where I stored all the Nr objects) splits the results and adds this third number to the names attribute. So I want to make clear which comparison each gene comes from. — llrs, Oct 25 '13 at 12:35
It's clearest to refer to "named list of string", not "object of the following structure" and "attributes". Then "replace strings from 1st list with 2nd list, according to their first two numeric digits" — smci, Apr 28 '18 at 01:46

agstudy · Answer 1 · 2013-10-25T12:11:21.867

Using gsub, for example, you can extract the second digit from the names of nr.genes and use it as an index into nr.genes.names.

new <- c("up.p33-dw.p33","up.p33-dw.p38", "up.p33-dw.p52")
old <- c("Nr11" ,"Nr12", "Nr131", "Nr132" )

# Capture the second digit after "Nr" and use it to index into new
old <- new[as.integer(gsub('^Nr[0-9]([0-9]).*','\\1',old))]
old

[1] "up.p33-dw.p33" "up.p33-dw.p38" "up.p33-dw.p52" "up.p33-dw.p52"
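
The one-liner above indexes by the second digit only, which is enough for Nr11 to Nr14. Going by the rule given in the question's edit (Nr21 maps to element 5, Nr31 to element 9), a sketch that uses both digits could look like the following. It assumes each digit runs from 1 to 4 and that the 16 replacement names are ordered Nr11, Nr12, ..., Nr44; the `name1` ... `name16` values are placeholders, not the real names.

```r
# Assumption: names look like "Nr<p><q>[optional third digit]" with p, q in 1..4
nr.genes <- c(Nr11 = "0", Nr12 = "0", Nr131 = "Pipas_c034_0013",
              Nr132 = "Pipas_chr1-4_0040", Nr21 = "0", Nr311 = "0")
nr.genes.names <- paste0("name", 1:16)  # placeholder for the 16 real names

old <- names(nr.genes)
p <- as.integer(substr(old, 3, 3))  # first digit after "Nr"
q <- as.integer(substr(old, 4, 4))  # second digit after "Nr"

# Position in the 16-element vector; any third digit is ignored
idx <- (p - 1) * 4 + q
names(nr.genes) <- nr.genes.names[idx]
names(nr.genes)
# [1] "name1" "name2" "name3" "name3" "name5" "name9"
```

Only the names change; the values of nr.genes are left untouched.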

EDIT

Your matching rule is not clear; maybe you don't have a rule, just 16 items to match. In that case you can use merge to match genes with their names.

For example, you can create a data.frame that lists the exact match for each gene, like the dat.match data.frame below. Here I use letters[1:16] to stand in for the 16 nr.genes.names.

dd <- expand.grid(1:4,1:4)
dat.match <- data.frame(nr = paste0('Nr',mapply(paste0,dd[,1],dd[,2])),
                       gene =letters[seq_len(16)])

     nr gene
1  Nr11    a
2  Nr21    b
3  Nr31    c
4  Nr41    d
5  Nr12    e
6  Nr22    f
7  Nr32    g
8  Nr42    h
9  Nr13    i
10 Nr23    j
11 Nr33    k
12 Nr43    l
13 Nr14    m
14 Nr24    n
15 Nr34    o
16 Nr44    p

Then you can use merge like this:

nr.genes <-c("Nr11", "Nr12", "Nr131", "Nr132")
genes <- data.frame(nr=gsub('(^Nr[0-9]{2}).*','\\1',nr.genes))
merge(genes,dat.match)

    nr gene
1 Nr11    a
2 Nr12    e
3 Nr13    i
4 Nr13    i
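
Note that merge sorts its result by the key column by default, so the rows may not come back in the order of the original vector. A possible alternative, sketched here under the same assumptions as dat.match above (letters are placeholders for the real names), is to look up each two-digit prefix with match and assign the result straight to the names. The lookup table keeps its 16 rows, while the vector being renamed can have any length.

```r
# Hypothetical lookup table: one row per two-digit prefix.
# letters[1:16] stands in for the 16 real names.
dd <- expand.grid(1:4, 1:4)
dat.match <- data.frame(nr = paste0("Nr", dd[, 1], dd[, 2]),
                        gene = letters[1:16],
                        stringsAsFactors = FALSE)

# Deliberately unsorted, to show that the original order is kept
nr.genes <- c(Nr21 = "0", Nr11 = "0", Nr131 = "Pipas_c034_0013",
              Nr132 = "Pipas_chr1-4_0040")

# Keep only "Nr" plus the first two digits, dropping any third digit
prefix <- sub("^(Nr[0-9]{2}).*", "\\1", names(nr.genes))

# match() returns one row index per element, in the original order
names(nr.genes) <- dat.match$gene[match(prefix, dat.match$nr)]
names(nr.genes)
# [1] "b" "a" "i" "i"
```

If a prefix is missing from the table, match returns NA for it, which makes unmatched names easy to spot.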

It works well, but only for the first 8, not for all 16 names I must substitute in the length-37 nr.genes. But thanks, I will try to find a way with `gsub`. — llrs, Oct 25 '13 at 11:14
I tried it this way, but the grid in my case is larger and has no pattern. Here it is: `attr(nr.genes, "names") [1] "Nr11" "Nr12" "Nr131" "Nr132" "Nr133" "Nr134" "Nr141" "Nr142" "Nr21" [10] "Nr22" "Nr231" "Nr232" "Nr233" "Nr234" "Nr235" "Nr236" "Nr237" "Nr24" [19] "Nr311" "Nr312" "Nr321" "Nr322" "Nr323" "Nr33" "Nr341" "Nr342" "Nr41" [28] "Nr421" "Nr422" "Nr431" "Nr432" "Nr433" "Nr434" "Nr435" "Nr436" "Nr437" [37] "Nr44"` — llrs, Oct 25 '13 at 13:22
When you create dd with expand.grid it has 16 rows, but mine has 37, with the names above. — llrs, Oct 28 '13 at 09:50
