0

I have a pretty straightforward question. Sorry if this has already been asked somewhere, but I could not find the answer. I want to check whether gene names start with a number and, if they do, add 'aaaa_' to the front of the gene name. To do that, I used the following code:

geneName <- "2310067B10Rik"
if (is.numeric(substring(geneName, 1, 1))) {
  geneName <<- paste("aaaa_", geneName, sep="")
}

What I want to get back is aaaa_2310067B10Rik. However, is.numeric returns FALSE, because substring gives "2" as a character string. I've also tried noquote(), but that didn't work, and wrapping the substring in as.numeric(), but then the if code is also applied to genes that don't start with a number. Any suggestions? Thanks!
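
To illustrate why both attempts behave this way, here is a minimal sketch. The variable names `first_char` and `converted` are only illustrative and not part of the original code. It shows that `is.numeric()` tests the type of the object rather than its content, and that `as.numeric()` always returns a numeric vector (an `NA` for non-digits), so the check passes for every gene name.

```r
# The first character is extracted as a character string, not a number
first_char <- substring("2310067B10Rik", 1, 1)
class(first_char)       # "character"
is.numeric(first_char)  # FALSE: is.numeric() checks the type, not the content

# as.numeric() always yields a numeric vector; a non-digit becomes NA
# (normally with a coercion warning, suppressed here)
converted <- suppressWarnings(as.numeric(substring("Rik", 1, 1)))
is.numeric(converted)   # TRUE, even though the name does not start with a digit
is.na(converted)        # TRUE: testing for NA is what separates the two cases
```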

Jaap
joffie
    explore the result of `is.na(as.numeric(substring(geneName, 1, 1)))` or https://stackoverflow.com/questions/4736/learning-regular-expressions and use `grepl()`, i.e. `grepl("^\\d", geneName)` – jogo Aug 17 '18 at 08:51
    `<<-` is dangerous! `library("fortunes"); fortune(174)` read https://www.burns-stat.com/pages/Tutor/R_inferno.pdf Circle 6 – jogo Aug 17 '18 at 08:56

5 Answers

4

Here is a solution with regex (see Learning Regular Expressions):

geneName <- c("2310067B10Rik", "Z310067B10Rik")
sub("^(\\d)", "aaaa_\\1", geneName)

or as a Perl-flavoured variant (thanks to @snoram):

sub("^(?=\\d)", "aaaa_", geneName, perl = TRUE)
jogo
1

Using the replace() function:

start_nr <- grep("^\\d", geneName)
replace(geneName, start_nr, paste0("aaaa_", geneName[start_nr]))
[1] "aaaa_2310067B10Rik" "foo"                "aaaa_9bar"  

Where:

geneName <- c("2310067B10Rik", "foo", "9bar")
s_baldur
0
geneName <- c("2310067B10Rik", "foo") 

ifelse(substring(geneName, 1,1) %in% c(0:9), paste0("aaaa_", geneName), geneName)

[1] "aaaa_2310067B10Rik" "foo"  

Or, based on the comment above, you could replace substring(geneName, 1,1) %in% c(0:9) with grepl("^\\d", geneName).
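
For completeness, the grepl() variant could look roughly like the sketch below. The intermediate names `starts_with_digit` and `result` are hypothetical and only there for readability.

```r
geneName <- c("2310067B10Rik", "foo")

# TRUE for every name whose first character is a digit
starts_with_digit <- grepl("^\\d", geneName)

# Prefix only the names flagged above; leave the others unchanged
result <- ifelse(starts_with_digit, paste0("aaaa_", geneName), geneName)
print(result)
```

Because both grepl() and ifelse() are vectorised, this works on a whole vector of gene names without a loop.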

Lennyy
  • You're welcome. Please consider upvoting or accepting the answer if it was helpful to you. Also see: https://stackoverflow.com/help/someone-answers – Lennyy Aug 17 '18 at 11:24
0

Using regex:

You can check the first character of geneName and, if it is a digit, prepend the prefix as follows:

geneName <- "2310067B10Rik"
# Return the name unchanged when it does not start with a digit
ifelse(grepl("^[0-9]", geneName), paste("aaaa", geneName, sep = "_"), geneName)

Output:

[1] "aaaa_2310067B10Rik"
Saurabh Chauhan
0
 geneName=function(x){
   if( grepl("^[0-9]",x) ){
     as.character(glue::glue('aaaa_{x}'))
   }else{x}
 }
> geneName("2310067B10Rik")
[1] "aaaa_2310067B10Rik"
> geneName("sdsad")
[1] "sdsad"
jyjek