
I am new to R and am getting the warning messages shown below. Although the code seems to work, I am still worried about them. Row by row, I want to check whether the string in one column (one or more characters) occurs within the string in another column, and then do the reverse check. I do the reverse because either column may hold the longer string. If the condition is true (the grepl search finds a match), I create two new columns; if it is not true, a second, similar ifelse statement is run. An example of the four relevant columns, all in the same row of data, could be

EA = TGTGTGT; OA = AC; REF = A; ALT = AGTGTG,ATGTGT

Please see below:

    ifelse((grepl(data$EA, data$REF) || grepl(data$REF, data$EA)) || (grepl(data$OA, data$ALT) || grepl(data$ALT, data$OA)) ,
            {data$new_gwas_a1[data$EAF>0.4 & data$EAF<0.6] <- data$REF[data$EAF>0.4 & data$EAF<0.6];
             data$new_gwas_a2[data$EAF>0.4 & data$EAF<0.6] <- data$ALT[data$EAF>0.4 & data$EAF<0.6]},

    ifelse(grepl(data$OA, data$REF) || grepl(data$REF, data$OA) || (grepl(data$EA, data$ALT) || grepl(data$ALT, data$EA)),
             {data$new_gwas_a2[data$EAF>0.4 & data$EAF<0.6] <- data$REF[data$EAF>0.4 & data$EAF<0.6];
              data$new_gwas_a1[data$EAF>0.4 & data$EAF<0.6] <- data$ALT[data$EAF>0.4 & data$EAF<0.6]}, ))

    [1] "AGTGTG,ATGTGT"
    Warning messages:
    1: In grepl(data$EA, data$REF) :
      argument 'pattern' has length > 1 and only the first element will be used
    2: In grepl(data$REF, data$EA) :
      argument 'pattern' has length > 1 and only the first element will be used
    3: In grepl(data$OA, data$ALT) :
      argument 'pattern' has length > 1 and only the first element will be used
    4: In grepl(data$ALT, data$OA) :
      argument 'pattern' has length > 1 and only the first element will be used
    5: In grepl(data$OA, data$REF) :
      argument 'pattern' has length > 1 and only the first element will be used
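
For context, the warning can be reproduced in isolation. The sketch below uses made-up vectors (`patterns` and `subjects` are illustrative names, not columns from the data above) to show that `grepl` applies only the first pattern to every element, whereas an element-wise comparison needs something like `mapply`.

```r
# Two patterns, two subjects: one might expect an element-wise comparison.
patterns <- c("A", "T")
subjects <- c("AC", "TG")

# grepl() uses only patterns[1] ("A") and tests it against every subject,
# which is what the warning reports. The warning is suppressed here only
# to keep the demo quiet.
first_only <- suppressWarnings(grepl(patterns, subjects))
print(first_only)  # TRUE FALSE

# Element-wise version: pattern i is tested against subject i.
pairwise <- unname(mapply(grepl, patterns, subjects))
print(pairwise)    # TRUE TRUE
```

The two results differ for the second element, which suggests that code relying on the warned-about call may be silently comparing every row against the first row's pattern.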

Any help would be much appreciated!!


Thank you all for the replies! Here is a simple example that is similar to the above. The goal is to set e and f, and it seems to work, but when I run the code as an executable R script from a Bash terminal, it gives the same result along with warning messages.

    > a <- c("TGTGTGT")
    > b <- c("AC")
    > c <- c("A")
    > d <- c("AGTGTG,ATGTGT")

    > ifelse((grepl(a, c) || grepl(c, a)) || (grepl(b, d) || grepl(d, b)),
             {e <- c; f <- d},
             ifelse(grepl(b, c) || grepl(c, b) || (grepl(a, d) || grepl(d, a)),
                    {f <- c; e <- d}, ))

    > e
    [1] "AGTGTG,ATGTGT"
    > f
    [1] "A"
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. You seem to be passing an entire column of data (`data$REF`) to the pattern parameter of `grepl` but it can only take one at a time. – MrFlick Jul 09 '21 at 21:52
  • The pattern argument is not vectorized. You may need to use `mapply`. – IRTFM Jul 09 '21 at 22:35
  • Can you provide a sample input data and the corresponding output to it? – Ronak Shah Jul 10 '21 at 03:02
  • Thank you all. My edit to include the simple example is above. I will try and look into mapply as well. Thanks IRTFM – Spencer K Jul 10 '21 at 15:38
  • Hello IRTFM, if you have time, could you show an example using mapply? I created a simplified example above. I am still a bit lost. – Spencer K Jul 11 '21 at 18:49
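
Following the `mapply` suggestion in the comments, a row-wise version of the logic in the question might look like the sketch below. Several things here are assumptions rather than anything confirmed in the thread: the toy data frame, the helper name `either_contains`, the use of `fixed = TRUE` (literal matching instead of regular expressions), and leaving rows that match neither condition as `NA`. Vectorized `|` and `&` replace `||`, which only handles single values.

```r
# Hypothetical toy data using the column names from the question.
data <- data.frame(
  EA  = c("TGTGTGT", "G", "C"),
  OA  = c("AC", "A", "T"),
  REF = c("A", "G", "C"),
  ALT = c("AGTGTG,ATGTGT", "A", "T"),
  EAF = c(0.5, 0.45, 0.9),
  stringsAsFactors = FALSE
)

# Row-wise test: does x[i] contain y[i], or y[i] contain x[i]?
# fixed = TRUE (an assumption) treats each pattern as a literal string.
either_contains <- function(x, y) {
  unname(mapply(
    function(p, s) grepl(p, s, fixed = TRUE) || grepl(s, p, fixed = TRUE),
    x, y
  ))
}

# One logical value per row for each of the two conditions in the question.
same_order    <- either_contains(data$EA, data$REF) | either_contains(data$OA, data$ALT)
swapped_order <- either_contains(data$OA, data$REF) | either_contains(data$EA, data$ALT)
in_range      <- data$EAF > 0.4 & data$EAF < 0.6

# The second condition applies only when the first one fails,
# mirroring the nested ifelse in the question.
use_same <- in_range & same_order
use_swap <- in_range & !same_order & swapped_order

data$new_gwas_a1 <- NA_character_
data$new_gwas_a2 <- NA_character_
data$new_gwas_a1[use_same] <- data$REF[use_same]
data$new_gwas_a2[use_same] <- data$ALT[use_same]
data$new_gwas_a1[use_swap] <- data$ALT[use_swap]
data$new_gwas_a2[use_swap] <- data$REF[use_swap]

print(data[, c("new_gwas_a1", "new_gwas_a2")])
```

For the first row, which uses the values from the question, this gives `new_gwas_a1 = "AGTGTG,ATGTGT"` and `new_gwas_a2 = "A"`, the same as `e` and `f` in the simplified example. The third row falls outside the 0.4 to 0.6 range and stays `NA`.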
