2

I am working with genomic data in a data frame that has two columns: nucleotide position and its conservation score. I also know which ranges of nucleotide positions are introns and which are exons. I want to create a third column that labels each position as "INTRON" or "EXON".

As an example, suppose that within nucleotide positions 1-70000 I want to mark 10000-10200, 17800-21000, and 43000-54000 as introns and everything else as exons in another column (hypothetical data). Is there a way to specify multiple ranges of values from a column in the ifelse function? That would more or less solve my problem. Is there a better way of doing it?

smci
Anurag Mishra
  • possible duplicate of [adding a column in R to a dataframe](http://stackoverflow.com/questions/18684964/adding-a-column-in-r-to-a-dataframe) – Ferdinand.kraft Sep 08 '13 at 15:44
  • I think it is more about dealing with range data than adding a column. That said, the OP should be looking at the BioC package 'IRanges'. Doing a search of SO for : { [r] iRanges } should be very helpful since there are many worked examples. – IRTFM Sep 08 '13 at 15:55

2 Answers

5

Assuming you have a data frame like this:

 d <- data.frame(position=round(runif(100, 1, 70000)))

You can combine logical operators:

 d$status <- ifelse((d$position >= 10000 & d$position <= 10200) | (d$position >= 17800 & d$position <= 21000) | (d$position >= 43000 & d$position <= 54000), 'INTRON', 'EXON')

or you can use nested ifelse:

 d$status <- ifelse(d$position >= 10000 & d$position <= 10200, 'INTRON', ifelse(d$position >= 17800 & d$position <= 21000, 'INTRON', ifelse(d$position >= 43000 & d$position <= 54000, 'INTRON', 'EXON')))
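
When the list of ranges grows, writing every bound into a single ifelse call gets hard to maintain. A minimal sketch of a table-driven variant follows, assuming the intron ranges from the question are inclusive and are stored in a small data frame; the names `introns`, `start`, `end`, and `in_intron` are hypothetical and chosen only for illustration.

```r
# Intron ranges from the question (hypothetical data), inclusive on both ends.
introns <- data.frame(start = c(10000, 17800, 43000),
                      end   = c(10200, 21000, 54000))

# A few hand-picked positions; in practice this is the real position column.
d <- data.frame(position = c(500, 10000, 10200, 10201, 20000, 43000, 54001))

# For each position, check whether it lies inside any of the intron ranges.
in_intron <- vapply(
  d$position,
  function(p) any(p >= introns$start & p <= introns$end),
  logical(1)
)

# Label the rows with a single, flat ifelse.
d$status <- ifelse(in_intron, "INTRON", "EXON")
print(d)
```

Adding or changing a range then only means editing the `introns` table, not the labelling expression. For very large inputs, a dedicated range package such as the one mentioned in the comments may be a better fit.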
zero323
-2

Assuming you have a table that contains some information, including bmi, you can use this code:

x$normal_bmi <- ifelse(x$bmi >= 18 & x$bmi <= 25, 1, 0)

then

table(x$normal_bmi)

 0  1
40 26
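
The snippet above depends on a table `x` that is not shown. A small self-contained sketch of the same single-range pattern follows, assuming made-up BMI values; the data and the `counts` variable are hypothetical and are not the answerer's data.

```r
# Hypothetical BMI values, invented purely for illustration.
x <- data.frame(bmi = c(17.5, 18, 22.3, 25, 25.1, 31))

# 1 if bmi lies in the inclusive range [18, 25], otherwise 0.
x$normal_bmi <- ifelse(x$bmi >= 18 & x$bmi <= 25, 1, 0)

# Count how many rows fall into each group.
counts <- table(x$normal_bmi)
print(counts)
```

Both boundary values (18 and 25) count as inside the range because the comparison uses `>=` and `<=`.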