I have a large dataset in which each column holds the expression of a gene in a specific tissue. It contains 74,440 genes and looks like this:

Data sample

I am looking for genes with a very specific expression pattern, which is why I chain multiple `&&` operators in the `if` statement in the code below. In other words, I want the genes whose values satisfy every threshold in that `if` statement. Any gene that matches these criteria is appended to a new file, `ovaries_only.csv`.

The data sample shows exactly how the Excel file looks; the real file has many more rows than shown. I want to check whether the value in the 18th column exceeds 100 while each of the other columns is below a set value. Every row that fits these criteria should be appended, in full, to the new file.

When I run the code, I keep getting the error "Error in if: missing value where TRUE/FALSE needed" along with multiple warnings: "In if (col.names) d[[2L]] else NULL : the condition has length > 1 and only the first element will be used".

Could you tell me what I am doing wrong?


counter = TRUE
i = 1

while (counter == TRUE) {

 if( (data[i,3] < 10) && (data[i,4] < 10) && (data[i,5] < 10) && (data[i,6] < 5) && (data[i,7] < 5) && 
    (data[i,8] < 5) && (data[i,9] < 5) && (data[i,10] < 5) && (data[i,11] < 5) && (data[i,12] < 5) && 
    (data[i,13] < 5) && (data[i,14] < 5) && (data[i,15] < 5) && (data[i,16] < 5) && (data[i,17] < 5) && (data[i,18] > 100) && (data[i,19] < 50) ){

   df = data[i, ]

   write.table(df, "ovaries_only.csv", sep = ",", row.names = FALSE, col.names = !file.exists("ovaries_only.csv", append = TRUE))

   i = i + 1

   if (i == 74440){
       counter == FALSE
     }

 }  

 else{
   i = i+1
   if (i == 74440){
     counter == FALSE
   }
 }

}
S.Chereddy
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jan 24 '20 at 17:01
  • You only need a single `&` for logical comparisons. What happens if you change all the `&&` to `&`? – Allan Cameron Jan 24 '20 at 17:02
  • I think the problem is with `!file.exists("ovaries_only.csv", append = TRUE)`. That `append = TRUE` should not be inside `file.exists`; it should sit outside that parenthesis, as an argument to `write.table`, I presume. – MrFlick Jan 24 '20 at 17:03
  • @MrFlick That actually helped me get rid of the warnings. – S.Chereddy Jan 24 '20 at 17:15
  • @Allan Cameron I tried changing && to & but the error still exists "Error in if ((data[i, 3] < 5) & (data[i, 4] < 5) & (data[i, 5] < 5) & : missing value where TRUE/FALSE needed". However, I am getting an output file that does match the criteria I have set, I am not sure what the error is doing at this point. – S.Chereddy Jan 24 '20 at 17:16
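
One likely cause of the remaining error, assuming the rows themselves contain no missing values: `counter == FALSE` compares rather than assigns, so the loop never stops and eventually indexes past the last row. There `data[i, 3]` is `NA`, and `if (NA)` fails with "missing value where TRUE/FALSE needed". That would also explain why the output file is already complete when the error appears. Below is a minimal vectorized sketch that avoids the loop entirely. The toy data frame, its column names, and `out_file` are made up for illustration; the thresholds are copied from the question.

```r
# Hypothetical toy data standing in for the real table: columns 1-2 are
# identifiers, columns 3-19 hold expression values (names are invented).
data <- data.frame(gene = paste0("g", 1:5), desc = "x",
                   matrix(1, nrow = 5, ncol = 17))
data[2, 18] <- 150   # row 2 matches the pattern
data[4, 18] <- 150
data[4, 19] <- 60    # row 4 fails the column-19 limit

# Indexing one row past the end of a data frame yields NA, which is what
# the original loop hands to if() once it overruns the data.
past_end <- data[nrow(data) + 1, 3]
is.na(past_end)

# Upper limits from the question: columns 3-5 < 10, columns 6-17 < 5.
limits <- c(rep(10, 3), rep(5, 12))
vals   <- as.matrix(data[, 3:17])

# Compare every column with its own limit; na.rm = TRUE makes a row with
# a missing value count as a non-match instead of propagating NA.
low_ok <- rowSums(sweep(vals, 2, limits, "<"), na.rm = TRUE) == ncol(vals)

# Single & is the vectorized operator; && is meant for length-one values.
keep <- low_ok & data[[18]] > 100 & data[[19]] < 50

# which() drops any NA left in keep, so only definite matches remain.
ovaries_only <- data[which(keep), ]

# Write once, with a header, instead of appending row by row.
# out_file is a placeholder path used only for this demo.
out_file <- file.path(tempdir(), "ovaries_only.csv")
write.csv(ovaries_only, out_file, row.names = FALSE)
```

Rows with an `NA` in any tested column are dropped here rather than raising an error; whether that is the right treatment for the real data is a judgment call.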

0 Answers