I am using the dplyr package in R to filter my gene expression data. I have calculated fold changes and would like to keep the genes (rows) in which at least one sample (column) has a value greater than +0.584963 OR less than -0.584963. Example data:
X SAMPLE_1_FC SAMPLE_2_FC SAMPLE_3_FC SAMPLE_4_FC SAMPLE_5_FC
GENE_1 0.6780 0.4050 0.8870 0.3300 0.2230
GENE_2 0.2340 -0.6670 0.0020 0.1240 0.3560
GENE_3 0.0170 0.1560 0.1120 0.0080 -0.1230
GENE_4 -0.0944 -0.1372 -0.1800 -0.2228 -0.2656
GENE_5 -0.8080 -0.7800 -0.5560 0.0340 0.4450
GENE_6 0.2091 0.1106 0.0121 -0.0864 -0.1849
GENE_7 0.5980 0.7680 0.9970 0.4670 -0.7760
I am currently using the following script:
det.cols<- colnames(my.data)[which(grepl("fc",tolower(colnames(my.data))))]
filt <- gsub(","," | ",toString(paste("`",det.cols,"`",">abs(0.584963)", sep = "")))
my.datasub<- my.data %>% filter_(filt)
but this returns only the genes with a value greater than +0.584963 and misses the ones with a value less than -0.584963. For the example above, I want a subset containing genes 1, 2, 5 and 7, but I get only genes 1 and 7. How can I change this?
The expected output is:
X SAMPLE_1_FC SAMPLE_2_FC SAMPLE_3_FC SAMPLE_4_FC SAMPLE_5_FC
GENE_1 0.6780 0.4050 0.8870 0.3300 0.2230
GENE_2 0.2340 -0.6670 0.0020 0.1240 0.3560
GENE_5 -0.8080 -0.7800 -0.5560 0.0340 0.4450
GENE_7 0.5980 0.7680 0.9970 0.4670 -0.7760
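To make the condition I am after concrete, here is a minimal sketch in base R (not dplyr, which is what I would prefer to use). It rebuilds the example data under the same `my.data` name as my script; the names `threshold`, `keep` and `expected` are introduced only for this illustration. The idea is to take the absolute value of the fold-change columns themselves, rather than of the cutoff, and keep a row when any column exceeds the cutoff.

```r
# Example data from the question, rebuilt by hand for a reproducible sketch.
my.data <- data.frame(
  X = paste0("GENE_", 1:7),
  SAMPLE_1_FC = c(0.6780, 0.2340, 0.0170, -0.0944, -0.8080, 0.2091, 0.5980),
  SAMPLE_2_FC = c(0.4050, -0.6670, 0.1560, -0.1372, -0.7800, 0.1106, 0.7680),
  SAMPLE_3_FC = c(0.8870, 0.0020, 0.1120, -0.1800, -0.5560, 0.0121, 0.9970),
  SAMPLE_4_FC = c(0.3300, 0.1240, 0.0080, -0.2228, 0.0340, -0.0864, 0.4670),
  SAMPLE_5_FC = c(0.2230, 0.3560, -0.1230, -0.2656, 0.4450, -0.1849, -0.7760)
)

# Fold-change columns: any column whose name contains "fc" (case-insensitive).
det.cols <- grep("fc", colnames(my.data), ignore.case = TRUE, value = TRUE)

# Cutoff from the question (hypothetical variable name).
threshold <- 0.584963

# abs() is applied to the values, so both +0.6 and -0.6 pass the test.
# rowSums() counts how many samples pass per gene; > 0 means "at least one".
keep <- rowSums(abs(as.matrix(my.data[det.cols])) > threshold) > 0

expected <- my.data[keep, ]
print(expected)
```

This is only meant to pin down the logic. I am still looking for the equivalent dplyr expression.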
Thanks.