I have many data frames to open; assume they are all in a folder named 'CNV'.

So I read this link [1]: Opening all files in a folder, and applying a function, but I'm still confused.

Assume I have files named a, b, c, d, e, ..., z in the folder CNV on my desktop.

All of these files have the same columns:

Sample, start, stop, event, probe, length
qd1234,  2666,  2888,  CN gain,  23,  235    
cc234,   1000,   1500,  CN loss,  5,  500

My question is how to open all of the files at once. After opening them, I want to select the CN gain rows with a probe number greater than 5 and the CN loss rows with a probe number greater than 3, and in both cases probe/length must be less than 30.

The expected result:

a
Sample, start, stop, event, probe, length, length/probe
qd1234,  2666,  2888,  CN gain,  23,  235,  9
qd1534,  1200,  1800,  CN loss,  60,  600,  10
b
Sample, start, stop, event, probe, length, length/probe
qd234,  2666,  2888,  CN gain,  23,  235,  9
qd534,  1200,  1800,  CN loss,  60,  600,  10
Vayami
  • Please do not use html tags. And do learn to use the SO formatting conventions. – IRTFM Aug 09 '13 at 06:06
  • Where exactly are you having trouble? Opening the files or subsetting the dataframes? – Thomas Aug 09 '13 at 06:07
  • I want to open all the files at once and, after opening them, select the data that meet the conditions above. Sorry about using html. – Vayami Aug 09 '13 at 07:15

1 Answer

I think you might be looking for something like this. When I create crit, I look at the columns of the freshly imported data.frame and find which rows satisfy the specified criteria. Based on that, I subset the full data.frame and return the subset (a function returns the value of its last expression). The result, my.df, is a list with one filtered data.frame per file.

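# list.files() lists the working directory, assumed here to be the CNV folder
my.files <- list.files()

my.df <- sapply(my.files, function(x) {
  # strip.white = TRUE removes the spaces that follow the commas
  read.in <- read.table(x, header = TRUE, sep = ",", strip.white = TRUE)
  # CN gain needs probe > 5, CN loss needs probe > 3; both need the ratio < 30
  crit <- with(read.in, which(
    ((event == "CN gain" & probe > 5) | (event == "CN loss" & probe > 3)) &
      (probe/length) < 30))
  read.in[crit, ]
}, simplify = FALSE)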
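
The call to list.files() above only sees the current working directory, and the expected output in the question has an extra length/probe column. A minimal sketch of both points follows. The temporary folder, the two small files and the names cnv.dir, cnv.files, cnv.list and length.per.probe are made up for illustration; on a real machine cnv.dir would be the path to the CNV folder.

```r
# Hypothetical setup: a temporary folder stands in for the CNV folder.
cnv.dir <- file.path(tempdir(), "CNV")
dir.create(cnv.dir, showWarnings = FALSE)
writeLines(c("Sample, start, stop, event, probe, length",
             "qd1234,  2666,  2888,  CN gain,  23,  235",
             "cc234,   1000,   1500,  CN loss,  5,  500"),
           file.path(cnv.dir, "a"))
writeLines(c("Sample, start, stop, event, probe, length",
             "qd234,  2666,  2888,  CN gain,  4,  235",
             "qd534,  1200,  1800,  CN loss,  60,  600"),
           file.path(cnv.dir, "b"))

# full.names = TRUE returns full paths, so the files can be read
# without changing the working directory.
cnv.files <- list.files(cnv.dir, full.names = TRUE)

cnv.list <- lapply(cnv.files, function(f) {
  d <- read.table(f, header = TRUE, sep = ",", strip.white = TRUE)
  # Derived column matching the length/probe column of the expected output.
  d$length.per.probe <- d$length / d$probe
  d
})

# Name each list element after its file so results can be looked up by name.
names(cnv.list) <- basename(cnv.files)
```

After this, cnv.list$a and cnv.list$b hold one data.frame per file, each with the extra column, and the same filtering step can be applied inside the function.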

Since we don't have a reproducible example, I'm demonstrating below how this subsetting actually works.

set.seed(357)
xy <- data.frame(x = 1:10, y = runif(10), z = rnorm(10))
xy # we expect row 6 to satisfy all the conditions
xy[with(xy, which(x > 5 & y < 0.5 & z < 0)), ]
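
The same idea can be tried on rows shaped like the question's data, with a different probe threshold for each event type. This is only a sketch: the third row is invented to show a CN gain call that fails its threshold, and the ratio is assumed to be length/probe (as in the expected output) rather than probe/length (as in the question's wording). The names cnv, enough.probes, length.per.probe and kept are hypothetical.

```r
# Rows modelled on the question's sample data; the third row is made up.
cnv <- data.frame(
  Sample = c("qd1234", "cc234", "zz999"),
  start  = c(2666, 1000, 300),
  stop   = c(2888, 1500, 400),
  event  = c("CN gain", "CN loss", "CN gain"),
  probe  = c(23, 5, 4),
  length = c(235, 500, 100)
)

# Each event type gets its own probe threshold.
enough.probes <- with(cnv, (event == "CN gain" & probe > 5) |
                           (event == "CN loss" & probe > 3))

# Assumption: the ratio of interest is length/probe.
cnv$length.per.probe <- cnv$length / cnv$probe

# which() drops rows where the condition is NA instead of returning NA rows.
kept <- cnv[which(enough.probes & cnv$length.per.probe < 30), ]
kept
```

Only the first row passes both tests here: the CN loss row has enough probes but a ratio of 100, and the invented CN gain row has too few probes.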
Roman Luštrik