I have this script:

dat <- read.csv(file="Task_vs_Files_Whirr2.csv", header=TRUE, sep=",",
                row.names=1)
(sapply(1:nrow(dat), function(x) {
    if (dat[x,2]==1) {
         write.csv(dat[ (dat[[2]]==1 ) & (1:nrow(dat) >= x) , ] ,
                   file = paste("fil_", x, ".csv") )
    } else {
         NULL
    }
}))

but the script returns NULL values, as shown below:

[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

Here are other details:

> str(dat) 
'data.frame':   7 obs. of  7 variables:
 $ pom.xml.         : int  1 1 0 0 0 1 1
 $ ZooKeeper.java   : int  0 0 0 0 0 1 0
 $ HBase.java       : int  1 1 1 0 1 1 0
 $ Hadoop.java.     : int  0 0 0 0 0 0 1
 $ BasicServer.java.: int  1 0 0 0 0 0 0
 $ Abstract.java.   : int  1 1 0 1 0 1 1
 $ HBaseRegion.java : int  1 0 0 0 0 1 1

> dput(dat)
structure(list(pom.xml. = c(1L, 1L, 0L, 0L, 0L, 1L, 1L), ZooKeeper.java = c(0L, 
0L, 0L, 0L, 0L, 1L, 0L), HBase.java = c(1L, 1L, 1L, 0L, 1L, 1L, 
0L), Hadoop.java. = c(0L, 0L, 0L, 0L, 0L, 0L, 1L), BasicServer.java. = c(1L, 
0L, 0L, 0L, 0L, 0L, 0L), Abstract.java. = c(1L, 1L, 0L, 1L, 0L, 
1L, 1L), HBaseRegion.java = c(1L, 0L, 0L, 0L, 0L, 1L, 1L)), .Names = c("pom.xml.", 
"ZooKeeper.java", "HBase.java", "Hadoop.java.", "BasicServer.java.", 
"Abstract.java.", "HBaseRegion.java"), class = "data.frame", row.names = c("WHIRR-25", 
"WHIRR-28", "WHIRR-55", "WHIRR-61", "WHIRR-76", "WHIRR-87", "WHIRR-92"
))

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
Gavin Simpson
user1676484
  • @DWin here is the output – user1676484 Sep 18 '12 at 03:46
  • `write.csv` will return `NULL` invisibly, so your sapply (hidden loop) function will return NULL whether or not it writes a csv. Do the appropriate files get created? I think a `for` loop here would be far simpler. Either way `which` is a function you should look at! – mnel Sep 18 '12 at 04:18
  • @mnel that is the million dollar question we're all reading and asking. – Tyler Rinker Sep 18 '12 at 04:19
  • @mnel Oh, OK. No wonder it returns NULL. However, the files do get created. – user1676484 Sep 18 '12 at 04:28

1 Answer

Trying to salvage something.... (see my comment above as to why NULL is being returned)
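
To see the behaviour mentioned in the comment for yourself, here is a minimal sketch. The temporary file and the toy data frames are made up for illustration and are not part of the question's data.

```r
# write.csv() is called for its side effect (the file); its value is NULL.
tmp <- tempfile(fileext = ".csv")          # hypothetical scratch file
res <- write.csv(data.frame(a = 1:3), file = tmp)

is.null(res)      # TRUE: nothing useful is returned
file.exists(tmp)  # TRUE: the file was written anyway

# Collecting those return values with sapply() therefore gives a list of NULLs,
# regardless of whether each call wrote a file.
results <- sapply(1:2, function(i) {
  write.csv(data.frame(a = i), file = tempfile(fileext = ".csv"))
})
is.list(results)  # TRUE
```

So the printed NULLs say nothing about whether the files were created.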

If you are only interested in the rows where dat[,2]==1, you might as well use which to find them:

interested <- which(dat[,2]==1)
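
As a small illustration of what which does here, the sketch below uses the second column (ZooKeeper.java) copied from the dput output in the question; the name col2 is just a placeholder.

```r
# Second column of dat (ZooKeeper.java), copied from the dput() output.
col2 <- c(0L, 0L, 0L, 0L, 0L, 1L, 0L)

# The comparison gives a logical vector, one element per row.
is_one <- col2 == 1

# which() turns that logical vector into the positions of the TRUE values.
interested_rows <- which(is_one)
interested_rows   # 6
```

Only row 6 qualifies, so a loop over these positions runs exactly once.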

Now, do you really want separate files containing the same data, repeated without the header row each time? That is what your code is doing now.

Here it is written as a for loop, because that is far simpler to understand:

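# positions 1..n, used to keep only rows at or after the current one
row_index <- seq.int(nrow(dat))
for(.row in interested){
  # rows where column 2 is 1, starting from the current row
  write.csv(dat[(dat[[2]] == 1) & (row_index >= .row),],
            file = sprintf('fil_%s.csv', .row))
}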
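
One difference from the code in the question is how the file name is built. As a short sketch (the value 6 is just the row index from this example): paste separates its arguments with a space by default, so the original call produces names with embedded spaces, while sprintf, or paste with sep = "", does not.

```r
x <- 6

# paste() inserts sep = " " between its arguments by default.
name_paste   <- paste("fil_", x, ".csv")
name_nosep   <- paste("fil_", x, ".csv", sep = "")
name_sprintf <- sprintf("fil_%s.csv", x)

name_paste     # "fil_ 6 .csv"
name_nosep     # "fil_6.csv"
name_sprintf   # "fil_6.csv"
```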

With your current data this will create a file fil_6.csv containing the 6th row of your data frame (which is the only row where dat[,2]==1).
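
If you want to check this without the original CSV, here is a self-contained sketch. It rebuilds only the first two columns from the dput output and writes into tempdir(); both choices are assumptions made to keep the example short, not part of the question.

```r
# First two columns of dat, copied from the dput() output in the question.
dat <- data.frame(
  pom.xml.       = c(1L, 1L, 0L, 0L, 0L, 1L, 1L),
  ZooKeeper.java = c(0L, 0L, 0L, 0L, 0L, 1L, 0L),
  row.names = c("WHIRR-25", "WHIRR-28", "WHIRR-55", "WHIRR-61",
                "WHIRR-76", "WHIRR-87", "WHIRR-92")
)

out_dir    <- tempdir()                 # assumed scratch location
interested <- which(dat[, 2] == 1)
row_index  <- seq.int(nrow(dat))

for (.row in interested) {
  write.csv(dat[(dat[[2]] == 1) & (row_index >= .row), ],
            file = file.path(out_dir, sprintf("fil_%s.csv", .row)))
}

# Read the single file back to inspect what was written.
written <- read.csv(file.path(out_dir, "fil_6.csv"), row.names = 1)
rownames(written)   # "WHIRR-87"
names(written)      # "pom.xml." "ZooKeeper.java"
```

Reading the file back shows one data row, labelled WHIRR-87, with the column names on the first line of the file.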

Whether this script will scale to a bigger data set is unclear, because it is unclear what you want to do.

mnel
  • What if I want a separate file with a header row? – user1676484 Sep 18 '12 at 05:20
  • But when I execute the code, it only saves row 6. What if I want to name the file after the row value/name? – user1676484 Sep 18 '12 at 05:27
  • Maybe you can refer to this link to read my question: http://stackoverflow.com/questions/12453483/creating-new-table-from-a-big-csv-table – user1676484 Sep 18 '12 at 05:30
  • As your function stood (and without any other information), your code would only have saved row 6 (that is what your subset does with your data). The question you linked to makes very little sense - I can't make out what you want in the subset. – mnel Sep 18 '12 at 05:36
  • What if I want to check all columns? As it stands, it only checks column 2. – user1676484 Sep 18 '12 at 05:50
  • Then carefully rephrase the question with what you actually want to do. – mnel Sep 18 '12 at 05:56