I want to randomly split a data.table into n outputs and then write each output to its own file with write.table. In other words, I want to write one file for each element of the list test.

library(data.table)

set.seed(100)

dt <- data.table(x=rnorm(1000))

n <- 10 # number of data sets

# randomly splits dt into n number of outputs
test <- split(dt, sample(1:n, nrow(dt), replace = TRUE))

# writing tables for each sublist within test
# write.table(test)
# names <- paste0("output", n, ".txt", sep="")
Connor Murray

2 Answers

You could loop over the list indices and write each element to its own file:

lapply(seq_along(test), function(x) 
       write.table(test[[x]], file = paste0('output', x, '.txt')))
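
By default, write.table also writes row names and quotes the header, which may not be wanted in the output files. The following is a minimal sketch of turning both off. It assumes a plain data.frame as a stand-in for the data.table and writes to a temporary directory; `df`, `out_dir` and `paths` are illustrative names, not part of the original answer.

```r
set.seed(100)
df <- data.frame(x = rnorm(1000))  # stand-in for dt (assumption)
n <- 10
test <- split(df, sample(1:n, nrow(df), replace = TRUE))

out_dir <- tempdir()  # hypothetical output location
paths <- file.path(out_dir, paste0("output", seq_along(test), ".txt"))

# Write each list element to its own file, without row names or quotes
invisible(lapply(seq_along(test), function(i)
  write.table(test[[i]], file = paths[i],
              row.names = FALSE, quote = FALSE)))

# One file per list element should now exist
all(file.exists(paths))
```

Adding col.names = FALSE to the write.table call would drop the header line as well.
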
Ronak Shah

Since each element of test is a data.table, we can use fwrite, which is much faster than write.table:

library(data.table)
lapply(names(test), function(nm) fwrite(test[[nm]], paste0("output", nm, ".txt")))

The header 'x' in each file is the column name. If we need custom formatting, such as writing only the values, it can be done with cat:

lapply(names(test), function(nm) 
      cat(test[[nm]][[1]], file = paste0("output", nm, ".txt"), sep = "\n"))

Or, as @chinsoon12 mentioned in the comments, specify col.names = FALSE (the default in fwrite is TRUE):

lapply(names(test), function(nm) fwrite(test[[nm]],
          paste0("output", nm, ".txt"), col.names = FALSE))
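
One way to check that the split and the header-free files behave as expected is to read the files back and compare row counts. This is only a sketch under a few assumptions: the files go to a temporary directory, and `out_dir`, `paths` and `back` are illustrative names that do not appear in the answer above.

```r
library(data.table)

set.seed(100)
dt <- data.table(x = rnorm(1000))
n <- 10
test <- split(dt, sample(1:n, nrow(dt), replace = TRUE))

out_dir <- tempdir()  # hypothetical output location
paths <- file.path(out_dir, paste0("output", names(test), ".txt"))

# Write each subset without a header line
invisible(Map(function(d, p) fwrite(d, p, col.names = FALSE), test, paths))

# Read the files back; header = FALSE because no header was written
back <- rbindlist(lapply(paths, fread, header = FALSE))

# Every row of dt should land in exactly one file
nrow(back) == nrow(dt)
```
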
akrun
  • This may be nitpicky but is there a way to not include the header within each list? – Connor Murray Jan 31 '20 at 01:32
  • @ConnorMurray Did you mean that the column names should not be included? – akrun Jan 31 '20 at 01:34
  • In output1.txt, for instance, the first row is "x". Is there a way to not have this in each file? – Connor Murray Jan 31 '20 at 01:34
  • @ConnorMurray It is the column name of the data.table. If you want to store it without the column name, you can do so with `cat`, i.e. `lapply(names(test), function(nm) cat(test[[nm]][[1]], file = paste0("output", nm, ".txt"), sep = "\n"))` – akrun Jan 31 '20 at 01:37
  • You can use `col.names=FALSE` as an arg to `fwrite` – chinsoon12 Jan 31 '20 at 01:38
  • This is all great. Is there any way to control the number of samples per list? Say I want to have 100 samples per list. This may be very useful for other people. Thanks – Connor Murray Jan 31 '20 at 01:41
  • @ConnorMurray You may use `replicate` and specify `n` as the number of samples, though I'm not sure what you wanted – akrun Jan 31 '20 at 01:45
  • I was thinking something like: `replicate(10, sample(dt, 100, replace = TRUE))` – Connor Murray Jan 31 '20 at 01:59
  • @ConnorMurray And specify `simplify = FALSE` to return a `list` – akrun Jan 31 '20 at 02:00
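
The last few comments leave the fixed-size case open, so here is a hedged sketch of two possible readings. Note that `sample()` applied directly to a data.table draws from `length(dt)`, which is the number of columns, so it would not return 100 random rows; the sketch samples row indices instead. The names `samples`, `groups` and `parts` are illustrative, and which option is appropriate depends on whether rows may repeat.

```r
library(data.table)

set.seed(100)
dt <- data.table(x = rnorm(1000))

# Option 1: 10 independent samples of 100 rows each, drawn with replacement
# (rows can repeat within and across samples)
samples <- replicate(10, dt[sample(.N, 100, replace = TRUE)], simplify = FALSE)

# Option 2: partition dt into 10 disjoint groups of exactly 100 rows
# by shuffling a balanced vector of group labels
n <- 10
groups <- sample(rep(seq_len(n), length.out = nrow(dt)))
parts <- split(dt, groups)

sapply(samples, nrow)  # 100 for each sample
sapply(parts, nrow)    # 100 for each group
```

Either list can then be written out with the same `lapply` and `fwrite` pattern used in the answers.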