How to split an already split data frame and save in multiple folders in R

Question

I have split my data frame into 100 tibbles as below. Each tibble has 10 variables, including class_name. What is the best way to create a folder named after each tibble, re-split each tibble by class_name, and save each piece as a separate CSV?

I have tried several combinations of lapply(function(x), paste0()) but failed.

Assume the split data frame looks like this:

MyDF                              Variables
1_A     →  tibble with 10 rows    class_name: Green, purple, …
2_B     →  …                      type
3_C                               …
…
100_XX

So the expected output is:

/1_A/   →   1_A.Green.csv
                1_A.purple.csv
                1_A. ….

/2_B/   …       2_B.yellow.csv  
                 …

..      
100_XX

Welcome to SO! Please take time to read [how to create a reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). As it stands, it's difficult to see what you've done and what you need to do. How did you split your original data frame? Did you use `split` so that you have a list of data frames? If so, then the most straightforward approach would be nested for loops. You create your directories and `split` again in the first loop, and then iterate over the resultant list in the second loop, writing each data frame to your directories. — , Oct 22 '19 at 06:51
Thanks, Gersht. Yes, I simply used split on my raw data. Do you have any simple code for the nested loops you mentioned? — Sean, Oct 22 '19 at 10:39

score 0 · Accepted Answer · answered Oct 22 '19 at 10:50

Since you don't supply example data I'll use the dataset datasets::mtcars. You'll need to adapt the solution to fit your data.

Based on your comment I assume you've already split your data using something like the following, which returns a list:

dfs <- split(mtcars, mtcars$vs)
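
As a quick check, assuming `dfs` was created with the `split` call above, it can help to look at the list names, since those are what the loops iterate over. A small sketch:

```r
# Split mtcars by the vs column; split() returns a named list of data frames
dfs <- split(mtcars, mtcars$vs)

# The list names come from the distinct values of vs
names(dfs)

# Number of rows in each piece; together they account for every row of mtcars
sapply(dfs, nrow)
```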

The next step is to iterate over this list using the names of the list elements. Create a directory for each list element name, then split each data frame and iterate over the resultant sublist using the sublist element names, writing each data frame to the appropriate directory with file.path(dn, paste0(fn, ".csv")):

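for (dn in names(dfs)){
    # One directory per top-level list element
    dir.create(dn)
    # Split this data frame again, here by gear
    sub_dfs <- split(dfs[[dn]], dfs[[dn]]$gear)
    for (fn in names(sub_dfs)){
        # Write each piece to <directory>/<sublist name>.csv
        write.csv(sub_dfs[[fn]], file.path(dn, paste0(fn, ".csv")))
    }
}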

The above should create two directories "0" and "1", each containing a number of CSVs.
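
To get closer to the layout in the question, the same nested loop can be adapted along these lines. This is only a sketch under assumptions: `my_dfs` is a made-up stand-in for your list of tibbles, `out_root` is a hypothetical output location under `tempdir()`, and the `<tibble name>.<class name>.csv` file name pattern is taken from the expected output in the question.

```r
# Hypothetical stand-in for the asker's data: a list of data frames named
# like the tibbles in the question, each with a class_name column
my_dfs <- list(
  "1_A" = data.frame(class_name = c("Green", "purple", "Green"), type = 1:3),
  "2_B" = data.frame(class_name = c("yellow", "yellow"), type = 4:5)
)

# Write under a temporary directory so the sketch does not touch the
# working directory
out_root <- file.path(tempdir(), "split_demo")

for (dn in names(my_dfs)) {
  out_dir <- file.path(out_root, dn)
  # recursive = TRUE also creates out_root when needed; showWarnings = FALSE
  # suppresses the warning if the directory already exists
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

  # Second split, this time by class_name
  sub_dfs <- split(my_dfs[[dn]], my_dfs[[dn]]$class_name)
  for (fn in names(sub_dfs)) {
    # File name follows the <tibble name>.<class name>.csv pattern
    out_file <- file.path(out_dir, paste0(dn, ".", fn, ".csv"))
    write.csv(sub_dfs[[fn]], out_file, row.names = FALSE)
  }
}

# List what was written, relative to out_root
list.files(out_root, recursive = TRUE)
```

Passing `row.names = FALSE` is a choice made here to keep the automatic row numbers out of the CSVs; drop it if your row names carry information.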

How can I add another sheet to the same files? Let's say I want to save the same thing in another sheet under another name, such as "data". I tried several ways, such as copying the same line or using write table or write.excel, but ran into lots of errors. Your help is highly appreciated. – Sean Oct 25 '19 at 01:16
@Sean CSVs don't have sheets, so you are probably looking to write an XLSX (Excel) file. I like to use `writexl::write_xlsx()`, but there are other options. All you need to do is put your sheets/data frames into a list. Take a look at [this SO answer](https://stackoverflow.com/a/47053853/10191355) to get started. If that doesn't work for you then I would ask a new question. – Oct 25 '19 at 04:43
Hi Gersht, thanks for responding. That's exactly my question: how to amend your example to use write.xlsx instead of CSV. – Sean Oct 25 '19 at 06:57
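
One possible way to adapt the loop to Excel output, as a sketch rather than a definitive answer. It assumes the writexl package is installed and relies on `writexl::write_xlsx()` accepting a named list of data frames, with each element written as a sheet named after its list name. The `gear_` sheet prefix, the second sheet called "data" holding the same rows, and the `out_root` location under `tempdir()` are all assumptions made for illustration.

```r
dfs <- split(mtcars, mtcars$vs)

# Hypothetical output location under the session's temporary directory
out_root <- file.path(tempdir(), "xlsx_demo")

for (dn in names(dfs)) {
  out_dir <- file.path(out_root, dn)
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

  sub_dfs <- split(dfs[[dn]], dfs[[dn]]$gear)
  for (fn in names(sub_dfs)) {
    # A named list becomes one workbook; each element becomes one sheet.
    # Here the same data frame is written twice, once under "data".
    sheets <- list(sub_dfs[[fn]], sub_dfs[[fn]])
    names(sheets) <- c(paste0("gear_", fn), "data")

    # Guarded so the sketch still runs where writexl is not installed
    if (requireNamespace("writexl", quietly = TRUE)) {
      writexl::write_xlsx(sheets, file.path(out_dir, paste0(fn, ".xlsx")))
    }
  }
}
```

If the second sheet should hold something other than a copy of the same rows, replace the second list element with that data frame.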
