Run the same code over a bunch of .csv files and save the results into separate .csv files

Question

I have a bunch of .csv files named a-part1.csv, a-part2.csv, etc., 216 files in total. I have R code that reads one csv file and then performs a bunch of operations on it. Is there any way to create a loop in R that runs the same code over all 216 files? Thank you.

Write a function `myfun` that you can apply to all files. After reading in the data with the code in the dupe above to a list `csv_list`, use `lapply(csv_list, myfun)`. — Rui Barradas, Mar 28 '18 at 18:43
Could you include the code you used to read the csv file(s) in your answer? — De Novo, Mar 28 '18 at 19:02

score 0 · Answer 1 · answered Mar 28 '18 at 19:02


The code here will read them in for you (modified to my taste). Just replace the <do something> line with your own code, and substitute whatever calls will correctly read in and write out your particular files/objects.

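files <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
# Note: on a rerun the new_ files also match this pattern; filter them or write elsewhere

for (i in seq_along(files)) {
  dat <- read.csv(files[i])  # adjust sep/header to match your files
  # <do something>: your existing code, operating on dat
  write.csv(dat, paste0("new_", files[i]), row.names = FALSE)  # write the data, not the file name
}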

answered Mar 28 '18 at 19:02

De Novo


Hi Dan, thank you very much. My original code starts with: orig_data <- read.csv('./a-part1.csv', sep=';', header= T). Should I replace orig_data with files[i] everywhere? For example, if I need to define a new variable, I use orig_data$x <- as.Date(orig_data$DATE, format = "%m/%d/%Y"). Should I then use files[i]$x? – Yelena Mar 28 '18 at 20:05
I am struggling with how it should look and whether I should create a function before the loop – Yelena Mar 28 '18 at 22:18
@Yelena you said your code "performs a bunch of operations within this file". That shouldn't change. If you're trying to solve a problem with that part of your code, then you have another question! See if you can't clearly define it, put together a [minimal, complete, and verifiable example](https://stackoverflow.com/help/mcve) and ask that question :) – De Novo Mar 29 '18 at 01:28
Hi Dan, thank you for your comment. I have the code all written, and it should be applied to a-part1.csv. Instead of downloading all those csv files manually, I am wondering if there is a way to do it automatically. All csv files have the same structure. The only difference is the IDs. – Yelena Mar 29 '18 at 06:08
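
To make the point of these comments concrete, here is a minimal sketch, assuming the per-file code is wrapped in a function. The name `process_file`, the temporary directory, and the sample data are hypothetical; only `sep = ";"`, the `DATE` column, and the `as.Date` format come from the comment above. Because `orig_data` keeps its name inside the function, none of the later lines of the existing code would need to change.

```r
# Hypothetical setup: two small semicolon-separated files in a temp directory,
# standing in for a-part1.csv, a-part2.csv, ...
in_dir <- file.path(tempdir(), "demo_in")
dir.create(in_dir, showWarnings = FALSE)
for (k in 1:2) {
  sample_data <- data.frame(ID = c(k, k + 10),
                            DATE = c("03/28/2018", "03/29/2018"))
  write.table(sample_data, file.path(in_dir, paste0("a-part", k, ".csv")),
              sep = ";", row.names = FALSE, quote = FALSE)
}

# The existing per-file code goes inside a function that takes a file path.
process_file <- function(path) {
  orig_data <- read.csv(path, sep = ";", header = TRUE)
  orig_data$x <- as.Date(orig_data$DATE, format = "%m/%d/%Y")
  orig_data
}

# Run the function over every matching file and collect the results in a list.
files <- list.files(in_dir, pattern = "^a-part[0-9]+\\.csv$", full.names = TRUE)
results <- lapply(files, process_file)
names(results) <- basename(files)
```

Each element of `results` is one processed data frame, named after the file it came from.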

score 0 · Accepted Answer · answered Mar 31 '18 at 00:05

It looks like the best way is to create a list of data frames and, as Rui Barradas suggested, write a function. Here is the code:

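# path = "." is the working directory; adjust to the folder that holds your files
files <- list.files(path = ".", pattern = "\\.txt$", full.names = TRUE)

# Read every file straight into a named list of data frames
dfList <- lapply(files, read.table, sep = ";", header = TRUE)
names(dfList) <- paste0("df", seq_along(files))

# Apply the same operations to each data frame
dfList <- lapply(dfList, function(df) {
  df$gender <- "F"
  df
})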
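
The question also asks for the results to be saved to separate files, which the code above does not yet do. A minimal sketch, assuming a named list of data frames like `dfList`; the sample data, the output directory, and the `new_` prefix (borrowed from the other answer) are illustrative assumptions:

```r
# Hypothetical stand-in for the processed list built above
dfList <- list(
  df1 = data.frame(ID = 1:2, gender = "F"),
  df2 = data.frame(ID = 3:4, gender = "F")
)

out_dir <- file.path(tempdir(), "demo_out")
dir.create(out_dir, showWarnings = FALSE)

# One output path per list element, e.g. <out_dir>/new_df1.csv
out_files <- file.path(out_dir, paste0("new_", names(dfList), ".csv"))

# Map() pairs each data frame with its path; invisible() hides the list of NULLs
invisible(Map(function(df, path) write.csv(df, path, row.names = FALSE),
              dfList, out_files))
```

If the output should be semicolon-separated like the input, `write.csv2` uses `;` as the separator (and `,` as the decimal mark).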
