I have a vector of file paths called dfs. I want to read each file into a data frame and bind them all together into one large data frame, so I did something like this:

for (df in dfs){
  clean_df <- bind_rows(as.data.table(read.delim(df, header=T, sep="|")))
  return(clean_df)
} 

but only the data from the last file ends up in the result. How do I fix this?

Edo

2 Answers

I'm not sure about your file format, so I'll use a plain .csv as an example. Replace the a * i part, which only generates mock data, with code that actually reads each of your files.

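files = list()
for (i in 1:10) {
  # Mock-up: reads the same file every time; swap in your own file paths
  a = read.csv('test.csv', header = FALSE)
  a = a * i
  # Store each data frame in its own list slot instead of overwriting it
  files[[i]] = a
}

# Bind all list elements into one data frame, once, after the loop
full_frame = data.frame(data.table::rbindlist(files))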
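
As a rough sketch of the same loop reading real files instead of mock data: the example below writes two small pipe-delimited files to a temporary directory (the file names and contents are made up for illustration), reads each one into a list slot, and binds the list once at the end. It uses base R only, so do.call(rbind, ...) stands in for rbindlist here.

```r
# Create two hypothetical pipe-delimited files so the example is self-contained
tmp_dir <- tempfile("bind_demo_")
dir.create(tmp_dir)
writeLines(c("id|value", "1|a", "2|b"), file.path(tmp_dir, "file1.txt"))
writeLines(c("id|value", "3|c"), file.path(tmp_dir, "file2.txt"))

dfs <- list.files(tmp_dir, full.names = TRUE)

# Pre-allocate one list slot per file, then fill each slot in the loop
files <- vector("list", length(dfs))
for (i in seq_along(dfs)) {
  files[[i]] <- read.delim(dfs[i], header = TRUE, sep = "|")
}

# Bind once, after the loop, so no iteration overwrites another
full_frame <- do.call(rbind, files)
print(full_frame)
```

The key difference from the loop in the question is that every iteration writes to its own slot, files[[i]], and the binding happens only after all files have been read.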
Shamis

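The problem is that read.delim() reads only one file at a time, and your loop overwrites clean_df on every iteration, so only the last file survives. The solution is to use a function like lapply() to read each file listed in your dfs vector into a list, and then bind that list in one step.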

Here's an example, and you can find other answers to your question here.

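library(tidyverse)

# Vector of file paths (placeholder names)
dfs <- c("file1.txt", "file2.txt")
# Read each file into its own data frame, collected in a list
all.files <- lapply(dfs, function(i) read.delim(i, header = TRUE, sep = "|"))
# Stack the list of data frames into one
clean_df <- bind_rows(all.files)
(clean_df)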

Note that you don't need return() here, since it only works inside a function. Wrapping clean_df in parentheses prompts R to print the variable.
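
If you do want this inside a function, a minimal sketch could look like the following. The helper name read_all and the demo files are hypothetical, and base R's do.call(rbind, ...) is used in place of bind_rows so the example runs without extra packages. The value of the last expression in the function body is what gets returned, so an explicit return() is still optional.

```r
# Hypothetical helper: read every pipe-delimited file in `paths` and stack them
read_all <- function(paths) {
  all_files <- lapply(paths, function(p) read.delim(p, header = TRUE, sep = "|"))
  # The value of the last expression is returned; no explicit return() needed
  do.call(rbind, all_files)
}

# Made-up demo files in the session's temporary directory
demo_paths <- file.path(tempdir(), c("demo_a.txt", "demo_b.txt"))
writeLines(c("id|value", "1|a"), demo_paths[1])
writeLines(c("id|value", "2|b", "3|c"), demo_paths[2])

# Wrapping the assignment in parentheses also prints the result
(clean_df <- read_all(demo_paths))
```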

Ava