In the same directory I have many files whose names start with the same prefix, for example "myfile_".

They are CSV files.

I have some code that I would like to run on every file.

The output of this code is 5 variables/columns.

Is there a way to read each file, run the code on its data, and save the results in a data frame that has one column with the file name plus the columns produced by the code?

Reproducible example.

Let's say the following data frames are the files:

employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))

myfile_1 <- data.frame(employee, salary, startdate)

myfile_2 <- read.table(header=TRUE, text="text            salary   
laughter        8.50    
happiness       8.44    
love            8.42    
happy           8.30    
laughed         8.26    
laugh           8.22")

An example of executing code:

sum <- sum(myfile_1$salary)
sub <- sum(myfile_1$salary)/2
addtwo <- sum(myfile_1$salary)+2
subtracttwo <- sum(myfile_1$salary)-2
doubleo <- sum(myfile_1$salary)*2

These are the commands I would like to run for every file, which is why I want to load one file at a time.

The output should be a data frame like this:

filename sum sub addtwo subtracttwo doubleo
myfile_1
myfile_2

with the other columns holding the results of each run.

Pozmanski

2 Answers

I wrote the data frames you supplied to CSV files in a directory called 'input'. Substitute your own directory for 'input'.
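For anyone who wants to reproduce that setup, here is a minimal sketch of how the two example data frames could be written out. It assumes the directory is named 'input' relative to the working directory and that the files are named myfile_1.csv and myfile_2.csv; both names are illustrative choices, so adjust them to your own layout.

```r
# Recreate the two example data frames from the question.
myfile_1 <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-1", "2008-3-25", "2007-3-14"))
)
myfile_2 <- data.frame(
  text   = c("laughter", "happiness", "love", "happy", "laughed", "laugh"),
  salary = c(8.50, 8.44, 8.42, 8.30, 8.26, 8.22)
)

# 'input' is an assumed directory name; create it if it does not exist yet.
dir.create("input", showWarnings = FALSE)

# row.names = FALSE stops write.csv from adding an extra index column.
write.csv(myfile_1, file.path("input", "myfile_1.csv"), row.names = FALSE)
write.csv(myfile_2, file.path("input", "myfile_2.csv"), row.names = FALSE)
```

Both files share a salary column, which is the only column the calculations rely on.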

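# pattern is a regular expression: names starting with "myfile_" and ending in ".csv"
list_files <- list.files(path = 'input', pattern = '^myfile_.*\\.csv$', full.names = TRUE)

do_code <- function(x) {
  dat <- read.csv(x)
  # file name without the directory or the ".csv" extension (base R, no pipe needed)
  new_dat <- data.frame(filename = sub("\\.csv$", "", basename(x)))
  new_dat$sum <- sum(dat$salary)
  new_dat$sub <- sum(dat$salary) / 2
  new_dat$addtwo <- sum(dat$salary) + 2
  new_dat$subtracttwo <- sum(dat$salary) - 2
  new_dat$doubleo <- sum(dat$salary) * 2
  # return the one-row data frame for this file
  new_dat
}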

# option 1: base R
new_dat <- do.call(rbind, lapply(list_files, do_code))

# option 2: purrr package (map_dfr also needs dplyr installed)
library(purrr)
new_dat <- map_dfr(list_files, do_code)

#>   filename      sum      sub   addtwo subtracttwo   doubleo
#> 1 myfile_1 71200.00 35600.00 71202.00    71198.00 142400.00
#> 2 myfile_2    50.14    25.07    52.14       48.14    100.28
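If an explicit loop is easier to follow, the same idea can be sketched with a plain for loop. This is only an illustration under stated assumptions: it writes two small CSV files to a temporary directory so it runs on its own, and the names input_dir, results, and total are hypothetical. Point list.files() at your own directory instead.

```r
# Self-contained setup: write two small CSV files to a temporary directory.
# input_dir is a hypothetical location used only for this sketch.
input_dir <- file.path(tempdir(), "input_demo")
dir.create(input_dir, showWarnings = FALSE)
write.csv(data.frame(salary = c(21000, 23400, 26800)),
          file.path(input_dir, "myfile_1.csv"), row.names = FALSE)
write.csv(data.frame(salary = c(8.50, 8.44, 8.42, 8.30, 8.26, 8.22)),
          file.path(input_dir, "myfile_2.csv"), row.names = FALSE)

# Files whose names start with "myfile_" and end in ".csv".
list_files <- list.files(input_dir, pattern = "^myfile_.*\\.csv$",
                         full.names = TRUE)

# Pre-allocate a list, fill one slot per file, and bind once at the end.
results <- vector("list", length(list_files))
for (i in seq_along(list_files)) {
  dat <- read.csv(list_files[i])
  total <- sum(dat$salary)
  results[[i]] <- data.frame(
    filename    = tools::file_path_sans_ext(basename(list_files[i])),
    sum         = total,
    sub         = total / 2,
    addtwo      = total + 2,
    subtracttwo = total - 2,
    doubleo     = total * 2
  )
}
new_dat <- do.call(rbind, results)
```

Collecting the rows in a list and calling rbind once avoids growing a data frame inside the loop, which gets slow when there are many files.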
sebdalgarno

If you want to read many CSV files at once and store them in a single data frame, you can try this.

Set the working directory first, then run this code:

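# pattern is a regular expression, not a glob: match names ending in ".csv"
files <- list.files(pattern = "\\.csv$")
# First apply read.csv to each file, then rbind the results
Data <- do.call(rbind, lapply(files, function(x) read.csv(x, stringsAsFactors = FALSE)))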
  • Thank you, but no. I want to read every file as in a for loop, execute the code I have, and save the results into a new df. – Pozmanski Mar 04 '18 at 14:16