I have 20 files as .csv; all have the same headers as in the picture.

[Image: screenshot showing what my data looks like]

I want to import them all at once, and I want the timestamp column converted from character format to date-time format.

I used this code for importing all the 20 files, which works fine.

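path <- "~/Google Drive/Plumeflowlabs test/Data from Plume 17 Nov 2020/"

# pattern is a regular expression, so escape the dot and anchor the extension
files <- list.files(path = path, pattern = "\\.csv$")

for (file in files) {
  # Position of the "." that precedes the extension
  perpos <- which(strsplit(file, "")[[1]] == ".")
  # Create one data frame per file, named after the file (spaces removed)
  assign(
    gsub(" ", "", substr(file, 1, perpos - 1)),
    read.csv(paste(path, file, sep = ""))
  )
}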

However, this code does not convert the timestamp.

After that, I want to merge all 20 files into one data frame by timestamp. I need help with that too.

Hala
  • I'd strongly recommend using a list of data frames instead of `assign`. This will make working with them much easier, including column conversion and merging them. [See my answer here for discussion and examples](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207). – Gregor Thomas Nov 17 '20 at 15:00
  • I don't know how to do it in base R. Can the solution include the packages [`dplyr`](https://dplyr.tidyverse.org/) and [`readr`](https://readr.tidyverse.org/) (both are part of the [`tidyverse`](https://www.tidyverse.org/))? – Érico Patto Nov 17 '20 at 15:00
  • Thanks, Gregor. I have always been bad at writing loops. I will try playing with the code; your answer there has a wealth of information that looks very helpful. Today I spent time importing the files manually, one by one, and changing the date format. It took time, but I had to run the correlation analysis today to see how well at least three of my air pollution devices (three files out of 20) agree. – Hala Nov 17 '20 at 19:52

1 Answer

Try this approach. As no data was shared, I cannot test it. Following the sage advice from @GregorThomas, it is better to store the data in a list, like this:

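#Code
path <- "~/Google Drive/Plumeflowlabs test/Data from Plume 17 Nov 2020/"
# pattern is a regular expression; full.names = TRUE returns paths that read.csv can open
files <- list.files(path = path, pattern = "\\.csv$", full.names = TRUE)
#Function to load and transform date
myfun <- function(x)
{
  # read.csv defaults to sep = ","; sep = "" would split on whitespace instead
  df <- read.csv(x)
  # Adjust format to match the files; this one assumes e.g. "17/11/2020 15:00"
  df$timestamp <- as.POSIXct(df$timestamp, format = '%d/%m/%Y %H:%M', tz = 'GMT')
  return(df)
}
#Apply
List <- lapply(files, myfun)
#Names
names(List) <- basename(files)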

The list elements are named after the corresponding files. After that you can process them.
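
As one possible way to process the list, here is a minimal sketch of merging all data frames by timestamp with `Reduce()` and `merge()`. The toy list, the device names (`dev1.csv` and so on) and the column names `PM2.5` and `NO2` are assumptions standing in for the real files, which were not shared.

```r
# Toy stand-in for the list of data frames (hypothetical names and values)
ts <- as.POSIXct("2020-11-17 10:00", tz = "GMT") + 60 * (0:4)
List <- list(
  dev1.csv = data.frame(timestamp = ts,
                        PM2.5 = c(5, 6, 7, 8, 9),
                        NO2 = c(20, 21, 19, 22, 23)),
  dev2.csv = data.frame(timestamp = ts,
                        PM2.5 = c(5.5, 6.1, 7.4, 7.9, 9.2),
                        NO2 = c(19, 22, 20, 21, 24)),
  # This device is missing the first timestamp
  dev3.csv = data.frame(timestamp = ts[2:5],
                        PM2.5 = c(6.3, 6.8, 8.1, 9.0),
                        NO2 = c(21, 18, 23, 22))
)

# Tag every measurement column with its device so that column names
# stay unique after merging
tagged <- Map(function(df, nm) {
  dev <- sub("\\.csv$", "", nm)
  is_measure <- names(df) != "timestamp"
  names(df)[is_measure] <- paste(names(df)[is_measure], dev, sep = "_")
  df
}, List, names(List))

# Fold the list into one data frame, joining on timestamp; all = TRUE keeps
# timestamps that are missing from some devices and fills them with NA
merged <- Reduce(function(x, y) merge(x, y, by = "timestamp", all = TRUE),
                 tagged)
```

Using `all = TRUE` is a design choice: it keeps every timestamp seen by any device. With `all = FALSE` only timestamps present in every file would remain.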

Duck
  • Thanks Duck. It shows me an error. Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed – Hala Nov 17 '20 at 19:53
  • @user1894845 Try placing this code in the first line of the function `df <- read.csv(x,sep="",row.names=NULL)`. As no data was shared, I could not see the files! Let me know if that works! – Duck Nov 17 '20 at 19:55
  • Thanks, Duck. It worked well. What I understand is that there are no separate data frames now; all 20 files have been added to one large list. This process is new to me, so merging the files by timestamp is a bit confusing. The idea of merging the files is to run a correlation analysis to test whether the 20 devices agree well or not. The correlation will be computed separately for each column (PM2.5), (PM10), (NO2). I used to do this with only two or three devices, which is easy. – Hala Nov 17 '20 at 20:16
  • @user1894845 After that you can assign names and then use `list2env(YourList,envir = .GlobalEnv)` and you will get the individual dataframes! – Duck Nov 17 '20 at 20:17
  • @user1894845 This can give you a hint for merging dataframes in a list https://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list – Duck Nov 17 '20 at 20:19