I have multiple CSV files in the same format that I need to combine, but before that I need to do the following:
1. The header is not the first row but the 4th row. Should I remove the first 3 rows with skip, or should I reassign the header?
2. I need to add a column containing the ID of the file (same as the file name) before I combine.
3. Then I need to extract only 4 columns from a total of 7.
4. Sum up the numbers under a category.
5. Combine all CSV files into one.
This is what I have so far. I do steps 1, 3, and 4, then step 2 to add the column, then step 5. I am not sure whether I should add the ID column first.
files = list.files(pattern = "*.csv", full.names = TRUE)
library("tidyverse")
library("dplyr")
data = data.frame()
for (file in files) {
  temp <- read.csv(file, skip = 3, header = TRUE)
  colnames(temp) <- c("Volume", "Unit", "Category", "Surpass Object", "Time", "ID")
  temp <- temp[, c("Volume", "Category", "Surpass Object")]
  temp <- subset(temp, Category == "Surface")
  mutate(id = file)
  aggregate(temp$Volume, by = list(Category = temp$Category), FUN = sum)
}
And I got an error:
Error in is.data.frame(.data) :
argument ".data" is missing, with no default
The code runs fine if I leave out the mutate line, so I think the main problem comes from there, but any advice would be appreciated.
I am quite new to R and would really appreciate any comments I can get here.
Thanks in advance!
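
A note on the likely cause, with a minimal sketch of one possible fix. The error appears because mutate() is called without a data frame as its first argument, and its result is not assigned to anything. The result of aggregate() is also discarded on every iteration, so data stays empty after the loop. The ID column can be added at any point before combining, as long as it is kept through the column selection and included in the grouping. The sketch below is an illustration, not a definitive solution: the helper name summarise_file, the demo files, and the choice to use the file name without its extension as the ID are all assumptions, and it assumes the header row in each file already contains columns named Volume and Category.

```r
library(dplyr)

# Hypothetical demo data: two small files whose header sits on the 4th row.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("meta line 1", "meta line 2", "meta line 3",
             "Volume,Unit,Category",
             "1.5,um^3,Surface",
             "2.5,um^3,Surface",
             "10,um^3,Other"),
           file.path(demo_dir, "a.csv"))
writeLines(c("meta line 1", "meta line 2", "meta line 3",
             "Volume,Unit,Category",
             "3,um^3,Surface"),
           file.path(demo_dir, "b.csv"))

# Hypothetical helper: read one file, tag it with its file name,
# keep the "Surface" rows, and sum Volume per file and category.
summarise_file <- function(file) {
  read.csv(file, skip = 3, header = TRUE) %>%   # skip the 3 rows above the header
    mutate(id = tools::file_path_sans_ext(basename(file))) %>%  # assumed ID: file name without extension
    filter(Category == "Surface") %>%
    group_by(id, Category) %>%                  # grouping by id keeps the ID column
    summarise(Volume = sum(Volume), .groups = "drop")
}

# list.files() takes a regular expression, so "\\.csv$" matches files ending in .csv.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Apply the helper to every file and stack the results into one table.
combined <- bind_rows(lapply(files, summarise_file))
print(combined)
```

With real data, demo_dir would be replaced by the folder holding the CSV files. Inside the original loop, the smallest change would be to write temp <- mutate(temp, id = file) and to append each aggregated result to data, for example with data <- rbind(data, result).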