I have 4 CSV files:
One table with a header beginning on row 1 (iris.csv)

And 3 tables with headers beginning on rows 3, 1, and 5 (sales_1, sales_2, sales_3)

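As long as I know the name of the first column in each table, I can use the smart_csv_reader function to find the row where each header begins and read each CSV file starting from that row. The names in first_columns must be in the same order in which list.files() returns the files (alphabetical):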
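# Name of the first column of each table, in the order list.files() returns the files
first_columns <- c('sepal.length', 'month', 'month', 'month')

smart_csv_reader <- function(directory, first_cols = first_columns) {
  file_names <- list.files(directory, pattern = "\\.csv$")
  paths <- paste0(directory, file_names)  # directory must end with '/'

  # Find the row on which each header begins
  header_begins <- integer(length(paths))
  for (i in seq_along(paths)) {
    lines_read <- readLines(paths[i], warn = FALSE)
    # Keep only the first match in case the name appears again further down
    header_begins[i] <- grep(first_cols[i], lines_read)[1]
  }
  print('headers detected on rows:')
  print(header_begins)

  # Read each file, skipping the rows above its header
  tables <- vector("list", length(paths))
  for (i in seq_along(paths)) {
    tables[[i]] <- read.csv(paths[i], skip = header_begins[i] - 1)
  }
  return(tables)
}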
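The detection step inside the function can also be tried on its own. Here is a minimal sketch, assuming made-up file contents: the lines_read vector below is hypothetical and is not one of the four files above.

```r
# Hypothetical contents of a CSV with two junk rows above the header
lines_read <- c(
  "Sales report,,",
  ",,",
  "month,units,revenue",
  "Jan,10,100",
  "Feb,12,130"
)

# grep() returns the index of every line matching the pattern;
# the first match is taken as the header row
header_row <- grep("month", lines_read)[1]
print(header_row)  # 3

# read.csv() can parse the same lines through its text argument;
# skip drops the rows above the header
sales <- read.csv(text = lines_read, skip = header_row - 1)
print(names(sales))
```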
Just pass in the directory where all your CSVs are, including the trailing slash.
Usage:
smart_csv_reader('some_csvs/')
[1] "headers detected on rows:"
[1] 1 3 1 5
As you can see, the function reports the correct header row for each table. It also returns a list of the tables, each read starting from its header row.
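To work with the returned list, it may help to name each element after its file. The following is only a sketch under stated assumptions: read_from_header is a hypothetical single-file variant of the function above, and the two demo files and their contents are invented for illustration.

```r
# Hypothetical helper: read one CSV starting at the first row that contains first_column
read_from_header <- function(path, first_column) {
  lines_read <- readLines(path, warn = FALSE)
  header_row <- grep(first_column, lines_read, fixed = TRUE)[1]
  read.csv(path, skip = header_row - 1)
}

# Made-up demo files written to a temporary directory
demo_dir <- tempfile("csv_demo")
dir.create(demo_dir)
writeLines(c("a,b", "1,2", "3,4"), file.path(demo_dir, "clean.csv"))
writeLines(c("Report,", ",", "month,units", "Jan,10"), file.path(demo_dir, "messy.csv"))

file_names <- list.files(demo_dir, pattern = "\\.csv$")
first_cols <- c("a", "month")  # same (alphabetical) order as file_names

# Read every file, then name each list element after its file
tables <- Map(read_from_header, file.path(demo_dir, file_names), first_cols)
names(tables) <- file_names

# Tables can now be looked up by file name instead of by position
str(tables[["messy.csv"]])
```

Naming the elements avoids having to remember which position in the list belongs to which file.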
