Error while merging multiple CSV files on `bind_rows` also adding some redundant columns

Question

I am merging all my CSV files into a single file using the code below.

path = "C:\\Users\\User\\Documents\\data\\Reports\\2013"

combined2013 <- list.files(path = path,
                           pattern = "\\.csv$",  # regex: file names ending in .csv
                           full.names = TRUE) %>% 
  lapply(read_csv) %>% 
  bind_rows()

combined2013["Year"] <- 2013

write.csv(combined2013, "C:\\Users\\User\\Documents\\data\\Reports\\2013\\Reports_2013.csv", 
          row.names = FALSE, 
          fileEncoding = "UTF-8")  # write.csv takes fileEncoding, not encoding

This code worked earlier, but now it gives me the following error:

Error in `bind_rows()`:
! Can't combine `updated_at` <character> and `updated_at` <datetime<UTC>>.
Backtrace:
 1. ... %>% bind_rows
 2. dplyr::bind_rows(.)
 5. vctrs::vec_rbind(!!!dots, .names_to = .id)

I have tried:

combined2013 %>% mutate(across(where(is.character), as.datetime))

But it still doesn't work. The merge also adds some redundant columns:

Warning: One or more parsing issues, see `problems()` for details
Rows: 2222 Columns: 30
-- Column specification -----------------------------------------------------
Delimiter: ","
chr  (21): id, title, text, video_timestamp, user_pseudo_id, created_at, ...
dbl   (1): sentiment
lgl   (7): video_id, collab_space_id, collab_space_title, deleted, closed...
dttm  (1): updated_at

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
New names:
* `` -> `...29`
* `` -> `...30`

I am using LibreOffice to preserve the UTF-8 encoding, because I have some German text that was not handled correctly in an earlier merge.

It sounds like `read_csv` is parsing `updated_at` as character in some files and as datetime in others, presumably because its type-guessing heuristics don't behave consistently on your data. For instance, a date string like "04/05/22" is ambiguous: it could be "%m/%d/%y" as in the US or "%d/%m/%y" as in some European countries, whereas "04/30/22" is unambiguous and might be converted to a date. To fix this, specify the format you want manually, or read the column in as character and convert it downstream. — Jon Spring, Jun 14 '22 at 19:35
Your redundant columns are probably coming in because in some files the 29th and 30th columns have some data with no header. — Jon Spring, Jun 14 '22 at 19:36
[See here](https://stackoverflow.com/q/5963269/5325862) on making a reproducible example that is easier for folks to help with, including a sample of data — camille, Jun 14 '22 at 19:37
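A minimal sketch of what the comments above suggest, not a tested fix for the real files: read every column as character so `bind_rows` has nothing to reconcile, drop the auto-named `...N` columns, and convert `updated_at` afterwards. The demo files, the helper name `read_as_text`, the `"n/a"` placeholder, and the timestamp format `"%Y-%m-%d %H:%M:%S"` are all assumptions made for illustration; the real data may need a different format string.

```r
library(readr)
library(dplyr)

# Hypothetical demo data standing in for the real report files.
demo_dir <- file.path(tempdir(), "reports_demo")
dir.create(demo_dir, showWarnings = FALSE)
write_lines(c("id,updated_at",
              "1,2013-01-05 10:00:00",
              "2,2013-01-06 11:30:00"),
            file.path(demo_dir, "a.csv"))
# This file has a non-date value in updated_at and two unnamed extra columns,
# which readr names ...3 and ...4.
write_lines(c("id,updated_at,,",
              "3,n/a,x,y"),
            file.path(demo_dir, "b.csv"))

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file with every column as character,
# then drop the columns readr auto-named because their header was empty.
read_as_text <- function(file) {
  read_csv(file,
           col_types = cols(.default = col_character()),
           show_col_types = FALSE) %>%
    select(-matches("^\\.\\.\\.[0-9]+$"))
}

combined <- files %>%
  lapply(read_as_text) %>%
  bind_rows()

# Convert the timestamp once, with an explicit (assumed) format.
# Values listed in `na` become NA instead of raising a parsing problem.
combined <- combined %>%
  mutate(updated_at = parse_datetime(updated_at,
                                     format = "%Y-%m-%d %H:%M:%S",
                                     na = c("", "NA", "n/a")))
```

Dropping the `...N` columns assumes they hold nothing useful, so it is worth inspecting them in the offending files first. It may also be worth checking whether `Reports_2013.csv` itself sits in the folder being read, since a second run would then pick it up as input; that is only a guess at why the code used to work.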
