I would like to import multiple CSV files into one tibble using dplyr, and also capture the name of each file in a new field as part of the import.

Let's say file01.txt has this data

F1, F2
A, 1
B, 2
C, 3

and file02.txt has this data

F1, F2
D, 4
E, 5
F, 6

and file03.txt has this data

F1, F2
H, 7
I, 8
J, 9

I can make a vector with the file names:

Fnms <- c("file01.txt", "file02.txt", "file03.txt" )

... and read the first file and add the first file name in a new field.

library(readr)  # provides read_csv()
tmp <- read_csv(Fnms[1])
tmp$Fnm <- Fnms[1]

To read the other two files and append their names I tried a loop structure and messy if-else statements:

for (i in 2:3){

  tmp <- tmp %>% 
         bind_rows(read_csv(Fnms[i])) %>% 
         mutate(Fnm = ifelse(row_number() > 3 & row_number() <= 6, Fnms[i], Fnm )) %>% 
         mutate(Fnm = ifelse(row_number() > 6 , Fnms[i], Fnm ))
}

This produces the following incorrect result (I am also not sure why the first mutate is not skipped when i = 3):

> tmp
# A tibble: 9 × 3
  F1       F2 Fnm       
  <chr> <dbl> <chr>     
1 A         1 File01.txt
2 B         2 File01.txt
3 C         3 File01.txt
4 D         4 File03.txt
5 E         5 File03.txt
6 F         6 File03.txt
7 G         7 File03.txt
8 H         8 File03.txt
9 I         9 File03.txt

I have a solution that works in base R using a loop and rbind, but I'm keen to find a dplyr solution as I'm still learning this newer version of the R language!

Markm0705
  • 1
    A few options here: https://stackoverflow.com/questions/65000995/how-can-i-read-multiple-csv-files-into-r-at-once-and-know-which-file-the-data-is – Stewart Macdonald Feb 07 '23 at 09:26
  • 5
    `read_csv()` can read multiple identically formatted files and add an identifier, so you can just do `read_csv(Fnms, id = "Fnm")`. – Ritchie Sacramento Feb 07 '23 at 09:28
  • Ritchie that's the sort of solution I was looking for ! – Markm0705 Feb 07 '23 at 10:04
  • I have re-opened the question since none of the answers at https://stackoverflow.com/questions/65000995/how-can-i-read-multiple-csv-files-into-r-at-once-and-know-which-file-the-data-is provide a dplyr solution which is what the question asked for. – G. Grothendieck Feb 07 '23 at 11:18
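
The one-call approach from Ritchie Sacramento's comment can be sketched as follows. This is a minimal illustration, assuming readr 2.0.0 or later (where `read_csv()` accepts a vector of paths and an `id` argument). The temporary directory and the recreated example files are assumptions added only so the snippet runs on its own; `basename()` is used because the `id` column holds the path as it was passed in.

```r
library(readr)

# Recreate the three example files in a temporary directory (illustration only)
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
writeLines(c("F1,F2", "A,1", "B,2", "C,3"), file.path(demo_dir, "file01.txt"))
writeLines(c("F1,F2", "D,4", "E,5", "F,6"), file.path(demo_dir, "file02.txt"))
writeLines(c("F1,F2", "H,7", "I,8", "J,9"), file.path(demo_dir, "file03.txt"))

Fnms <- file.path(demo_dir, c("file01.txt", "file02.txt", "file03.txt"))

# Read all files in one call; `id` names the column that records the source path
tmp <- read_csv(Fnms, id = "Fnm", show_col_types = FALSE)

# Keep only the file name rather than the full temporary path
tmp$Fnm <- basename(tmp$Fnm)
tmp
```

This only works when every file has the same columns, which is the case for the example data in the question.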

1 Answer

The question asked for a dplyr solution, so try this:

library(dplyr) # version 1.1.0 or later

File <- c("file01.csv", "file02.csv")
data.frame(File) %>%
  reframe(read.csv(File), .by = File)

giving:

        File F1 F2
1 file01.csv  A  1
2 file01.csv  B  2
3 file01.csv  C  3
4 file02.csv  D  4
5 file02.csv  E  5
6 file02.csv  F  6
G. Grothendieck
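
On the side question of why the first mutate is not skipped when i = 3: `ifelse()` is vectorised and is evaluated against every row on every pass of the loop, and neither condition depends on `i`. On the last pass, rows 4 to 6 still satisfy `row_number() > 3 & row_number() <= 6`, so their `Fnm` is overwritten with `Fnms[3]`. A simpler pattern is to tag each file's rows before binding, so no row arithmetic is needed. The following is a minimal sketch under the assumption that the three example files are recreated in a temporary directory; the directory and the anonymous helper function are illustrative and not part of the original posts.

```r
library(dplyr)

# Recreate the three example files in a temporary directory (illustration only)
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
writeLines(c("F1,F2", "A,1", "B,2", "C,3"), file.path(demo_dir, "file01.txt"))
writeLines(c("F1,F2", "D,4", "E,5", "F,6"), file.path(demo_dir, "file02.txt"))
writeLines(c("F1,F2", "H,7", "I,8", "J,9"), file.path(demo_dir, "file03.txt"))

Fnms <- file.path(demo_dir, c("file01.txt", "file02.txt", "file03.txt"))

# Read each file, add its own name as a column, then stack the pieces
tmp <- lapply(Fnms, function(f) {
  read.csv(f) %>% mutate(Fnm = basename(f))
}) %>%
  bind_rows()

tmp
```

Because each data frame carries its file name before `bind_rows()` is called, the result does not depend on how many rows each file contains.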