How to import multiple .csv files that contain the same information at once?

Question

I am trying to import multiple CSV files at once. All of my CSV files have exactly the same format (variables), so when I use the code I found (shown below), I cannot tell my datasets apart.

### the code I used
library(readr)  # provides read_csv()
temp = list.files(pattern="*.csv", full.names=TRUE)
myfiles = lapply(temp, read_csv)

This code works fine, but I cannot tell which list element came from which CSV file. Is there any way to use the same code, or maybe another approach, so that I can import multiple CSV files and see the name of the CSV file attached to each imported dataset?

# this is an example of my output
 myfiles
[[1]]
# A tibble: 10 x 2
      mm     prob
   <dbl>    <dbl>
 1     0 0.0002  
 2     2 0.000300
 3     3 0.00580 
 4     4 0.007   
 5     5 0.006   
 6     8 0.02    
 7    10 0.032   
 8    12 0.015   
 9    13 0.045   
10    15 0.051   

[[2]]
# A tibble: 10 x 2
      mm    prob
   <dbl>   <dbl>
 1     1 0.002  
 2     2 0.003  
 3     3 0.00580
 4     4 0.007  
 5     5 0.006  
 6     6 0.01   
 7     7 0.03   
 8     8 0.011  
 9     9 0.02   
10    10 0.04   

[[3]]
# A tibble: 11 x 2
      mm   prob
   <dbl>  <dbl>
 1     0 0.0001
 2     4 0.0004
 3     5 0.0005
 4     8 0.007 
 5    10 0.0075
 6    15 0.03  
 7    20 0.042 
 8    23 0.05  
 9    25 0.052 
10    27 0.064 
11    30 0.071 

[[4]]
# A tibble: 10 x 2
      mm     prob
   <dbl>    <dbl>
 1     0 0.0002  
 2     2 0.000300
 3     3 0.00580 
 4     4 0.007   
 5     5 0.006   
 6     8 0.02    
 7    10 0.032   
 8    12 0.015   
 9    13 0.045   
10    15 0.051   

# my csv files have different names: g1_a.csv, g2_b.csv, g3_c.csv ...

The desired output would look something like this:

 myfiles
[[1]]
# name of the file attached to the dataset
#g1_a
# A tibble: 10 x 2
      mm     prob
   <dbl>    <dbl>
 1     0 0.0002  
 2     2 0.000300
 3     3 0.00580 
 4     4 0.007   
 5     5 0.006   
 6     8 0.02    
 7    10 0.032   
 8    12 0.015   
 9    13 0.045   
10    15 0.051   

[[2]]
#g2_b
# A tibble: 10 x 2
      mm    prob
   <dbl>   <dbl>
 1     1 0.002  
 2     2 0.003  
 3     3 0.00580
 4     4 0.007  
 5     5 0.006  
 6     6 0.01   
 7     7 0.03   
 8     8 0.011  
 9     9 0.02   
10    10 0.04   

[[3]]
#g3_c
# A tibble: 11 x 2
      mm   prob
   <dbl>  <dbl>
 1     0 0.0001
 2     4 0.0004
 3     5 0.0005
 4     8 0.007 
 5    10 0.0075
 6    15 0.03  
 7    20 0.042 
 8    23 0.05  
 9    25 0.052 
10    27 0.064 
11    30 0.071

Thank you in advance for your help.

Check this other question, I think it may help: https://stackoverflow.com/questions/65865409/read-multiple-files-but-keep-track-of-which-file-is-which-dataframe-in-r/65865668#65865668 – GuedesBF, Feb 05 '21 at 01:01

score 3 · Answer 1 · answered Feb 05 '21 at 00:58


Just add this line at the end of your code:

myfiles <- setNames(myfiles, basename(temp))
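To illustrate what that line does, here is a minimal, self-contained sketch. It assumes a throwaway folder under `tempdir()` and two made-up files, and it uses base R's `read.csv` instead of `read_csv` so it runs without extra packages; the folder name `csv_demo` and the data values are invented for the example.

```r
# Create a throwaway folder with two small CSV files (hypothetical data).
dir <- file.path(tempdir(), "csv_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(mm = c(0, 2), prob = c(0.0002, 0.0003)),
          file.path(dir, "g1_a.csv"), row.names = FALSE)
write.csv(data.frame(mm = c(1, 2), prob = c(0.002, 0.003)),
          file.path(dir, "g2_b.csv"), row.names = FALSE)

# Read every CSV in the folder, then name each list element after its file.
temp <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
myfiles <- lapply(temp, read.csv)
myfiles <- setNames(myfiles, basename(temp))

print(names(myfiles))         # the file names, in the order list.files() returned them
print(myfiles[["g2_b.csv"]])  # look up a dataset by its file name
```

Once the list is named, printing `myfiles` shows `$g1_a.csv`, `$g2_b.csv`, and so on in place of `[[1]]`, `[[2]]`.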

GordonShumway

score 2 · Accepted Answer · edited Feb 05 '21 at 19:16


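Maybe you should try:

library(readr)    # read_csv()
library(stringr)  # str_remove()

filenames = list.files(pattern="\\.csv$", full.names=TRUE)
myfiles = lapply(filenames, read_csv)

# I added this line and it is working
myfiles = setNames(myfiles, basename(filenames))

# drop the ".csv" extension from the names (the dot is escaped so it is matched literally)
names(myfiles) <- str_remove(names(myfiles), "\\.csv$")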
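One detail worth checking when stripping the extension: the pattern given to `str_remove()` is a regular expression, so an unescaped `.csv` can match more than the extension. The sketch below uses base R's `sub()`, which also removes only the first match, as a stand-in for `str_remove()`. The file name `my_csv_data.csv` is hypothetical, chosen only to trigger the problem.

```r
files <- c("g1_a.csv", "my_csv_data.csv")

# Unescaped ".csv" is a regex: "." matches any character, so in the second
# name the first match is "_csv" in the middle, not the extension.
loose <- sub(".csv", "", files)

# Escaping the dot and anchoring at the end removes only the extension.
anchored <- sub("\\.csv$", "", files)

# tools::file_path_sans_ext() (shipped with base R) does the same job.
sans_ext <- tools::file_path_sans_ext(files)

print(loose)     # "g1_a" "my_data.csv"
print(anchored)  # "g1_a" "my_csv_data"
print(sans_ext)  # "g1_a" "my_csv_data"
```

For file names like `g1_a.csv` both patterns give the same result, so the difference only shows up when "csv" also appears elsewhere in a name.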

edited Feb 05 '21 at 19:16 by Janet

answered Feb 05 '21 at 01:08 by GuedesBF

I used your approach but I get this error: `Error in str_replace(string, pattern, "") : argument "pattern" is missing, with no default` – Janet Feb 05 '21 at 01:18
There was one parenthesis missing. Fixed it – GuedesBF Feb 05 '21 at 01:22
Yes, I know, this is the weird thing! I copied your code and it is giving me an error about `str_replace`, which you are not using. ?? – Janet Feb 05 '21 at 01:25
Removed my first comment. str_remove() actually implicitly calls str_replace(). Try my updated code. – GuedesBF Feb 05 '21 at 01:27
It is working now but giving me `NA` as names in the list. I cleared everything and tried again but I am still seeing `NA`. Any thoughts? – Janet Feb 05 '21 at 01:31
So sorry, I got some of the variable names wrong. I think it is ok now – GuedesBF Feb 05 '21 at 01:34
Actually, I added @GordonShumway's line of code to yours before using `str_remove` and it is working! – Janet Feb 05 '21 at 01:34
If you believe a question was adequately answered, you can accept the answer. – GuedesBF Feb 05 '21 at 01:37

The code is still giving `NA` as names, but I combined both answers, yours and @GordonShumway's, to run the code – Janet Feb 05 '21 at 01:39

score 2 · Answer 3 · answered Feb 05 '21 at 02:46

There is also a package called libr that is designed for this situation exactly. It will load a directory of data sets into a list, with each list item named according to the file name. It is very easy to use. Here is an example:

library(libr)

libname(dat, "<directory>", "csv")

Your datasets will be loaded into the variable named "dat". You can then also load them into the workspace with the following command:

lib_load(dat)

The datasets will be loaded with a two-level syntax, like: dat.g1_a, dat.g2_b, dat.g3_c, etc. so it is easy to reference them.

When you are done, just unload them, and it will clean up the workspace:

lib_unload(dat)

This is really amazing and fast and gives a lot of information about the imported files! Thank you very much @David! — Janet, Feb 05 '21 at 02:54

score 2 · Answer 4 · answered Feb 05 '21 at 04:38


You can use sapply with simplify = FALSE, which names the list elements after the input file paths directly.

temp = list.files(pattern="*.csv", full.names=TRUE)
result <- sapply(temp, read.csv, simplify = FALSE)
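Because `full.names=TRUE` is used, the names that `sapply` attaches are the file paths rather than bare file names. The sketch below shows this and one possible way to shorten the names afterwards. The folder `sapply_demo` and its two files are made up for the example, and the clean-up step is an addition, not part of the answer above.

```r
# Create a throwaway folder with two small CSV files (hypothetical data).
dir <- file.path(tempdir(), "sapply_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(mm = 0:1, prob = c(0.1, 0.2)),
          file.path(dir, "g1_a.csv"), row.names = FALSE)
write.csv(data.frame(mm = 2:3, prob = c(0.3, 0.4)),
          file.path(dir, "g2_b.csv"), row.names = FALSE)

temp <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# sapply() keeps a list when simplify = FALSE and, because temp is a
# character vector, uses its elements (the paths) as the list names.
result <- sapply(temp, read.csv, simplify = FALSE)
path_names <- names(result)

# Optional clean-up: keep only the file name without directory or extension.
names(result) <- tools::file_path_sans_ext(basename(names(result)))

print(names(result))
print(result$g1_a)
```

After the clean-up, each dataset can be reached as `result$g1_a`, `result$g2_b`, and so on.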

Ronak Shah
