0

I have multiple csv files (1 file for each city) having following structure.

An example of two out of many files:

CSV File 1:

City_Name  Pop  PopDen  Lit
MI  45000  280  78
MI  37000  310  89
...
...

CSV File 2:

City_Name  Pop  PopDen  Lit
CH  97000  150  74
CH  67000  220  82
...
...

I want to import these csv files into SQL database as 1 table using R. I am not able to find a function to do it. Could anyone please help.

Desired Output (1 Table in SQL):

City_Name  Pop  PopDen  Lit
MI  45000  280  78
MI  37000  310  89
...
...
CH  97000  150  74
CH  67000  220  82
...
  • If the files are not huge, you can load them all into a single [list of frames](https://stackoverflow.com/a/24376207/3358227), combine them (e.g., `data.table::rbindlist` or `dplyr::bind_rows`), and then insert a single frame into a single table. – r2evans Jul 13 '23 at 22:32
  • File size varies from 16MB to 96MB. – Michael_Brun Jul 13 '23 at 22:35
  • Lacking anything else, I'm inferring that "multiple" means more than you are able (and you've tested) to load into a local R instance. Gotcha. There is no R function that automatically reads a file and inserts it into a SQL database. It can likely be scripted (bash, python, and R, equally well) so long as you can figure out why you are unable to insert data into an existing table. I don't know how to help, it works for me every week (both SQL Server and Postgres). FYI, your other question lists both postgres and SQL Server, which is it? Might not change things much, but still – r2evans Jul 13 '23 at 22:40
  • @r2evans It is postgres. – Michael_Brun Jul 13 '23 at 23:02
  • 1
    Just curious: why do you need R in the middle here. Can't you just import each CSV directly to SQL with the psql \copy command? – Micha Jul 14 '23 at 08:57
  • @Micha because I have 100s of csv files – Michael_Brun Jul 14 '23 at 14:10
  • @Michael_Brun It should be possible to write a loop in a command shell, and go thru all the 100's of files. What OS? What scripting language are you comfortable with? – Micha Jul 15 '23 at 12:20
  • @Micha OS is Windows. I use R, I am a beginner, though.. Could you please help me how to do it. – Michael_Brun Jul 15 '23 at 13:33

2 Answers2

3

I am not so familiar with Windows, but since it's so popular, you shouldn't have any problem finding examples of looping over *.csv files. First prepare a psql statement that uploads the data from one csv. It will be something like:

psql -c '\copy your_table(City_Name,Pop, PopDen, Lit) FROM '/path/to/csv/.txt' WITH (FORMAT CSV)'

Note: you'll have to enter the full path on Windows to the psql utility.

Test that it works for one file, then loop over all CSV files (in cmd.exe window):

FOR %f in (dir /b *.csv) DO 
psql -c '\copy your_table(City_Name,Pop, PopDen, Lit) FROM '%f' WITH (FORMAT CSV)'
DONE

Something along those lines...

Micha
  • 403
  • 2
  • 13
1

Using read.csv() is not a good option to import multiple large CSV files into R Data Frame. In order to read multiple CSV files or all files from a folder in R, use data.table package. data.table is a third-party library, you need to first install it by using install.packages('data.table').

# Use data.table package
  library(data.table)
  df <- 
   list.files(path = "/Users/admin/apps/csv-courses/", pattern = "*.csv") %>% 
   map_df(~fread(.))
df
thagraybush
  • 131
  • 1
  • 6
  • 3
    You might have to load some more packages for this answer to work - `map_df` is not a standard inclusion, as is `%>%` – thelatemail Jul 13 '23 at 23:45
  • @thagraybush I used magrittr (for %>%) and purr (for map function) packages along with data.table. I tried the code but it is given me the error: Error in `map()`: ℹ In index: 1. Caused by error in `fread()`: ! File 'MI_annual.csv' does not exist or is non-readable. getwd()=='C:/Users/Michael/Documents' But if I use only list.files, I can view my file. What could be the reason for this error and how to resolve it. Kindly help. – Michael_Brun Jul 14 '23 at 14:27
  • @Michael_Brun Sorry just seeing this comment, windows paths are a pain in the a**. Try using double quotes, triple set of double quotes, etc. Try changing the path from / to \. There is a character limit on import of windows path names through Windows APIs as well (if your file name is long). You can read more about that here: https://stackoverflow.com/questions/31574761/r-doesnt-see-a-file-that-exists-on-a-disk – thagraybush Aug 13 '23 at 04:53