
I've seen similar posts on importing CSVs, but none in which the file URLs follow a common pattern and differ only by a name taken from a list. I'd like to write a loop that goes through the list of names and imports each CSV.

I have a list of 500 CSV files located on a website that I'd like to import into R. Each of the files has a similar URL. For example, five of the URLs may look like:

https://www.website.com/datasets/Dog/1234.csv
https://www.website.com/datasets/Cat/1234.csv
https://www.website.com/datasets/Turtle/1234.csv
https://www.website.com/datasets/Bird/1234.csv
https://www.website.com/datasets/Cow/1234.csv

The actual CSV data looks like this:

Date,Open,High,Low
2018-01-25,174.505,174.95,170.53

The first line holds the four column names.

The only part of the URL that changes is the name of the animal. I have the list of 500 animals in a separate XLS file.

I know how to import a single CSV file into a single dataframe:

Dog<-read.csv('https://www.website.com/datasets/Dog/1234.csv')

But how can I import all of them at once into one data frame, based on the list of animals in the separate XLS file? I was thinking I could save the list of animals as a separate variable and then loop through the list, but I am stuck there:

List <- as.character(read.csv('https://www.website.com/datasets/AnimalList/1234.csv', stringsAsFactors = FALSE, header = FALSE))

The final dataframe should look something like this for 5 animals, with the columns named as Date, Open, High, Low, and Animal.

Date        Open  High  Low  Animal
2018-01-17  156   111   196  Dog
2018-01-18  133   153   112  Dog
2018-01-19  194   182   117  Dog
2018-01-20  199   158   109  Dog
2018-01-21  137   151   145  Dog
2018-01-22  164   192   141  Dog
2018-01-23  152   113   128  Dog
2018-01-24  125   114   175  Dog
2018-01-25  152   132   112  Dog
2018-01-26  149   125   139  Dog
2018-01-17  118   128   134  Cat
2018-01-18  168   136   107  Cat
2018-01-19  187   150   185  Cat
2018-01-20  122   178   190  Cat
2018-01-21  112   186   169  Cat
2018-01-22  120   192   189  Cat
2018-01-23  134   149   106  Cat
2018-01-24  195   172   172  Cat
2018-01-25  192   162   113  Cat
2018-01-26  198   170   118  Cat
2018-01-17  160   188   129  Turtle
2018-01-18  100   111   129  Turtle
2018-01-19  165   101   145  Turtle
2018-01-20  200   130   174  Turtle
2018-01-21  130   113   130  Turtle
2018-01-22  189   101   169  Turtle
2018-01-23  185   146   104  Turtle
2018-01-24  126   177   102  Turtle
2018-01-25  143   102   167  Turtle
2018-01-26  107   168   151  Turtle
2018-01-17  193   121   169  Bird
2018-01-18  148   134   164  Bird
2018-01-19  199   192   106  Bird
2018-01-20  138   160   124  Bird
2018-01-21  105   140   161  Bird
2018-01-22  182   170   185  Bird
2018-01-23  119   171   172  Bird
2018-01-24  154   115   130  Bird
2018-01-25  104   105   158  Bird
2018-01-26  100   153   169  Bird
2018-01-17  191   192   187  Cow
2018-01-18  187   128   107  Cow
2018-01-19  198   135   114  Cow
2018-01-20  170   110   185  Cow
2018-01-21  141   119   112  Cow
2018-01-22  173   159   173  Cow
2018-01-23  139   186   155  Cow
2018-01-24  169   178   172  Cow
2018-01-25  101   149   155  Cow
2018-01-26  157   178   161  Cow
user6883405

1 Answer

You can use purrr to loop over the animal names with map_df, which calls a function on each name and row-binds the results into a single data frame.

Prep all names

library(tidyverse)

#all animal names
animal.names <- c("Dog", "Cat", "Turtle", "Bird", "Cow")

Create basic function for map_df

  • create url based on names
  • read in csv via url path
  • add animal name as id column
  • move on to the next animal file

The function:

scrape_csv <- function(animal.name){
  # build the URL for this animal
  animal.url <- paste0("https://www.website.com/datasets/", animal.name, "/1234.csv")
  # read the CSV straight from the URL
  df <- read_csv(animal.url)
  # record which animal the rows came from
  df$Animal <- animal.name
  return(df)
}

Map each name to the function and store the results:

df_results <- animal.names %>% map_df( scrape_csv )
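
With 500 remote files, a single missing or malformed CSV would stop the whole run partway through. Below is a hedged sketch of one way to guard against that using base R only. The local temporary files and the helper names make_path and read_one are hypothetical stand-ins, used so the example runs without the real website:

```r
# Hypothetical stand-in data: write two small CSVs to a temp directory.
base.dir <- tempfile("datasets")
dir.create(base.dir)
for (animal in c("Dog", "Cat")) {
  dir.create(file.path(base.dir, animal))
  write.csv(
    data.frame(Date = c("2018-01-25", "2018-01-26"),
               Open = c(1, 2), High = c(3, 4), Low = c(0, 1)),
    file.path(base.dir, animal, "1234.csv"),
    row.names = FALSE
  )
}

# In real use this would be the paste0() URL built in the function above.
make_path <- function(animal) file.path(base.dir, animal, "1234.csv")

# Read one file; return NULL (with a warning) instead of stopping on failure.
read_one <- function(animal) {
  tryCatch({
    df <- read.csv(make_path(animal), stringsAsFactors = FALSE)
    df$Animal <- animal
    df
  }, error = function(e) {
    warning("Skipping ", animal, ": ", conditionMessage(e))
    NULL
  })
}

# "Turtle" has no file here, so it is skipped rather than aborting the loop.
animal.names <- c("Dog", "Cat", "Turtle")
results <- lapply(animal.names, read_one)
df_results <- do.call(rbind, results)  # NULL entries are dropped by rbind
```

The same tryCatch wrapper should also work inside a function passed to map_df. Note that in purrr 1.0.0 and later, map_df is marked as superseded in favor of map() followed by list_rbind().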
Seth Raithel