I have yearly stock data for the last 15 years in a folder containing 15 files (one file per year). This folder is also set as my working directory. I can read each file separately and save it to a variable, but I want to write a loop or function that reads all the files and creates a variable for each year. I have tried the following code, but I cannot get the desired result. Any help?

Reading each file separately:

allData_2000 <- read.csv("......../Data_1999-2015/scrip_high_low_year_2000.txt",sep = ",", header = TRUE, stringsAsFactors = FALSE)

allData_2001 <- read.csv("......../Data_1999-2015/scrip_high_low_year_2001.txt",sep = ",", header = TRUE, stringsAsFactors = FALSE)

But I would like to read all the files using a loop:

path <- "....Data_1999-2015"
files <- list.files(path=path, pattern="*.txt")

for(file in files)
{
        perpos <- which(strsplit(file, "")[[1]]==".")
        assign(
                gsub(" ","",substr(file, 1, perpos-1)), 
                read.csv(paste(path,file,sep=",",header = TRUE, stringsAsFactors = FALSE)))
}
Avinash Raj
Archie
  • I get the error "object files not found". When I check with list.files(path), it shows me all the files, but with your method I get the above error message. – Archie Dec 25 '15 at 22:07
  • try `list.files(path=path, pattern="*.txt", full.names = TRUE)` – Rentrop Dec 25 '15 at 22:16
  • Ah, I see your error(s). First, note that your `paste` call is swallowing all the `read.csv` arguments. Then note that your path and file name are not joined by a separating slash, so you would get something like `......../Data_1999-2015scrip_high_low_year_2000.txt`, which is not a file, of course. Try replacing the whole `read.csv` part with `read.csv(paste0(path, '/', file),sep=",",header = TRUE, stringsAsFactors = FALSE)`. – goodtimeslim Dec 25 '15 at 22:32
  • @Archie are all your files starting with : "scrip_high_low_year_" ? – CuriousBeing Dec 25 '15 at 22:45
  • It would be better to create a list of data frames, rather than creating individual objects with related names. It's easier to do, and more flexible. – Matthew Lundberg Dec 26 '15 at 00:07
  • @MAxPD - Yes, all the files in the folder start with "scrip_high_low_year_" and have a .txt extension. – Archie Dec 26 '15 at 09:17

2 Answers

Try this improved code:

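library(tools)       # file_path_sans_ext()
library(data.table)  # fread()

# The question's files end in .txt; pattern is a regular expression, not a glob
files <- list.files(pattern = "\\.txt$")
for (f in files) {
  # Keep only the digits of the file name (the year) to build e.g. allData_2000
  year <- gsub("[^0-9]", "", file_path_sans_ext(f))
  assign(paste0("allData_", year), fread(f))
}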
CuriousBeing

Try something like this, maybe.

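df_list = list()
counter = 1
for(file in files){
  # Build the full path explicitly; path and file need a '/' between them
  temp_df = read.csv(paste0(path, '/', file), header = TRUE, stringsAsFactors = FALSE)
  # Keep only the digits of the file name, i.e. the year
  temp_df$year = gsub('[^0-9]', '', file)
  df_list[[counter]] = temp_df
  counter = counter + 1
}
# Stack the yearly data frames into one
big_df = do.call(rbind, df_list)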

Create an empty list, then iterate through the files, reading them in. Remove all non-numeric characters from the file name to get the year, and store it as a new column. This relies on what your file names look like above (some text followed by the year); if your files are named differently, you'll need a different method than the gsub I used. Then store the whole data frame in the list, and bind the data frames into a single data frame at the end.
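The same idea can be written without a counter. This is only a minimal sketch, assuming the files are comma-separated, have a header row and share the same columns. The temporary folder and the two tiny sample files (with made-up columns `scrip`, `high`, `low`) are hypothetical stand-ins so that the example runs on its own; point `path` at your real folder instead.

```r
# Hypothetical stand-in for the real data folder: two tiny sample files
path <- file.path(tempdir(), "stock_demo")
dir.create(path, showWarnings = FALSE)
for (yr in c(2000, 2001)) {
  write.csv(data.frame(scrip = c("A", "B"), high = c(10, 20), low = c(5, 15)),
            file.path(path, paste0("scrip_high_low_year_", yr, ".txt")),
            row.names = FALSE)
}

# pattern is a regular expression; full.names = TRUE returns usable paths
files <- list.files(path, pattern = "\\.txt$", full.names = TRUE)

# Read each file and tag its rows with the year taken from the file name
df_list <- lapply(files, function(f) {
  df <- read.csv(f, header = TRUE, stringsAsFactors = FALSE)
  df$year <- gsub("[^0-9]", "", basename(f))
  df
})

# Stack all years into one data frame
big_df <- do.call(rbind, df_list)
```

Using `basename(f)` before the `gsub` matters once `full.names = TRUE` is set, because digits elsewhere in the folder path would otherwise end up in the year.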

Edit: upon rereading your question, I'm not sure that what I told you to do is what you want. If you just want to load all the data frames into memory and give each one a name so that you can access it, without combining them into a single data frame, I'd probably do something like this:

df_list = list()
for(file in files){
  temp_df = read.csv(paste0(path, '/', file), header = TRUE, stringsAsFactors = FALSE)
  # Use the year from the file name as the list element's name
  year = gsub('[^0-9]', '', file)
  df_list[[year]] = temp_df
}

Each data frame can then be accessed by year: for example, df_list[['2000']] is the data frame for the year 2000.
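A named list like this can also be built in one step, and turned into separate variables afterwards if that is really what you need. The following is a sketch under the same assumptions as the question (comma-separated files with a header, names containing the year). The temporary folder and sample files are hypothetical stand-ins so the code runs on its own, and the `allData_` prefix is just taken from the question.

```r
# Hypothetical stand-in for the real data folder: two tiny sample files
path <- file.path(tempdir(), "stock_demo_named")
dir.create(path, showWarnings = FALSE)
for (yr in c(2000, 2001)) {
  write.csv(data.frame(scrip = c("A", "B"), high = c(10, 20), low = c(5, 15)),
            file.path(path, paste0("scrip_high_low_year_", yr, ".txt")),
            row.names = FALSE)
}

files <- list.files(path, pattern = "\\.txt$", full.names = TRUE)
years <- gsub("[^0-9]", "", basename(files))

# One data frame per file, named by year
df_list <- setNames(
  lapply(files, read.csv, header = TRUE, stringsAsFactors = FALSE),
  years
)
names(df_list)           # the years found in the file names
head(df_list[["2000"]])  # the data frame for the year 2000

# Optional: if separate variables are really wanted, copy the list
# elements into the global environment as allData_2000, allData_2001, ...
list2env(setNames(df_list, paste0("allData_", years)), envir = globalenv())
```

Keeping the list is usually the more flexible choice, since you can loop over it with `lapply` instead of referring to each year's variable by name.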

goodtimeslim