I have several files in a folder. They all have the same layout, and I have extracted the information I want from them.

So now, for each file, I want to write a .csv file named after the original input file, with "_output" appended.

However, I don't want to repeat this process manually for each file; I want to loop over them. I looked for help online and found lots of great tips, including many on this site.

Here's what I tried:

#Set directory
dir = setwd("D:/FRhData/elb") #set directory
filelist = list.files(dir) #save file names into filelist

myfile = matrix() 
#Read files into R
 for ( i in 1:length(filelist)){
  myfile[i] = readLines(filelist[i]) 

         *code with all calculations*

   write.csv(x = finalDF, file = paste (filename[i] ,"_output. csv")
 }    

Unfortunately, it didn't work out. Here's the error message I get:

Error in as.character(x) : cannot coerce type 'closure' to vector of type 'character'

In addition: Warning message: In myfile[i] <- readLines(filelist[i]) : number of items to replace is not a multiple of replacement length

And 'report2016-03.txt' is the name of the first file the code should be executed on.

Does anyone know what I should do to correct this mistake - or any other possible mistakes you can foresee?

Thanks a lot.

======================================================================

Here are some of the resources I used:

https://www.r-bloggers.com/looping-through-files/

How to iterate over file names in a R script?

Looping through files in R

Loop in R loading files

How to loop through a folder of CSV files in R

Jxson99
  • what is the purpose of the ";" separator for the file names? – cumin Oct 09 '17 at 02:53
  • @cumin, thanks for pointing it out. That's a typo. It's not part of the code. – Jxson99 Oct 09 '17 at 11:25
  • Is there a particular reason why you use `readLines` instead of `read.table` or `read.csv` or (personal recommendation) `data.table::fread`? – statespace Oct 09 '17 at 11:30
  • Hey, @A.Val.. The document the code reads is a .txt report, so there's text and numbers in it, and that's why I went with readLines. – Jxson99 Oct 09 '17 at 11:36
  • Does it mean that data is not structured? I.e. it can not be read in a table format? `*.txt` extension in itself doesn't say anything. `*.csv` is also a text file and from data perspective they tend to be exactly the same thing. – statespace Oct 09 '17 at 11:42
  • @A.Val., that'd be correct: the data is not structured in a table. However, I have already collected the information I need from the report. The problem is that I have to change the name of the file every time. I tried to loop over the names but it didn't work out. – Jxson99 Oct 09 '17 at 11:50
  • You are appending the data to a matrix, which requires consistent data (this is why I assumed table format). Try using `list()` instead. I don't see issues with the file names (although you could cut off '.txt' from the original name before pasting it to "_output.csv"). Also use `sep = ""` or `paste0` instead. – statespace Oct 09 '17 at 12:02
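The naming advice in the last comment can be sketched as follows. This is only an illustration: the second file name is made up, and `outnames` and `contents` are hypothetical variable names, not part of the original code.

```r
# Input names: the first comes from the question, the second is a made-up example
filelist <- c("report2016-03.txt", "report2016-04.txt")

# paste() puts a space between its arguments by default
with_space <- paste(filelist[1], "_output.csv")

# paste0() concatenates with no separator
no_space <- paste0(filelist[1], "_output.csv")

# Drop the ".txt" extension before appending the suffix
outnames <- paste0(tools::file_path_sans_ext(filelist), "_output.csv")
print(outnames)

# A list can hold one character vector per file, whatever its length,
# which a matrix cell cannot
contents <- vector("list", length(filelist))
```

With `paste()` the result contains a stray space before "_output.csv", which is why `paste0()` or `sep = ""` is the safer choice here.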

1 Answer

This worked for me. I used a vector instead of a matrix, took out the readLines() call and used paste0 since there was no separator.

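dir <- "C:/R_projects"       # folder containing the input files
setwd(dir)                   # setwd() returns the previous directory, so keep the path separately
filelist <- list.files(dir)  # save file names into filelist

myfile <- vector("character", length(filelist))
finalDF <- data.frame(a = 3, b = 2)  # placeholder for the real results
# Write one output file per input file
for (i in seq_along(filelist)) {
  myfile[i] <- filelist[i]
  write.csv(x = finalDF, file = paste0(myfile[i], "_output.csv"))
}
list.files(dir)  # check that the output files were created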
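One detail worth checking when adapting this: `setwd()` returns the previous working directory (invisibly), not the one it switches to, so the folder path is best kept in its own variable. A minimal sketch, using `tempdir()` as a stand-in for the real data folder (an assumption made only so the example runs anywhere):

```r
old <- getwd()
target <- tempdir()         # stand-in for the real data folder (assumption)

returned <- setwd(target)   # switches directory and returns the one we just left
print(identical(returned, old))

setwd(old)                  # restore the original working directory

# Keeping the path in its own variable avoids listing the wrong folder
files <- list.files(target)
```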
cumin
  • Hey, @cumin, thanks for your input. But how is the file supposed to be read into R? By doing `myfile[i] = filelist[i]`, `myfile` will simply store a string of characters that is the name of the file. – Jxson99 Oct 09 '17 at 14:34
  • Sorry, I thought the problem was writing a file with a name based on an existing file name. Yes, I see you still need to read the files and create the DFs to write to the file with the transformed name. Can you read any files from disk? – cumin Oct 09 '17 at 16:03
  • What if you used read.csv instead of readLines? I added that to my loop `a = read.csv(filelist[i])` and it read and printed all the files I just made for the previous solution (they had different names but had the same DF). Instead of printing the data in the file, you could process it in some other way before writing the newly processed data to disk. – cumin Oct 09 '17 at 16:15
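Putting the thread together, a read-process-write loop might look like the sketch below. It is not the asker's actual code: the demo folder, the sample file contents, and the `n_lines` summary are all invented so the example is self-contained, and the real calculations (not shown in the question) would replace the placeholder step.

```r
# Self-contained demo: create two small text reports in a temporary folder
indir <- file.path(tempdir(), "elb_demo")   # hypothetical folder name
dir.create(indir, showWarnings = FALSE)
writeLines(c("Report A", "value: 1"), file.path(indir, "report2016-03.txt"))
writeLines(c("Report B", "value: 2", "value: 3"), file.path(indir, "report2016-04.txt"))

# Pick up only the .txt inputs, so earlier outputs are not re-read
filelist <- list.files(indir, pattern = "\\.txt$", full.names = TRUE)

for (f in filelist) {
  lines <- readLines(f)   # character vector, one element per line

  # Placeholder for the real calculations, which the question does not show
  finalDF <- data.frame(file = basename(f), n_lines = length(lines))

  # Build "<original name>_output.csv" without the ".txt" extension
  outfile <- file.path(indir,
                       paste0(tools::file_path_sans_ext(basename(f)), "_output.csv"))
  write.csv(finalDF, file = outfile, row.names = FALSE)
}

print(list.files(indir, pattern = "_output\\.csv$"))
```

Reading each file into a local variable inside the loop sidesteps the "number of items to replace" warning, because nothing is squeezed into a single matrix cell. If the contents are needed after the loop, a list indexed by file name would serve the same purpose.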