
I am looping over many .csv files in the working directory with fread, but when I write each one out as a .parquet file, the output is overwritten on every iteration.

I have tried several approaches. Basically, I am trying to use the loop variable i so that each output file keeps the name of its .csv file, as shown below, instead of overwriting the previous one.

rm(list = ls())
gc()

# Set up environment #
require("data.table")
require("arrow")

# Set directory to data, define files #
setwd("E:/TransferComplete/07/")

files <- list.files(pattern = "csv")

for (i in files){
  setwd("E:/TransferComplete/07/")
  loopStart <- Sys.time()
  
  bb <- fread(i,header = TRUE,sep = ",", data.table = FALSE, stringsAsFactors = FALSE,
                select = c("x","y","z"))
  gc()
  
  
  write_parquet(bb,
  'E:/P/i.parquet')
  
  
  loopEnd <- Sys.time()
  loopTime <- round(as.numeric(loopEnd) - as.numeric(loopStart), 0)
}
Phil Dukhov
JBS

2 Answers


You were very close. Inside the quotes, i is just a literal character rather than the loop variable, so every iteration writes to the same file, i.parquet, and overwrites it. Build the path with paste0() so that the value of i is inserted into the file name:

write_parquet(bb,paste0('E:/P/',i,'.parquet'))
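Note that i still carries its .csv extension, so the line above produces names such as name.csv.parquet. If that is not wanted, here is a minimal sketch of one way to drop the extension using base R's tools::file_path_sans_ext() and file.path(). The helper name parquet_path and the example file name are hypothetical, not part of the question:

```r
# Hypothetical helper: build the output path for one input CSV name.
parquet_path <- function(csv_name, out_dir = "E:/P") {
  # Drop any directory part and the ".csv" extension, keeping the base name
  stem <- tools::file_path_sans_ext(basename(csv_name))
  # Join the output directory and the new file name with "/"
  file.path(out_dir, paste0(stem, ".parquet"))
}

parquet_path("sales_2020.csv")
# [1] "E:/P/sales_2020.parquet"
```

Inside the loop this would be used as write_parquet(bb, parquet_path(i)).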
neuron

Replace this

write_parquet(bb,
  'E:/P/i.parquet')

with this

write_parquet(bb,paste0('E:/P/',i,'.parquet'))
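For context, the whole loop with that replacement applied might look like the sketch below. It assumes the data.table and arrow packages are installed; the wrapper name convert_csvs and its arguments are hypothetical. It also anchors the file pattern to names ending in .csv and uses full paths instead of setwd(), both of which are optional changes to the question's code:

```r
library(data.table)  # fread()
library(arrow)       # write_parquet()

# Hypothetical wrapper around the loop from the question.
convert_csvs <- function(in_dir, out_dir, cols = c("x", "y", "z")) {
  # "\\.csv$" matches only names ending in ".csv";
  # full.names = TRUE returns full paths, so no setwd() is needed
  files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
  for (f in files) {
    # Read only the requested columns as a plain data.frame
    bb <- fread(f, header = TRUE, sep = ",", data.table = FALSE, select = cols)
    # One output file per input file, named after the input
    stem <- tools::file_path_sans_ext(basename(f))
    write_parquet(bb, file.path(out_dir, paste0(stem, ".parquet")))
  }
  invisible(files)
}

# Example call with the directories from the question:
# convert_csvs("E:/TransferComplete/07", "E:/P")
```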
francisco corvalan