
I'm getting this error when trying to import CSVs using this code:

some.df = csv_to_disk.frame(list.files("some/path"))

Error in split_every_nlines(name_in = normalizePath(file, mustWork = TRUE), : Expecting a single string value: [type=character; extent=3].

As a temporary workaround, I used a for loop that converted each file separately and then rbind-ed the resulting disk frames together.

I took the code from the disk.frame "Ingesting Data" documentation.

Cauder

1 Answer


This seems to be an error triggered by the bigreadr package. I wonder if you have a way to reproduce the chunks.

Or maybe try a different chunk reader,

csv_to_disk.frame(..., chunk_reader = "data.table")

Also, if all else fails (CSV reading is hard), reading the files in a loop and then appending them would work as well.
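A minimal sketch of that loop approach, in case it helps. It assumes that `csv_to_disk.frame()` accepts a single path and that `rbindlist.disk.frame()` can combine the resulting list. The demo directory and file names are hypothetical and only there to make the example self-contained.

```r
library(disk.frame)
setup_disk.frame()

# Hypothetical demo data: three small CSVs in a temporary directory.
demo_dir <- file.path(tempdir(), "loop_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  data.table::fwrite(
    data.frame(id = i, value = 1:2),
    file.path(demo_dir, sprintf("part%s.csv", i))
  )
}

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Convert one file per call, so the reader only ever receives a single path.
pieces <- lapply(files, function(f) csv_to_disk.frame(f))

# Combine the per-file disk.frames into one disk.frame.
some.df <- rbindlist.disk.frame(pieces)
nrow(some.df)
```

This is slower than a single vectorised call, but it also shows which file (if any) fails to convert.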

Perhaps you need to restrict the listing to CSV files and return full paths? For example:

list.files("some/path", pattern = "\\.csv$", full.names = TRUE)
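To illustrate why those two arguments matter, here is a small base R sketch. The directory and file names are made up for the demonstration. By default `list.files()` returns bare file names of every type, while an anchored regular expression plus `full.names = TRUE` returns only CSV paths that resolve from any working directory.

```r
# Create a throwaway directory containing two CSVs and one non-CSV file.
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 4:6), file.path(demo_dir, "b.csv"), row.names = FALSE)
writeLines("not a csv", file.path(demo_dir, "notes.txt"))

# Default call: bare file names, every file type included.
bare_names <- list.files(demo_dir)

# Anchored regular expression plus full.names = TRUE: only CSVs, with paths
# that can be opened regardless of the current working directory.
csv_paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

print(bare_names)
print(basename(csv_paths))
print(all(file.exists(csv_paths)))
```

Note that `pattern` is a regular expression rather than a shell glob, which is why the dot is escaped and the match is anchored with `$`.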

Otherwise, it normally works:

library(disk.frame)
setup_disk.frame()

tmp = tempdir()

# Write ten copies of the flights data as CSVs (requires nycflights13)
sapply(1:10, function(x) {
  data.table::fwrite(nycflights13::flights, file.path(tmp, sprintf("tmp%s.csv", x)))
})

# list.files() takes a regular expression, not a glob
some.df = csv_to_disk.frame(list.files(tmp, pattern = "\\.csv$", full.names = TRUE))

xiaodai
  • I tried adding the filter `"*.csv"`, and it printed this message once for each file: `Stage 1 of 2: splitting the file [path]`. Then it gave me this error: `Error in split_every_nlines(name_in = normalizePath(file, mustWork = TRUE), : Expecting a single string value: [type=character; extent=25].` – Cauder Sep 20 '20 at 05:19
  • Are you able to read the files properly using data.table? It sounds like bigreadr does not like your files, perhaps due to formatting. Do you need to set the separator? The other thing you can try (which is slower) is csv_to_disk.frame(..., chunk_reader = "data.table"). The loop approach you have is fine too. – xiaodai Sep 20 '20 at 06:08