I need to read a huge 20 GB CSV file.
I don't need all the rows, so I would like to filter some of them out to speed up loading. I read that the read.csv.sql
function from the sqldf
package is ideal for this task, so I used the following code:
library(sqldf)
data <- read.csv.sql("20gbfile.csv",
                     sql = "select * from file where `var` = 'x'")
After a while it throws the following error:
Error in connection_import_file(conn@ptr, name, value, sep, eol, skip) :
RS_sqlite_getline could not realloc
In addition: Warning message:
In file.remove(dbname) :
cannot remove file 'C:\Users\cordo\AppData\Local\Temp\Rtmp4EXpEN\file3ac82415bc0', reason 'Permission denied'
I don't know what could be happening; I would appreciate any help. Needless to say, I am open to other kinds of solutions.
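For context, one alternative I have been considering (just a sketch, not yet tested on the full 20 GB file) is reading the CSV in fixed-size chunks with base R and keeping only the matching rows, so the whole file never has to sit in memory at once. The file name `20gbfile.csv`, the column name `var`, and the value `'x'` are the placeholders from my example above, and the helper assumes a simple comma-separated header with no embedded commas in quoted fields:

```r
# Chunked fallback: read the file in blocks and keep only matching rows,
# so the full file never has to fit in memory at once.
filter_csv_chunks <- function(path, col, value, chunk_rows = 100000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # Read the header line once; assumes no embedded commas in column names.
  cols <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]
  out <- list()
  repeat {
    # read.csv continues from the open connection on each call;
    # it errors with "no lines available" once the file is exhausted.
    chunk <- tryCatch(
      read.csv(con, header = FALSE, nrows = chunk_rows, col.names = cols),
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    out[[length(out) + 1]] <- chunk[chunk[[col]] == value, , drop = FALSE]
    if (nrow(chunk) < chunk_rows) break  # last (partial) chunk
  }
  do.call(rbind, out)
}

# Intended usage for my case:
# data <- filter_csv_chunks("20gbfile.csv", "var", "x")
```

Would something like this be a reasonable workaround, or is there a way to make read.csv.sql itself cope with a file this size?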