I am attempting to use readLines to import a 17.6 GB CSV file into R. I have tried several approaches discussed here, here, here, and elsewhere, and readLines seems to be the only approach that can at least get the data into R.
The problem is that I am unable to convert the output from readLines into a data frame that I can use in my analysis. The answers to a related question here are not helping me solve my problem.
Here is my sample data:
# Build the data frame once, then write that same object to disk,
# so the file and dt contain identical values
dt <- data.frame(myid = 1:10, var = runif(10))
write.csv(dt, "temp.csv")
dt
   myid       var
1     1 0.5949020
2     2 0.8515591
3     3 0.8139010
4     4 0.3804234
5     5 0.4923082
6     6 0.9933775
7     7 0.1740895
8     8 0.8342808
9     9 0.3958154
10   10 0.9690561
Creating chunks:
file_in <- file("temp.csv", "r")  # open a read connection to the file
chunk_size <- 100000 # choose the best size for you
x <- readLines(file_in, n = chunk_size)  # read up to chunk_size lines
Opening the output from readLines in R:
View(x)
x
[1] "\"\",\"myid\",\"var\""
[2] "\"1\",1,0.594902001088485"
[3] "\"2\",2,0.851559089729562"
[4] "\"3\",3,0.81390100880526"
[5] "\"4\",4,0.380423351423815"
[6] "\"5\",5,0.492308202432469"
[7] "\"6\",6,0.993377464590594"
[8] "\"7\",7,0.174089450156316"
[9] "\"8\",8,0.834280799608678"
[10] "\"9\",9,0.395815373631194"
[11] "\"10\",10,0.969056134112179"
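For reference, here is a minimal sketch of the kind of conversion I am after, written against the sample file only. It assumes the character vector from readLines can be passed to read.csv through its text argument, with the header line saved from the first read and re-attached to every chunk. The helper name chunk_to_df and the tiny chunk size are hypothetical and chosen purely for illustration; I have not verified that this scales to the full file.

```r
# Recreate the small sample file so the sketch is self-contained.
set.seed(1)
dt <- data.frame(myid = 1:10, var = runif(10))
write.csv(dt, "temp.csv")

# Hypothetical helper: parse a chunk of raw CSV lines into a data frame.
# `header` is the header line saved from the first read of the file.
chunk_to_df <- function(lines, header) {
  read.csv(text = c(header, lines), header = TRUE, row.names = 1)
}

file_in <- file("temp.csv", "r")
header <- readLines(file_in, n = 1)  # read the header line once
chunk_size <- 4                      # tiny value, for illustration only
chunks <- list()
repeat {
  x <- readLines(file_in, n = chunk_size)
  if (length(x) == 0) break          # no lines left: end of file
  chunks[[length(chunks) + 1]] <- chunk_to_df(x, header)
}
close(file_in)

# Combine the per-chunk data frames into one.
result <- do.call(rbind, chunks)
```

Binding every chunk back together still needs enough memory for the whole table, so for the real file each chunk would probably have to be filtered or summarised inside the loop instead of being kept.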
Thanks in advance for any help.