I am trying to read a file representing a numeric matrix with 4.5e5 rows and 2e3 columns. First line is the header with ncol+1 words, while each row begins with a row name. In txt format it is around 17G in size. I tried using:
read.table(fname, header=TRUE)
but the operation ate all 64G of RAM available. I assume it loaded it in a wrong structure.
Usually people discuss speed, is there a way to import it so it fits properly? Performance is not a primary issue.
EDIT: I managed to read it with read.table:
colclasses = c("character",rep("numeric",2000))
betas = read.table(beta_fname, header=TRUE, colClasses=colclasses, row.names=1)
But documentation still recommends "scan" for memory usage. What would be the "scan" alternative?