I saved a disk frame to its output directory and then restarted my R session.

I'd like to read the existing disk frame instead of recreating it elsewhere.

How can I do this? The folder is called outdir.df.

This is how I saved the disk frame:

  mydf <- csv_to_disk.frame("myfile.csv",
                    in_chunk_size = 1e8,
                    shardby = "col",
                    outdir = "diskframe/outdir.df"
  )

Cauder
  • How did you save this disk frame? With `save.image`? `saveRDS`? Something else? Where on disk was your disk frame data stored? If it's in an R temporary directory, then it's gone once you quit R. – Spacedman Sep 11 '20 at 15:50
  • I added details. disk.frame stores the chunks locally in a folder on my machine. In this case, the output directory is called outdir.df – Cauder Sep 11 '20 at 15:54
  • What about `mydf <- disk.frame("diskframe/outdir.df")`? (Or `"thisisadiskframe.df"` or whatever your path is, you mention two here.) – r2evans Sep 11 '20 at 16:18
  • It should be located in mydf? – Oliver Sep 11 '20 at 17:21
  • @r2evans, that worked perfectly. It looks like the variable isn't actually loading any data; it's just a reference to the location of the disk frame chunks. No load time. – Cauder Sep 11 '20 at 17:32
  • Can you add that as an answer so I can mark it correct? – Cauder Sep 11 '20 at 17:32

1 Answer

I think disk.frame's preferred method is to open a reference to the disk location, using

library(disk.frame)
mydf <- disk.frame("diskframe/outdir.df")

Because this only creates a reference and does not load the data (disk.frame's stated intention is to avoid loading all data into memory), it should be nearly instantaneous.
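
To see the reference behaviour on a small scale, here is a minimal sketch. It assumes the disk.frame package is installed; the `demo_path` folder and the use of the built-in `mtcars` data are stand-ins for illustration, not part of the original question.

```r
library(disk.frame)

# Hypothetical demo location; the question itself uses "diskframe/outdir.df".
demo_path <- file.path(tempdir(), "demo.df")

# Write a small built-in data set to disk as a disk.frame
# (a stand-in for the csv_to_disk.frame() call in the question).
as.disk.frame(mtcars, outdir = demo_path, overwrite = TRUE)

# Later, for example after restarting R, reattach to the same folder.
# This only records the path; no chunk is read yet.
reopened <- disk.frame(demo_path)

nchunks(reopened)  # number of chunk files found in the folder
nrow(reopened)     # total row count across the chunks
head(reopened)     # peek at the first rows

# Only collect() pulls the whole data set into memory.
in_memory <- collect(reopened)
```

Note that `tempdir()` is used here only to keep the demo self-contained. As pointed out in the comments, anything under R's temporary directory is gone once the session ends, so a real disk frame should live in a persistent folder.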

r2evans