3

I'm building a bulk-file-uploader. Multiple files are uploaded in individual requests, and my UI provides progress and success/fail. Then, once all files are complete, a final request processes/finalizes them. For this to work, I need to create many temporary files that live longer than a single request. Of course I also need to guarantee filenames are unique across app instances.

Normally I would use Tempfile for easy unique filenames, but in this case it won't work because the files need to stick around until another request comes in to further process them. Tempfile auto-unlinks files when they're closed and garbage collected.
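To make the lifetime issue concrete, here is a minimal sketch, assuming a stock Ruby with the tempfile standard library. The basename "upload" is only an illustration. Closing the Tempfile by itself leaves the file on disk; it disappears on an explicit unlink, or when the object's finalizer runs at garbage collection, which is not something a later request can rely on.

```ruby
require 'tempfile'

tmp = Tempfile.new('upload')   # 'upload' is an illustrative basename
path = tmp.path                # remember the path; tmp.path is nil after unlink
tmp.write('chunk')
tmp.close                      # close alone does not remove the file

exists_after_close = File.exist?(path)

tmp.unlink                     # explicit removal; the finalizer would do the same at GC
exists_after_unlink = File.exist?(path)
```

So the file only survives as long as something in the process keeps the Tempfile object alive, which is exactly what cannot be arranged across separate requests.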

An earlier question here suggests using Dir::Tmpname.make_tmpname, but this seems to be undocumented and I don't see how it is thread- or multiprocess-safe. Is it guaranteed to be so?

In C I would open the file with O_EXCL, which fails if the file already exists. I could then keep trying until I successfully get a handle on a file with a truly unique name. But Ruby's File.open doesn't seem to have an "exclusive" option of any kind. If the file I'm opening already exists, I have to either append to it, open for writing at the end, or empty it.

Is there a "right" way to do this in Ruby?

I have worked out a method that I think is safe, but it seems overly complex:

require 'tempfile'

# make a unique filename
time = Time.now
filename = "#{time.to_i}-#{sprintf('%06d', time.usec)}"

# make a tempfile (this is guaranteed to find a unique creatable name)
data_file = Tempfile.new(["upload", ".data"], UPLOAD_BASE)

# but the file will be deleted automatically, which we don't want, so now link it in a stable location
count = 1
loop do
   begin
      # File.link will raise an exception if the destination path exists
      File.link(data_file.path, File.join(UPLOAD_BASE, "#{filename}-#{count}.data"))
      # so here we know we created a file successfully and nobody else will take it
      break
   rescue Errno::EEXIST
      count += 1
   end
end

# now unlink the original tempfile (it's still writable until it's closed)
data_file.unlink

# ... write to data_file and close it ...

NOTE: This won't work on Windows. Not a problem for me, but reader beware.

In my testing this works reliably. But again, is there a more straightforward way?

gwcoffey
  • possible duplicate of [How do open a file for writing only if it doesn't already exist in ruby](http://stackoverflow.com/questions/5226166/how-do-open-a-file-for-writing-only-if-it-doesnt-already-exist-in-ruby) – Brad Werth Aug 17 '14 at 02:59

2 Answers

3

I would use SecureRandom.

Maybe something like:

p SecureRandom.uuid #=> "2d931510-d99f-494a-8c67-87feb05e1594"

or

p SecureRandom.hex #=> "eb693ec8252cd630102fd0d0fb7c3485"

You can specify the length, and count on an almost impossibly small chance of collision.
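As a rough sketch of how that could look for upload files: `random_upload_path` is a hypothetical helper, and the "upload-" prefix and the temporary directory are placeholders, not anything from the question. `SecureRandom.hex(n)` returns 2n hex characters. Opening with `File::CREAT | File::EXCL` is an extra precaution I'm adding here, so that a collision, however unlikely, raises `Errno::EEXIST` instead of silently overwriting.

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical helper: builds a path with a random hex name inside dir.
# bytes = 16 yields 32 hex characters.
def random_upload_path(dir, bytes = 16)
  File.join(dir, "upload-#{SecureRandom.hex(bytes)}.data")
end

dir = Dir.mktmpdir               # stand-in for the real upload directory
path = random_upload_path(dir)

# EXCL makes the open fail with Errno::EEXIST if the name is already taken.
File.open(path, File::WRONLY | File::CREAT | File::EXCL) do |f|
  f.write('data')
end
```

`SecureRandom.hex` should also be usable where `SecureRandom.uuid` is not available.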

Brad Werth
  • I thought about using `uuid` but it does not exist in 1.8.7 which, alas, is what this site is on right now. I guess I could use hex but for probably pedantic reasons I prefer a guaranteed safe approach. I freely accept that my disk is infinitely more likely to fail than is SecureRandom likely to duplicate across a few thousand filenames. So up vote for you :) – gwcoffey Aug 17 '14 at 02:44
  • 1
    Right on. Maybe you could tack it on to your existing method, for some cheap insurance. Also, depending on filesystem constraints, you may be able to make it hundreds of characters long. Buy a lottery ticket as you make your commit, and you should be good to go... – Brad Werth Aug 17 '14 at 02:53
1

I actually found the answer after some digging. Of course the obvious approach is to see what Tempfile itself does. I just assumed it was native code, but it is not. The source for 1.8.7 can be found here for instance.

As you can see, Tempfile uses the apparently undocumented open flag File::EXCL, the counterpart of C's O_EXCL. So my code can be simplified substantially:

# make a unique filename
time = Time.now
filename = "#{time.to_i}-#{sprintf('%06d', time.usec)}"

data_file = nil
count = 1
loop do
   begin
      data_file = File.open(File.join(UPLOAD_BASE, "#{filename}-#{count}.data"), File::RDWR|File::CREAT|File::EXCL)
      break
   rescue Errno::EEXIST
      count += 1
   end
end

# ... write to data_file and close it ...
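The same loop can be wrapped in a small method. This is only a sketch: `create_unique_file` is a hypothetical name, not a Ruby API, and the stem passed in below is a made-up timestamp. It relies on `File::CREAT | File::EXCL` raising `Errno::EEXIST` when the path is already taken, and uses `retry` to try the next counter value.

```ruby
require 'tmpdir'

# Hypothetical helper: returns an open File whose name was not in use before.
# The create-if-absent check is done by the OS via EXCL, so two processes
# cannot both succeed on the same path.
def create_unique_file(dir, stem, ext = '.data')
  count = 1
  begin
    path = File.join(dir, "#{stem}-#{count}#{ext}")
    File.open(path, File::RDWR | File::CREAT | File::EXCL)
  rescue Errno::EEXIST
    count += 1
    retry                        # re-runs the begin block with the next counter
  end
end

dir = Dir.mktmpdir               # stand-in for UPLOAD_BASE
stem = '1408243440-000123'       # illustrative timestamp stem
first = create_unique_file(dir, stem)
second = create_unique_file(dir, stem)
first.close
second.close
```

Here the second call finds the "-1" name taken and falls through to "-2", which is the behavior the loop above depends on.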

UPDATE: And now I see that this is covered in a prior thread:

How do open a file for writing only if it doesn't already exist in ruby

So maybe this whole question should be marked a duplicate.

gwcoffey
  • 1
    The only issue may be that this is slightly deterministic, so if that is an issue, you still may wish to "random it up a little". – Brad Werth Aug 17 '14 at 02:59