
Possible Duplicate:
C write in the middle of a binary file without overwriting any existing content

I am writing a program that occasionally needs to insert 1-64k of data at the beginning of a binary file. The POSIX API / Linux ABI does not have an insert(fd,buf,len) function call. What's the most efficient way to do this?

vy32

1 Answer


Your choices are:

  1. Create a new file, write the new data and copy the old data to the new file, then replace (the contents of) the old file with the new file.
  2. Read a block from the end of the file, write the block to its new position, repeatedly, working your way backwards through the file.

The advantage of (2) is that it doesn't break symlinks or multiple links to the original file. The disadvantage (as noted by Keith Thompson) is that if it is interrupted, you've lost your original file.

The disadvantage of (1) is that if you need to preserve numbers of links and work through symlinks, you have to copy the new file back over the old file, so there's more copying. The advantage is that the copying is simpler and the original file is not destroyed until the end.
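As a rough illustration of option (1), here is a minimal sketch in C. The names `prepend_via_copy` and `write_all` are hypothetical helpers, not part of any API. It assumes the simpler variant in which the new file is renamed over the old one, so it does not preserve hard links and would replace a symlink rather than its target. It also assumes the temporary file can be created in the same directory as the original.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes. Returns 0 or -1. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: prepend buf[0..len) to the file at path by building
 * a temporary file next to it and renaming that file over the original. */
int prepend_via_copy(const char *path, const void *buf, size_t len)
{
    char tmp[4096];
    char block[65536];
    struct stat st;
    ssize_t n = 0;
    int in, out;

    if (snprintf(tmp, sizeof tmp, "%s.XXXXXX", path) >= (int)sizeof tmp)
        return -1;
    if ((in = open(path, O_RDONLY)) < 0)
        return -1;
    if ((out = mkstemp(tmp)) < 0) {
        close(in);
        return -1;
    }
    /* Copy the permission bits, then write the new data followed by the old. */
    if (fstat(in, &st) < 0 || fchmod(out, st.st_mode & 07777) < 0 ||
        write_all(out, buf, len) < 0)
        goto fail;
    while ((n = read(in, block, sizeof block)) > 0)
        if (write_all(out, block, (size_t)n) < 0)
            goto fail;
    if (n < 0 || fsync(out) < 0)
        goto fail;
    close(in);
    /* The original is only replaced here, once the new file is complete. */
    if (close(out) < 0 || rename(tmp, path) < 0) {
        unlink(tmp);
        return -1;
    }
    return 0;

fail:
    close(in);
    close(out);
    unlink(tmp);
    return -1;
}
```

A call such as `prepend_via_copy("data.bin", header, header_len)` returns 0 on success and -1 on failure, in which case the original file is left untouched. To keep links intact you would instead copy the temporary file's contents back over the original, at the cost of the extra copying described above.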

There's another question with code for option (2) — Write in the middle of a binary file without overwriting any existing content. Inserting at the start of a (binary) file is just a specific (not even special) case of inserting in the middle of a file.
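For the insert-at-the-start case specifically, option (2) might look something like the sketch below. This is not the code from the linked question. `prepend_in_place` is a hypothetical name, the 64 KiB block size is an arbitrary choice, and a short read or write is simply treated as failure, which leaves the file partly shifted.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: insert buf[0..len) at the start of the file at path,
 * in place, by moving the existing bytes len positions toward the end. */
int prepend_in_place(const char *path, const void *buf, size_t len)
{
    char block[65536];
    struct stat st;
    off_t pos;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0)
        goto fail;

    /* Start with the last block and work backwards, so that no byte is
     * overwritten before it has been copied to its new position. */
    pos = st.st_size;
    while (pos > 0) {
        size_t chunk = pos < (off_t)sizeof block ? (size_t)pos : sizeof block;
        pos -= (off_t)chunk;
        if (pread(fd, block, chunk, pos) != (ssize_t)chunk ||
            pwrite(fd, block, chunk, pos + (off_t)len) != (ssize_t)chunk)
            goto fail;
    }

    /* The first len bytes are now free to receive the new data. */
    if (pwrite(fd, buf, len, 0) != (ssize_t)len)
        goto fail;
    return close(fd);

fail:
    close(fd);
    return -1;
}
```

Because the file is modified through the same inode, symlinks and hard links keep working, but an interruption partway through the loop leaves the file in a mixed state.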

Jonathan Leffler
  • Thanks. You are correct that http://stackoverflow.com/questions/10467711/c-write-in-the-middle-of-a-binary-file-without-overwriting-any-existing-content is a correct solution. I didn't know about it and didn't find it with my searches. – vy32 Oct 20 '12 at 22:16
  • @vy32: I think that solution (from my brief reading) will just copy the entire contents of the file anyway. If you're inserting into the middle, it avoids copying the contents before the insertion point. If you're inserting at the beginning, that's not an advantage. – Keith Thompson Oct 20 '12 at 22:24
  • The second solution has the drawback that it can leave the file corrupted if it's interrupted before it finishes copying. – Keith Thompson Oct 20 '12 at 22:25
  • I'm still not sure that this gives me the answer of something that's the most efficient. Perhaps it would be more efficient to map the file into memory with mmap and then do a single write to the new location? – vy32 Oct 21 '12 at 02:42
  • Have at it...see what you can create as an alternative. There may be tricks with `mmap()` that you can use (one of the issues will be how to `mmap()` more space than is currently in the file). Ultimately, that will be similar to what the code does, but it may be able to do it more efficiently, if only because of fewer system calls. If the amount to be inserted is a multiple of the page size, the system can speed things up, perhaps, but probably wouldn't be aware enough. – Jonathan Leffler Oct 21 '12 at 02:49
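
One way the `mmap()` idea from these comments could be tried is sketched below. The function name `prepend_with_mmap` is hypothetical. The sketch assumes that growing the file with `ftruncate()` before mapping it is an acceptable answer to the "more space than is currently in the file" issue, and that the whole file fits in the address space. It still moves every byte and has the same interruption risk as option (2); whether it is actually faster has not been measured here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: prepend buf[0..len) to the file at path by growing
 * the file, mapping it, and shifting the old contents with memmove(). */
int prepend_with_mmap(const char *path, const void *buf, size_t len)
{
    struct stat st;
    size_t old_size, new_size;
    char *map;
    int rc;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0)
        goto fail;
    old_size = (size_t)st.st_size;
    new_size = old_size + len;
    if (new_size == 0)          /* nothing to do; mmap() rejects length 0 */
        return close(fd);

    /* Grow the file first so that the mapping covers the final size. */
    if (ftruncate(fd, (off_t)new_size) < 0)
        goto fail;
    map = mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        ftruncate(fd, (off_t)old_size);   /* best effort: undo the growth */
        goto fail;
    }

    memmove(map + len, map, old_size);    /* regions overlap, so memmove */
    memcpy(map, buf, len);

    rc = msync(map, new_size, MS_SYNC);   /* flush the changes to the file */
    munmap(map, new_size);
    if (close(fd) < 0)
        rc = -1;
    return rc;

fail:
    close(fd);
    return -1;
}
```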