
I would like to track the changes of a file being appended by another program.

My plan is as follows. I read the file's contents once. Later, another application appends to the file, and I want to read only the appended data rather than re-reading everything from the top. To do this, I would check the file's modification time, then seek() to the file's previous size and start reading from there.
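A minimal sketch of this plan, assuming Python and a file that is only ever appended to. The function name `read_appended` is hypothetical, and the sketch compares file sizes rather than modification times, since the size is what the seek() needs anyway:

```python
import os

def read_appended(path, last_size):
    """Return (new_bytes, new_size) for data appended since last_size.

    Assumption: the other program only appends; a shrinking file is
    not handled here and is treated as "nothing new".
    """
    current_size = os.path.getsize(path)
    if current_size <= last_size:
        return b"", last_size
    with open(path, "rb") as f:
        f.seek(last_size)  # skip everything that was already read
        data = f.read(current_size - last_size)
    return data, last_size + len(data)
```

I would call it with `last_size = 0` the first time and feed the returned size back in on every later call.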

Is this a proper approach, or is there a known idiom for this?

Sean Francis N. Ballais
noname7619

1 Answer


Well, you have to make quite a few assumptions about both the program writing to the file and the file system, but in general it should work. Personally, I would rather write the current seek position (or line number, if reading simple text files) to another file and check it from there. This also lets you go back in the file if some part is rewritten and the file size stays the same (or even gets smaller).
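Something along these lines is what I have in mind, as a rough sketch in Python. The names `load_offset`, `save_offset` and `read_new_data` are made up for illustration, and resetting to the start when the file shrinks is just one possible policy. Note that this sketch does not detect a rewrite that leaves the size unchanged; for that you would have to keep more than the offset (a checksum of the already-read part, for example).

```python
import os

def load_offset(state_path):
    """Read the saved seek position; default to 0 if missing or unreadable."""
    try:
        with open(state_path, "r", encoding="ascii") as f:
            return int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        return 0

def save_offset(state_path, offset):
    """Write the offset to a temp file, then swap it into place."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w", encoding="ascii") as f:
        f.write(str(offset))
    os.replace(tmp_path, state_path)

def read_new_data(data_path, state_path):
    """Return the bytes added since the saved offset and update the offset."""
    offset = load_offset(state_path)
    if os.path.getsize(data_path) < offset:
        # Assumption: a smaller file means it was rewritten, so start over.
        offset = 0
    with open(data_path, "rb") as f:
        f.seek(offset)
        data = f.read()
        new_offset = f.tell()
    save_offset(state_path, new_offset)
    return data
```

Because the position lives in its own file, the reader can be restarted at any time and will pick up where it left off.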

If you have some very important or unique data, then besides making backups you should consider appending the new data to a new file and rejoining the files later (if needed), once your other program has checked that the data is fine. This way you could simply read each new file as a whole after a certain time. (Also remember that, in the larger picture, system time and creation/modification times are not 100% trustworthy.)
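As a rough illustration of the separate-files idea, here is a sketch in Python. It assumes the writer drops each batch into its own file in a directory, that the file names sort in write order, and that the writer has finished each file before the reader sees it. The names `merge_chunks` and `validate` are hypothetical.

```python
import os

def merge_chunks(chunk_dir, main_path, validate):
    """Append every valid chunk file to main_path, then delete the chunk.

    validate is a caller-supplied function taking the chunk's bytes and
    returning True if the data looks fine. Invalid chunks are left alone.
    Returns the names of the chunks that were merged.
    """
    merged = []
    for name in sorted(os.listdir(chunk_dir)):
        chunk_path = os.path.join(chunk_dir, name)
        if not os.path.isfile(chunk_path):
            continue
        with open(chunk_path, "rb") as f:
            data = f.read()  # each new file is read as a whole
        if not validate(data):
            continue
        with open(main_path, "ab") as out:
            out.write(data)
        os.remove(chunk_path)
        merged.append(name)
    return merged
```

The reader never has to track a position here, because every file it sees is either entirely new or already merged and removed.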

  • Thanks for your reply. I assume that the other program appends to the end of the file. What other assumptions about the program and the system do you mean? How would saving the position to an additional file help if the other program modifies or shrinks the contents rather than just appending (in my view, a position or line number would then be invalid)? – noname7619 Oct 02 '17 at 11:28
  • Well, reading and writing a few (hundred/thousand) lines on a local computer is a completely different case from a bunch of writes coming from different sources to the same file. At a larger scale there are issues of buffers, caching, file handles, etc. to think about. Saving the seek position to another file could give you the positions of the previous write(s), so you can rewind if needed. Information like "file size at start: xxx, attempting to write lines: xxx, managed to write lines: xxx" can be useful. A separate file-writer-api(tm) is usually the best way, instead of several programs writing to the same location. – Stacking For Heap Oct 02 '17 at 12:30