If you have no control over the writing process, you are bound to fail at some point.
If you do have control over the writer, a simple way to "lock" files is to create a symlink. Symlink creation is atomic: if it fails because the link already exists, a write is already in progress. If it succeeds, you have just acquired the "lock".
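A minimal sketch of that symlink lock, assuming both writer and reader agree on a lock path next to the data file. The names `symlink_lock` and `LockHeld` are made up for illustration, and the link target (the process ID) is only there to help debugging.

```python
import os
from contextlib import contextmanager


class LockHeld(Exception):
    """Raised when the lock symlink already exists."""


@contextmanager
def symlink_lock(lock_path):
    # os.symlink raises FileExistsError if lock_path already exists,
    # so creating the link acts as an atomic test-and-set.
    try:
        os.symlink(str(os.getpid()), lock_path)
    except FileExistsError:
        raise LockHeld(lock_path) from None
    try:
        yield
    finally:
        # Release the lock by removing the symlink.
        os.unlink(lock_path)
```

The writer would wrap its write in `with symlink_lock("data.bin.lock"):` and the reader would do the same around its read. Note that a process that crashes while holding the lock leaves the symlink behind, so some cleanup policy for stale locks is needed.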
But if you do not have any control over the writing and creation of the file, there will be trouble. You can try the approach outlined here: "Ensuring that my program is not doing a concurrent file write"
That approach reads the file's timestamps and "guesses" from them whether writing has completed. This is more reliable than checking the file size, as you could end up with a file over your size threshold while writing is still in progress.
In this case the problem would be the writer starting a new write before you have read the file in its entirety. Your reader would then fail when the file it was reading disappeared halfway through.
If you are on a Unix platform, have no control over the writer, and absolutely need to do this, I would do something like this:
- Check if the file exists and, if it does, whether its "last written" timestamp is "old enough" for me to assume the whole file is there
- Rename the file to a different name
- Check that the renamed file still matches your criteria
- Get data from the renamed file
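The steps above can be sketched roughly as follows. This is only an illustration under assumptions: `claim_if_quiet`, its parameters, and the "criteria" used (modification time and size unchanged across the rename) are my own choices, and the rename must stay on the same filesystem.

```python
import os
import time


def claim_if_quiet(path, claimed_path, min_age_seconds):
    """Rename path to claimed_path if it looks finished.

    Returns claimed_path on success, or None if the file is missing,
    too fresh, or changed while it was being claimed.
    """
    try:
        before = os.stat(path)
    except FileNotFoundError:
        return None
    if time.time() - before.st_mtime < min_age_seconds:
        return None  # modified too recently; the writer may still be active

    # Race window: the writer can start again between the stat and the rename.
    try:
        os.rename(path, claimed_path)
    except FileNotFoundError:
        return None  # the file vanished in the meantime

    after = os.stat(claimed_path)
    if (after.st_mtime_ns, after.st_size) != (before.st_mtime_ns, before.st_size):
        # Something touched the file while we claimed it. It now sits at
        # claimed_path and the caller has to decide what to do with it.
        return None
    return claimed_path
```

A reader would call `claim_if_quiet("data.bin", "data.bin.reading", 180)` and only open the returned path when the result is not `None`.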
Nevertheless, this will eventually fail and you will lose an update, as there is no way to make this atomic. Renaming removes the problem of the file being overwritten before you have read it, but if the writer decides to start writing between the first two steps (the timestamp check and the rename), you will not only receive an incomplete file but might also break the writer if it does not like the file disappearing halfway through.
I would rather try to find a way to chain the actions together: either have your writer trigger the read process, or add a locking mechanism. Writing 1.5 GB of data is not instantaneous, and eventually the unexpected will happen.
Or, if you definitely cannot do anything like that, could you ensure, for example, that your writer writes at most once every N minutes or so? If you could be sure it never writes twice within a 5 minute window, your reader could wait until the file is 3 minutes old, then rename it and read the renamed file. You could also check whether you can prevent the writer from overwriting an existing file. If you can, then you can safely process the file in your reader once it is "old enough" and has not changed during whatever grace period you decide to give it. When you have read it, you delete the file, allowing the next update to appear.
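As a rough sketch of that last variant, assuming the writer never overwrites an existing file: poll until the file is old enough, process it, then delete it so the next update can appear. The function name, the `handle` callback, and the default values (3 minutes grace, 5 second polling) are placeholders, not recommendations.

```python
import os
import time


def process_when_settled(path, handle, min_age_seconds=180,
                         poll_seconds=5, timeout_seconds=None):
    """Wait until path is at least min_age_seconds old, then handle and delete it.

    handle is called with the open binary file object and its return value
    is passed back. Raises TimeoutError if timeout_seconds elapses first.
    """
    deadline = None
    if timeout_seconds is not None:
        deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            age = time.time() - os.stat(path).st_mtime
        except FileNotFoundError:
            age = None  # no update has arrived yet
        if age is not None and age >= min_age_seconds:
            with open(path, "rb") as f:
                result = handle(f)
            # Deleting the file signals that the next update may be written.
            os.remove(path)
            return result
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(path)
        time.sleep(poll_seconds)
```

For example, `process_when_settled("data.bin", lambda f: f.read())` would block until the file has been untouched for three minutes. This is only safe under the stated assumption about the writer; without it, the same race as before applies.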
Without knowing more about your environment and the processes involved, this is the best I can come up with. There is no universal solution to this problem; it needs a workaround tailored to your particular environment.