
I'm facing a problem using std::ifstream to read a file. I have a zip file called "nice_folder.zip" (71.6MB) and the following code that reproduces the issue:

#include <filesystem>
#include <fstream>
#include <iostream>
#include <memory>
#include <unistd.h>

int main() {

  size_t read = 0;
  size_t f_size = std::filesystem::file_size("nice_folder.zip");
  std::shared_ptr<char[]> buffer{new char[4096]};
  std::ifstream file{"nice_folder.zip"};
  while (read < f_size) {
    size_t to_read = (f_size - read) > 4096 ? 4096 : (f_size - read);
    file.read(buffer.get(), to_read);
    read += to_read; // advance the offset, otherwise the loop never terminates
    sleep(2);
    std::cout << "read: " << std::to_string(to_read) << "\n";
  }
}

The problem is the following: after some read cycles I delete the file from the folder, but the program keeps reading it anyway. How is that possible? How can I catch an error if a user deletes a file while it is being read with ifstream? I guess that ifstream loads the content of the file into memory before it starts reading, but I'm not sure.

GJCode
  • The behavior when you delete a file is OS-specific. Some OSes keep the file around after the delete while it is still open in an application. Also, C++ may have buffered at least part of the file – drescherjm Oct 30 '20 at 16:49
  • Also, it's unclear why you consider the behavior you observe a "problem". Do you want to react to the file getting deleted while you read from it? – 463035818_is_not_an_ai Oct 30 '20 at 16:52
  • I strongly recommend checking the stream state for errors after every IO transaction. Without checking, how do you know you've actually read anything? – user4581301 Oct 30 '20 at 16:52
  • @drescherjm To the best of my knowledge, when you open a file on Linux you get a file descriptor, which is inserted into the per-process open file table. I don't see why the OS would read the file's content at that point, or buffer such a large file into memory. – GJCode Oct 30 '20 at 16:58
  • The answer about linux is correct. – drescherjm Oct 30 '20 at 16:59
  • This doesn't address the question, but you don't need `std::to_string(to_read)` in the output statement. `to_read` alone is sufficient. Also, instead of repeating 4096, create a constant with that value and use the constant for the size of the buffer and the sizes in calculating `to_read`. That way, there's only one place you need to change if you want to change the buffer size. And, finally, for this particular example, `std::unique_ptr` seems more appropriate than `std::shared_ptr` to hold the buffer pointer. Or maybe just `std::array`. – Pete Becker Oct 30 '20 at 18:39

1 Answer


If you're doing this on e.g. Linux, the OS will not actually delete the file until you've closed all file handles to it. So it might seem like the file is deleted, but its data is still stored on disk. See e.g. What happens internally when deleting an opened file in Linux.

So if you're trying to detect this file deletion to prevent wrong reads, then don't worry, the file won't actually be deleted.

If you close the stream and then try to open the file again, you'll get an error. But that also means you won't be able to read from it...

AVH