Why is checking and then immediately opening a file risky?

Question

Reading the answer in this question: How do I check whether a file exists using Python?, where the answer states this:

If the reason you're checking is so you can do something like if file_exists: open_it(), it's safer to use a try around the attempt to open it. Checking and then opening risks the file being deleted or moved or something between when you check and when you try to open it.

If you're not planning to open the file immediately, you can use os.path.isfile

I couldn't understand why checking (through os.path) and then opening risks the file being moved or deleted.

What exactly does this mean?

The main principle here is the atomicity of the operations. Attempting to open is a single operation. Checking whether the file exists, and then opening it is 2 _separate_ operations, and no guarantees are made as to nothing happening between their execution. — cs95, Jan 03 '18 at 11:47
(Thinking) Can a file exist and yet not be opened? [The most usual reason a file "can't be opened" is because it does not exist. But still. This question is rather impossible to Google.] — Jongware, Jan 03 '18 at 11:54
Ah *of course* (after thinking a bit). Actually, *countless* situations: no read privileges; no write privileges (when opened for writing); no memory to allocate a file descriptor; no more room for a new file (when creating); specified path is actually a directory. See [`man open`](https://www.freebsd.org/cgi/man.cgi?query=open&sektion=2&apropos=0&manpath=FreeBSD+11.1-RELEASE+and+Ports#end) for many more ... — Jongware, Jan 03 '18 at 13:22

score 5 · Accepted Answer · answered Jan 03 '18 at 11:54

You misunderstood what the answer said. It does not say that checking with os.path and then opening risks deletion.

What it actually says is that checking whether the file exists and then opening it are two separate steps, so nothing guarantees that the file will still exist at the time of opening.

The principle here is the atomicity of operations. This is a single operation -

try:
    with open(filename) as f:
        ... # do something     
except OSError:
    ... # do something else

An attempt is made to open the file in one atomic operation. If the file is not found, the error will be caught and handled.

However! This code -

import os

if os.path.isfile(filename):
    with open(filename) as f:
        ... # do something

Does two things:

1. Checks whether the file exists
2. Opens the file

These are two separate operations that are not atomic when taken together, and nothing guarantees that the file has not been removed or renamed between the first step and the second. That makes this the less safe option.
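To make the gap concrete, here is a minimal sketch that forces the race to happen deterministically. The `interfere` parameter is a hypothetical hook, not part of any library; it stands in for another process deleting the file between the check and the open. In real code that window is tiny and the failure is intermittent, which is what makes it hard to spot.

```python
import os
import tempfile


def check_then_open(path, interfere=None):
    """Check-then-open. `interfere` is a hypothetical hook that simulates
    another process acting in the gap between the check and the open."""
    if os.path.isfile(path):
        if interfere is not None:
            interfere(path)  # e.g. the file is deleted right here
        with open(path) as f:
            return f.read()
    return None


def try_open(path):
    """Single-step version: just attempt the open and handle failure."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None


# Create a real file, then close the descriptor so it can be removed.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    check_then_open(path, interfere=os.remove)
    outcome = "opened"
except FileNotFoundError:
    # The check passed, yet the open still failed.
    outcome = "FileNotFoundError"

print(outcome)         # FileNotFoundError
print(try_open(path))  # None: the missing file is handled, not raised
```

The check-then-open version only stays safe if it is wrapped in a try block as well, at which point the check adds nothing.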

What you should use depends on your use case. Like the answer says, if you don't plan on opening the file immediately, just using os.path.isfile is good enough.

score 2 · Answer 2 · answered Jan 03 '18 at 11:50

No one can guarantee what happens to the file after you have checked its existence and before you actually open it. It can be deleted, renamed, or moved, for example.

Hence the check does not give you any additional useful information. You know the file existed at some point in time, but you would still need to wrap your file open statement in try/except to guard against this possibility.

And when you do that, you no longer need the existence check, as your exception handling covers that case anyway.
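As a rough sketch of how the exception handling can replace the check, the open can be attempted directly and the failure reason read from the exception type. The helper name `read_or_reason` and its return shape are made up for illustration; the exception classes are the standard `OSError` subclasses. Which subclass is raised for a given problem can vary by platform, so the final `OSError` branch is a catch-all.

```python
import os
import tempfile


def read_or_reason(path):
    """Hypothetical helper: return (text, None) on success,
    or (None, reason) when the file cannot be opened."""
    try:
        with open(path) as f:
            return f.read(), None
    except FileNotFoundError:
        return None, "missing"
    except IsADirectoryError:
        return None, "is a directory"
    except PermissionError:
        return None, "permission denied"
    except OSError as exc:
        # Anything else the OS reports (too many open files, etc.)
        return None, f"other OS error: {exc}"


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.txt")
    with open(present, "w") as f:
        f.write("hello")

    print(read_or_reason(present))                         # ('hello', None)
    print(read_or_reason(os.path.join(tmp, "absent.txt"))) # (None, 'missing')
```

Besides closing the race, this also covers cases an existence check cannot answer at all, such as a file that exists but is not readable.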

If you are in full control of the environment and you can absolutely guarantee that no other process or user tampers with the file, then you are safe with your check and open. In my opinion it would still be bad programming, as you impose a constraint on the system ("no one else may touch this file") without actually enforcing it in any way, and you might run into problems later.
