101

I want to write to a file only if it doesn't already exist (in practice, I want to keep trying file names until I find one that doesn't exist).

The following code shows how a potential attacker could insert a symlink (as suggested in this post) between the test for the file and the write to it. If the code runs with high enough permissions, this could overwrite an arbitrary file.

Is there a way to solve this problem?

import os
import errno

file_to_be_attacked = 'important_file'

with open(file_to_be_attacked, 'w') as f:
    f.write('Some important content!\n')

test_file = 'testfile'

try:
    with open(test_file) as f: pass
except IOError as e:

    # Symlink created here
    os.symlink(file_to_be_attacked, test_file)

    if e.errno != errno.ENOENT:
        raise
    else:
        with open(test_file, 'w') as f:
            f.write('Hello, kthxbye!\n')
Peter Mortensen
Henry Gomersall

3 Answers

101

Edit: See also Dave Jones' answer: from Python 3.3, you can use the x flag to open() to provide this function.

Original answer below

Yes, but not using Python's standard open() call. You'll need to use os.open() instead, which allows you to specify flags to the underlying C code.

In particular, you want to use O_CREAT | O_EXCL. From the man page for open(2) under O_EXCL on my Unix system:

Ensure that this call creates the file: if this flag is specified in conjunction with O_CREAT, and pathname already exists, then open() will fail. The behavior of O_EXCL is undefined if O_CREAT is not specified.

When these two flags are specified, symbolic links are not followed: if pathname is a symbolic link, then open() fails regardless of where the symbolic link points to.

O_EXCL is only supported on NFS when using NFSv3 or later on kernel 2.6 or later. In environments where NFS O_EXCL support is not provided, programs that rely on it for performing locking tasks will contain a race condition.

So it's not perfect, but AFAIK it's the closest you can get to avoiding this race condition.
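To make the symlink behaviour from the man page concrete, here is a minimal sketch (Python 3, Unix-like system assumed, since it calls `os.symlink()`). The helper name `try_exclusive_create` is made up for this illustration. The file names mirror the ones in the question, and everything happens in a throwaway temporary directory. `os.O_WRONLY` is included so the descriptor would be usable for writing.

```python
import errno
import os
import tempfile


def try_exclusive_create(path):
    """Return 0 on success, or the errno the exclusive open failed with."""
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
    try:
        fd = os.open(path, flags)
    except OSError as e:
        return e.errno
    os.close(fd)
    return 0


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'important_file')
    link = os.path.join(tmp, 'testfile')

    with open(target, 'w') as f:
        f.write('Some important content!\n')

    # The attacker's symlink is already in place when we try to create.
    os.symlink(target, link)

    # The exclusive create refuses to follow the symlink.
    result = try_exclusive_create(link)

    with open(target) as f:
        target_content = f.read()

print(result == errno.EEXIST)
print(target_content == 'Some important content!\n')
```

On a system that implements `O_EXCL` as described above, the open should fail with `EEXIST` and the symlink's target should be left untouched.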

Edit: the other rules of using os.open() instead of open() still apply. In particular, if you want to use the returned file descriptor for reading or writing, you'll need one of the O_RDONLY, O_WRONLY or O_RDWR flags as well.

All the O_* flags are in Python's os module, so you'll need to import os and use os.O_CREAT etc.

Example:

import os
import errno

flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY

try:
    file_handle = os.open('filename', flags)
except OSError as e:
    if e.errno == errno.EEXIST:  # Failed as the file already exists.
        pass
    else:  # Something unexpected went wrong so reraise the exception.
        raise
else:  # No exception, so the file must have been created successfully.
    with os.fdopen(file_handle, 'w') as file_obj:
        # Using `os.fdopen` converts the handle to an object that acts like a
        # regular Python file object, and the `with` context manager means the
        # file will be automatically closed when we're done with it.
        file_obj.write("Look, ma, I'm writing to a new file!")
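For the "keep trying files until I find one that doesn't exist" part of the question, the same flags can be used in a loop. The following is only a sketch under assumptions: the function name `create_first_free`, the `output_N.txt` naming scheme, and the `max_tries` limit are all invented for illustration, and the demo part needs Python 3 for `tempfile.TemporaryDirectory`.

```python
import errno
import os
import tempfile


def create_first_free(directory, stem='output', suffix='.txt', max_tries=1000):
    """Create the first unused candidate name; return (path, file object)."""
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
    for n in range(max_tries):
        path = os.path.join(directory, '%s_%d%s' % (stem, n, suffix))
        try:
            fd = os.open(path, flags)
        except OSError as e:
            if e.errno == errno.EEXIST:
                continue  # Name already taken, so try the next candidate.
            raise  # Anything else is unexpected, so reraise it.
        return path, os.fdopen(fd, 'w')
    raise RuntimeError('no free file name found after %d tries' % max_tries)


with tempfile.TemporaryDirectory() as tmp:
    first_path, first_file = create_first_free(tmp)
    first_file.close()

    # The first name now exists, so the second call moves on to the next one.
    second_path, second_file = create_first_free(tmp)
    with second_file:
        second_file.write("Look, ma, I'm writing to a new file!")

    names = (os.path.basename(first_path), os.path.basename(second_path))

print(names)
```

Because the existence check and the creation are a single `os.open()` call, there is no gap between testing a name and claiming it.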
Community
me_and
  • 2
    +1 for the obviously correct answer. I'm personally curious to know how many people actually have issues with the NFS caveat—I (perhaps recklessly) dismiss it as an obsolete environment my code should never be run on. – Mattie Jun 11 '12 at 11:59
  • 3
    @zigg: NFSv3 is from 1995, so it seems fair to regard older versions as obsolete. – Fred Foo Jun 11 '12 at 12:20
  • 1
    I'd be more worried about the kernel version, personally. If you're running anything even vaguely resembling an up-to-date system, you should have no issue, but RHEL 3 (still in extended support phase) is running a 2.4 kernel, for example. Also, I've not investigated if they provide atomic writes on Windows on FAT or NTFS, which is a potentially major limitation. – me_and Jun 11 '12 at 12:38
  • (Although the OP talks about `os.symlink()`, which is Unix-only, so Windows is presumably not so much of an issue for them.) – me_and Jun 11 '12 at 12:40
  • 1
    @me_and The python page on [open flag constants](http://docs.python.org/library/os.html#open-constants) suggests that this works fine with Windows. I'll be trying it shortly! – Henry Gomersall Jun 11 '12 at 12:42
  • 1
    True, but I've not seen anywhere (including [MSDN](http://msdn.microsoft.com/en-us/library/53xa7z70(v=VS.71).aspx)) that explicitly says these flags give *atomic* file creation. Possibly I'm being overly paranoid, but I'd want to see that "atomic" keyword before trusting this for anything that's security-critical. – me_and Jun 11 '12 at 13:36
  • @me_and could you add an example please? – 030 Oct 30 '14 at 10:19
  • @me_and could you also do this: f=open('file.txt.','a') f.close() f=open('file.txt','r+') so Python creates a file only if it doesn't exist, then closes and opens the [old or brand new] file ...? – pinhead Jan 14 '16 at 22:09
  • @pinhead Welcome to Stack Overflow! You're better asking questions in new questions, because they're more visible and you're not limited by comment formatting or character limits. In any case, your code doesn't solve the problem, where we explicitly want to open the file if _and only if_ it doesn't already exist. – me_and Jan 15 '16 at 13:56
  • This solution works just for a file, i.e. it does not work if you use a full path and the parent directory does not exist. – Petr Krampl Dec 07 '17 at 14:09
  • 1
    @zigg NFS is widely used in High Performance Computing (HPC) environments, where Python is a popular choice, although I have never seen anything other than NFSv3. You should expect that every line of code you send to PyPI will be run on NFSv3 at some point. – Calimo Sep 24 '19 at 09:41
87

For reference, Python 3.3 implements a new 'x' mode in the open() function to cover this use-case (create only, fail if file exists). Note that the 'x' mode is specified on its own. Using 'wx' results in a ValueError as the 'w' is redundant (the only thing you can do if the call succeeds is write to the file anyway; it can't have existed if the call succeeds):

>>> f1 = open('new_binary_file', 'xb')
>>> f2 = open('new_text_file', 'x')

For Python 3.2 and below (including Python 2.x) please refer to the accepted answer.
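As a rough sketch of how the 'x' mode might be used in practice on Python 3.3+, the snippet below catches `FileExistsError` when the file is already there. The helper name `write_if_new` is made up for this example, and the files live in a temporary directory.

```python
import os
import tempfile


def write_if_new(path, text):
    """Write text to path only if path does not exist; return True on success."""
    try:
        with open(path, 'x') as f:
            f.write(text)
    except FileExistsError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'new_text_file')

    first = write_if_new(path, 'first\n')    # Creates the file.
    second = write_if_new(path, 'second\n')  # Refused: the file exists now.

    with open(path) as f:
        content = f.read()

    # Combining 'w' and 'x' is rejected before any file is touched.
    try:
        open(os.path.join(tmp, 'other_file'), 'wx')
        wx_rejected = False
    except ValueError:
        wx_rejected = True

print(first, second, wx_rejected)
```

The second call leaves the original content in place, which is the "create only, fail if file exists" behaviour described above.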

Dave Jones
  • 1
    Good suggestion. Unfortunately this appears to be POSIX-only (doesn't work on Windows): `Python 3.2 (r32:88445, Feb 20 2011, 21:30:00)` `[MSC v.1500 64 bit (AMD64)] on win32` `>>> open("c:/temp/foo.csv","wx")` `ValueError: invalid mode: 'wx'` – Dan Lenski Mar 05 '15 at 16:28
  • 6
    You're using Python 3.2; the 'x' mode is in 3.3 and above, but it is cross-platform. Incidentally, you only use 'x' instead of 'wx' - the write mode is redundant as the only thing you could do with the file is write to it anyway – Dave Jones Mar 08 '15 at 13:19
  • 1
    How are 'w' and 'x' duplicates? It's perfectly reasonable to open an existing file for writing (which overwrites it). – Dubslow Nov 25 '17 at 01:45
  • 5
    It is reasonable to open an existing file for writing, but the entire point of the 'x' mode is to open the file *if and only if it doesn't exist already*, failing with an error when the file does exist. This is why it is redundant with the 'w' flag; if it succeeds the file is guaranteed to be empty (and hence there's very little point reading from it :). – Dave Jones Nov 26 '17 at 16:27
-2

This code will easily create a file if one does not exist.

import os
if not os.path.exists('file'):
    open('file', 'w').close() 
Peter Mortensen
user2033758
  • 18
    Yes it will. The important point about the question was the safety aspect. The problem is that between identifying the presence of the file and using it or creating it, something might change that results in a bad outcome (as in the original question). – Henry Gomersall Mar 14 '13 at 09:49
  • 6
    That's true. It's called TOCTOU! – Rad Mar 27 '16 at 05:47
  • 1
    If another process creates and writes to the file after the `if` statement, this code will blank out the file. – Peter Wood Apr 06 '17 at 10:40