9

This is something I think must come up quite often but I haven't been able to find a good solution for it. Say I have a function which may be passed an open resource as an argument (like a file or database connection object) or needs to create one itself. If the function needs to open a file on its own, best practice is usually considered something like:

with open(myfile) as fh:
    # do stuff with open file handle...

to ensure the file is always closed when the with block is exited. However if an existing file handle is passed in the function should probably not close it itself.

Consider the following function which takes either an open file object or a string giving a path to the file as its argument. If it is passed a file path it should probably be written as above. Otherwise the with statement should be omitted. This results in duplicate code:

def foo(f):
    if isinstance(f, basestring):
        # Path to file, need to open
        with open(f) as fh:
            # do stuff with fh...
    else:
        # Assume open file
        fh = f
        # do the same stuff...

This could of course be avoided by defining a helper function and calling it in both places, but this seems inelegant. A better way I thought of was to define a context manager class that wraps an object like so:

class ContextWrapper(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped
    def __enter__(self):
        return self.wrapped
    def __exit__(self, *args):
        pass

def foo(f):
    if isinstance(f, basestring):
        cm = open(f)
    else:
        cm = ContextWrapper(f)

    with cm as fh:
        # do stuff with fh...

This works but unless there's a built-in object that does this (I don't think there is) I either have to copy-paste that object everywhere or always have to import my custom utilities module. I'm feeling like there's a simpler way to do this that I've missed.

JaredL
  • 1,372
  • 2
  • 11
  • 27
  • I can't think of a good reason to write code that would accept *either* a path *or* an open file handle. In that edge case, I'd recommend writing your own wrapper (as you've done) – Adam Smith Nov 26 '15 at 00:18
  • 4
    I'd argue that a helper function might be *more* elegant. A lot of the work you're doing in `foo` is to get the argument into the right type of object -- an open file handle. Factor out the code that does the core work into a helper that assumes an open file handle, and I think the overall result is more clear. Actually, the "helper" might be made a legitimate, public function in its own right, and "foo_from_path" simply calls "foo" after opening a file handler. – jme Nov 26 '15 at 00:18
  • 1
    @AdamSmith For what it's worth, this is a common design in, say, numpy. For example, `np.load` takes either a [string or a fileobj](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.load.html). I suppose it's just there to make some code more concise, and since numpy is often used interactively, it's not unwise. – jme Nov 26 '15 at 00:22
  • @jme thanks for the context! :) – Adam Smith Nov 26 '15 at 00:23
  • Python 3.x has a more elegant solution, see: https://stackoverflow.com/questions/41251850 – ideasman42 Oct 14 '21 at 00:36

3 Answers3

3

That "context manager wrapper" idea is the right way to go. Not only is it lighterweight written as a function:

from contextlib import contextmanager

@contextmanager
def nullcontext(obj=None):
    yield obj

A sign that it's the right choice: the purpose-built nullcontext is available in the stdlib as of 3.7 (and not at time of asking/for python-2.7), with your exact use-case in the docs:

def process_file(file_or_path):
    if isinstance(file_or_path, str):
        # If string, open file
        cm = open(file_or_path)
    else:
        # Caller is responsible for closing file
        cm = nullcontext(file_or_path)

    with cm as file:
        # Perform processing on the file
Kache
  • 15,647
  • 12
  • 51
  • 79
1

However, I prefer, I don't know how pythonic it is, but it's straightforward

def foo(f):
    if isinstance(f, basestring):
        f = open(f)
    try:
        # do the stuff
    finally:
        f.close()

the problem could be solved nicer with singledispatch from python 3.4

from functools import singledispatch

@singledispatch
def foo(fd):
    with fd as f:
        # do stuff
        print('file')

@foo.register(str)
def _(arg):
    print('string')
    f = open(arg)
    foo(f)


foo('/tmp/file1')  # at first calls registered func and then foo
foo(open('/tmp/file2', 'r'))  # calls foo
kwarunek
  • 12,141
  • 4
  • 43
  • 48
  • If an open handle is given as input and an exception is thrown within `try:`, this will close the handle. That's fine, but a little strange I think -- does the function really have the authority to do that? One way to avoid this -- and this is what numpy does -- is to keep track of whether the function "owns" the file handle via boolean `f_own`. Then, in the `finally:` block, only close if `f_own` is True. – jme Nov 26 '15 at 00:33
0

This solution avoids an explicit boolean value like f_own (that is mentioned in a comment of the answer from @kAlmAcetA), and instead just checks the identity of the input parameter f to the file handle fh. A try/finally clause is the only way to do this without creating a helper class as a context manager.

def foo(f):
    fh = open(f) if isinstance(f, basestring) else f

    try:
        # do stuff...
    finally:
        if fh is not f:
            fh.close()

If you need to do something like this in more than one function then, yes, you probably should create a utility module with a context manager class to do it, like this:

class ContextWrapper(object):
    def __init__(self, file):
        self.f = file

    def __enter__(self):
        self.fh = open(self.f) if isinstance(self.f, basestring) else self.f
        return self.fh

    def __exit__(self, *args):
        if self.fh is not self.f:
            self.fh.close()

Then you could unconditionally wrap like this:

def foo(f):
    with ContextManager(f) as fh:
        # do stuff...
eestrada
  • 1,575
  • 14
  • 24