
I'm trying to write a series of functions that write to a temporary file and then do things with the written file. I'd like to understand how the file is handled.

What I'd like to do in the abstract is:

def create_function(inputs):
    # create temp file, write some contents

def function1(file):
    # do some stuff with temp file

def function2(file):
    # do some other stuff with temp file

So that I can do something like:

my_file = create_function(my_inputs)
function1(my_file)
function2(my_file)

So here's what I've actually done:

def db_cds_to_fna(collection, open_file):
    """
    This pulls data from a mongoDB and writes it to a temporary file - just writing an arbitrary string doesn't alter my question (I'm pretty sure)
    """
    for record in db[collection].find({"type": "CDS"}):
        open_file.write(">{}|{}|{}\n{}\n".format(
            collection,
            record["_id"],
            record["annotation"],
            record["dna_seq"]
            )
        )

    return open_file.name

def check_file(open_file):
    lines = 0
    for line in open_file:
        if lines < 5:
            print line
            lines += 1
        else:
            break

With this code, if I run the following:

from tempfile import NamedTemporaryFile
tmp_file = NamedTemporaryFile()
tmp_fna =  db_cds_to_fna('test_collection', tmp_file)

check_file(tmp_file)

This code runs, but doesn't actually print anything. But the file is clearly there and written, because if I run print Popen(['head', tmp_fna], stdout=PIPE).communicate()[0], I get the expected beginning of the file. Or, if I change check_file() to accept tmp_file.name and do with open(tmp_file.name, 'r')... inside the function, it works.

So question 1 is - why can I write to the tmp_file, but can't read it from a different function without re-opening it?

Now, what I'd really like to do is have the tmp_file = NamedTemporaryFile() inside the db_cds_to_fna() function, but when I try this and run:

tmp_fna =  db_cds_to_fna('test_collection')
check_file(tmp_file)

I get an error: No such file or directory

So question 2 is: is there any way to keep the temporary file around for another function to use? I know how to just write a file to a specified path and then delete it, but I suspect there's a built in way to do this and I'd like to learn.

kevbonham

1 Answer


After you write to the file, its position is at the end of what you wrote, so reading from there returns nothing. Add a seek before you start reading, to go back to the beginning of the file:

def check_file(open_file):
    lines = 0
    open_file.seek(0)
    for line in open_file:
        if lines < 5:
            print line
            lines += 1
        else:
            break
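
To make the file-position behaviour concrete, here is a minimal sketch. It is written for Python 3 (the code above is Python 2) and opens the file in text mode with mode="w+", which is an assumption made for the example; the sample contents are made up.

```python
from tempfile import NamedTemporaryFile

# mode="w+" opens the file for text reading and writing
# (the default for NamedTemporaryFile is binary, "w+b").
with NamedTemporaryFile(mode="w+") as tmp:
    tmp.write(">seq1\nACGT\n")   # made-up sample contents
    at_end = tmp.read()          # position is at the end, so this is empty
    tmp.seek(0)                  # rewind to the start of the file
    from_start = tmp.read()      # now the full contents come back

print(repr(at_end))
print(repr(from_start))
```

The first read comes back empty for the same reason check_file() printed nothing: the loop started at the end of the data.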

For your second question, note that NamedTemporaryFile works like TemporaryFile in that:

It will be destroyed as soon as it is closed (including an implicit close when the object is garbage collected).

If you create the file inside a function and return only its name, the last reference to the file object disappears when the function returns. The object is then garbage collected, which closes the file and deletes it. You'll need to keep a reference to the file object alive in order to keep it from being collected. You can do this by returning the file object from the function (and making sure you assign it to something). Here's a trivial example:

from tempfile import NamedTemporaryFile

def mycreate():
    return NamedTemporaryFile()

def mywrite(f, i):
    f.write(i)

def myread(f):
    f.seek(0)         # rewind before reading
    return f.read()

f = mycreate()        # 'f' is now a reference to the file created in the function,
                      # which will keep it from being garbage collected
mywrite(f, b'Hi')
print(myread(f))      # prints the data written above
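
The lifetime rule can also be checked directly. The following is a sketch for Python 3, not part of the original code: the helper names create_and_drop and create_and_keep are hypothetical, and the immediate deletion in the first case relies on CPython's reference counting (the gc.collect() call is there as a nudge for other interpreters).

```python
import gc
import os
from tempfile import NamedTemporaryFile

def create_and_drop():
    # Hypothetical helper: returns only the path, so the file object
    # has no remaining references once the function returns.
    tmp = NamedTemporaryFile(mode="w+")
    tmp.write("data\n")
    return tmp.name

def create_and_keep():
    # Hypothetical helper: returns the file object itself.
    tmp = NamedTemporaryFile(mode="w+")
    tmp.write("data\n")
    tmp.flush()
    return tmp

dropped_path = create_and_drop()
gc.collect()                                   # not needed on CPython
dropped_exists = os.path.exists(dropped_path)  # file was removed on collection

kept = create_and_keep()
kept_exists = os.path.exists(kept.name)        # still there while 'kept' is alive
kept.close()
closed_exists = os.path.exists(kept.name)      # closing deletes the file

print(dropped_exists, kept_exists, closed_exists)
```

If only the path needs to survive, NamedTemporaryFile(delete=False) leaves the file on disk after it is closed; you are then responsible for removing it yourself, for example with os.remove().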
glibdud
  • Awesome, that makes sense, thank you. In one of my functions, I have to call an external program with `Popen`, so I think I need a named file, right? But if I only needed the file within my Python script, would using `TemporaryFile` rather than the named version make more sense? – kevbonham Feb 05 '16 at 16:50
  • If you can do everything you need to do without closing the file (or opening it elsewhere), then yes, `TemporaryFile` would make sense. In this case, since you're calling an external program that needs to open the file, yes, you'll want a `NamedTemporaryFile`. – glibdud Feb 05 '16 at 16:52
  • OK, I have something weird for you. I now have a function that takes the temporary file as an argument and then calls an external program on it. `ext_call(temp_file); #call external program`. When I do this, the program returns an error saying the file doesn't exist. But if, within the same function I do `print os.path.isfile(temp_file)`, it prints `True` AND the external program runs without issue. WTF is going on here? – kevbonham Feb 10 '16 at 22:00
  • @kevbonham Sounds like that would be best handled as a new question with new context. – glibdud Feb 11 '16 at 02:57
  • I thought it might, but wanted to be sure it wasn't something obvious... I will write it up in the morning, thanks – kevbonham Feb 11 '16 at 02:58
  • Well, never mind - seems I can't replicate the bug this morning. It's working fine now. Very odd. – kevbonham Feb 11 '16 at 15:31
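
For the external-program case raised in the comments, one possible shape is sketched below for Python 3. The child Python process is only a stand-in for the real external program, and the ".fna" suffix and sample record are made up. Reopening the file by name while it is still open works on POSIX systems; on Windows it may not with the default settings.

```python
import subprocess
import sys
from tempfile import NamedTemporaryFile

with NamedTemporaryFile(mode="w+", suffix=".fna") as tmp:
    tmp.write(">test|1|example\nACGT\n")  # made-up sample record
    tmp.flush()  # push buffered data to disk so another process can read it

    # Stand-in for the external program: a child process that echoes the file.
    child_code = "import sys; sys.stdout.write(open(sys.argv[1]).read())"
    output = subprocess.check_output(
        [sys.executable, "-c", child_code, tmp.name]
    )

print(output.decode())
```

The flush() call matters because writes are buffered: without it, a process that opens the file by name may see less data than was written.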