39

I have a temporary file with some content and a Python script that writes output to this file. I want this to repeat N times, so I need to reuse that file (actually an array of files). I'm deleting the whole content, so the temp file will be empty in the next cycle. For deleting content I use this code:

def deleteContent(pfile):

    pfile.seek(0)
    pfile.truncate()
    pfile.seek(0) # I believe this seek is redundant

    return pfile

tempFile=deleteContent(tempFile)

My question is: Is there any other (better, shorter or safer) way to delete the whole content without actually deleting the temp file from disk?

Something like tempFile.truncateAll()?

bartimar
  • 1
    The second seek is indeed redundant. Why not just create a **new** temporary file? – Martijn Pieters Jun 15 '13 at 17:17
  • Because for one common script run I will then need like ~400 temporary files instead of ~10. So I think it's better to recycle them. Am I wrong? – bartimar Jun 15 '13 at 17:36
  • Have you run into any actual problems? I'd just create new temporary files and let Python and the OS clean up the ones I closed. – Martijn Pieters Jun 15 '13 at 17:39
  • Actually deleting and closing them would be more lines of confusing code. I don't have problems with my solution, I just want to know other ways to do it and test the performance (while keeping the code simple). – bartimar Jun 15 '13 at 17:44
  • 2
    If you are using the [`tempfile` module](http://docs.python.org/2/library/tempfile.html) you don't need to delete *anything*. Use the temporary file as a context manager (`with ...`) and it'll be closed automatically as well. – Martijn Pieters Jun 15 '13 at 17:45

5 Answers

82

How to delete only the content of file in python

There are several ways of setting the logical size of a file to 0, depending on how you access that file:

To empty an open file:

def deleteContent(pfile):
    pfile.seek(0)
    pfile.truncate()

To empty an open file whose file descriptor is known:

import os

def deleteContent(fd):
    os.ftruncate(fd, 0)
    os.lseek(fd, 0, os.SEEK_SET)

To empty a closed file (whose name is known):

def deleteContent(fName):
    with open(fName, "w"):
        pass
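
As a minimal sketch of the first variant applied to the question's use case (one temporary file reused over several cycles), something like the following should work. The names `delete_content`, `results` and the cycle count of 3 are made up for illustration; the file is opened in text mode (`"w+"`) so that plain strings can be written.

```python
import tempfile

def delete_content(pfile):
    # Move to the start first, then cut the file at the current position.
    pfile.seek(0)
    pfile.truncate()

results = []
with tempfile.TemporaryFile(mode="w+") as temp:
    for cycle in range(3):
        # Each cycle starts with an empty file positioned at offset 0.
        temp.write("output of cycle {}\n".format(cycle))
        temp.seek(0)
        results.append(temp.read())
        delete_content(temp)

print(results)
```

Each cycle reads back only what that cycle wrote, which suggests the file really is empty again after `delete_content`.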


I have a temporary file with some content [...] I need to reuse that file

That being said, in the general case it is probably neither efficient nor desirable to reuse a temporary file. Unless you have very specific needs, you should think about using tempfile.TemporaryFile and a context manager to almost transparently create/use/delete your temporary files:

import tempfile

with tempfile.TemporaryFile() as temp:
    # do whatever you want with `temp`
    pass

# <- `tempfile` guarantees the file being both closed *and* deleted
#     on the exit of the context manager
The Amateur Coder
Sylvain Leroux
  • `pfile.truncate(0)` won't reset the file pointer, so you'll need to do a `pfile.seek(0)` either way. Same applies to `os.ftruncate()`. FWIW, you can get the file descriptor from `pfile.fileno()`, so `os.ftruncate(pfile.fileno(), 0)` would work, but you'd still need to do the `pfile.seek(0)` afterwards. – Aya Jun 15 '13 at 17:33
  • 2
    From http://docs.python.org/2/library/stdtypes.html#file.truncate `Note that if a specified size exceeds the file’s current size, the result is platform-dependent: possibilities include that the file may remain unchanged, increase to the specified size as if zero-filled, or increase to the specified size with undefined new content. ` That is why I didn't do it. – bartimar Jun 15 '13 at 17:34
  • I was indeed looking at that doc right now. I understand that the file pointer could stay at its position if it is still valid (i.e.: pointing before the new logical end of the file). But what if we truncate the file before the current position? So I made the test. On Linux, `truncate(0)` doesn't move the current position as reported by `ftell()` -- but subsequent writes are made at the beginning of the file as expected. – Sylvain Leroux Jun 15 '13 at 17:41
  • @bartimar Ultimately, those will just call one of [`truncate(2)`](http://linux.die.net/man/2/truncate) or `ftruncate(2)`, so the manpage is probably the best documentation. – Aya Jun 15 '13 at 17:42
  • 1
    @SylvainLeroux Not for me it doesn't. `f = open('foo', 'wb'); f.write('foo'); f.truncate(0); f.write('foo'); print f.tell()` prints `6`. – Aya Jun 15 '13 at 17:44
  • @Aya Sorry, I haven't made myself clear enough: after `truncate(0)` `tell()` reports a position "past the end" of the file. But if you `flush` or `close` your file and examine it from the outside world, you will see new content is written at the beginning of the file. As expected. `f = open('foo', 'wb'); f.write('Hello'); f.truncate(0); print f.tell(); f.write('Bonjour'); print f.tell(); f.close()` will report `5` and `12` resp. at `tell()` -- but the content of the file will be `Bonjour`. – Sylvain Leroux Jun 15 '13 at 17:50
  • 1
    @SylvainLeroux For me the content is `"\x00\x00\x00\x00\x00Bonjour"`. Do an `xxd` on `foo` to check. So, in effect, you're creating a [sparse file](http://en.wikipedia.org/wiki/Sparse_file). – Aya Jun 15 '13 at 17:53
  • @Aya That's funny, because I don't have sparse bytes at the beginning of my file (Python 2.6, Ext3 file system) ?!? But the behavior _you_ observe is more what I had expected first. Strange we don't have the same result here... – Sylvain Leroux Jun 15 '13 at 17:56
  • @SylvainLeroux It may be OS-dependent. I'm using Ubuntu 13.04, Python 2.7.4, and ext4. – Aya Jun 15 '13 at 17:58
  • @Aya Ah ah! I get sparse bytes if I do the test on a file opened as binary `wb`. But not if I open the file as text `wt`. Could you confirm that? – Sylvain Leroux Jun 15 '13 at 17:59
  • 2
    @SylvainLeroux I get the leading NULLs either way. Linux ignores the `b` flag anyway. From [`fopen(3)`](http://linux.die.net/man/3/fopen)... "The mode string can also include the letter 'b' either as a last character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with C89 and has no effect; the 'b' is ignored on all POSIX conforming systems, including Linux." – Aya Jun 15 '13 at 18:03
  • @Aya OK, I'm going crazy -- or it is really time to go to bed? Anyway you are right. I don't know what I did previously, but on re-testing carefully, I obtain a sparse file in both cases. Sorry for wasting your time ;) I removed the "truncate without seek" from my answer. – Sylvain Leroux Jun 15 '13 at 18:08
  • sorry going in very late, but is there no way to delete without having to call `.seek(0)` and also delete the contents from the beginning? – Charlie Parker Feb 07 '17 at 14:00
  • @CharlieParker: The `seek(0)` isn't necessary if you don't write anything to the file after (when you want to empty the file immediately before closing it); in that case, just `fileobj.truncate(0)` is enough. But if you are going to write to the file after, the `seek` is necessary, otherwise you get non-portable behavior (e.g. the sparse files mentioned in this comment thread). `.seek(0)` followed by `.truncate()` (or `.truncate(0)` to save an `lseek` call under the hood) both truncates and ensures the file acts like a normal, newly opened empty file. – ShadowRanger Mar 13 '19 at 16:16
  • @SylvainLeroux: Minor note: Your "To empty a closed file (whose name is known)" solution will *create* the empty file if it doesn't exist. A solution that doesn't create the file (raises exception if it doesn't exist), and doesn't require setting up a Python level file object is just `def deleteContent(fName): os.truncate(fName, 0)`. Requires Python 3.3+ though. – ShadowRanger Mar 13 '19 at 16:20
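
The pitfall discussed in the comment thread above can be reproduced with a short sketch. It is written for Python 3 (the comments use Python 2 syntax), and the helper name `write_after_truncate` is made up for illustration. On Linux the variant without `seek(0)` is reported above to leave leading NUL bytes; the exact padding may differ on other platforms, so the sketch only prints the result rather than assuming it.

```python
import os
import tempfile

def write_after_truncate(seek_first):
    """Write, truncate to zero, write again; return the final file bytes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"Hello")
            if seek_first:
                f.seek(0)      # reset the position before truncating
            f.truncate(0)      # the size becomes 0, the position is left alone
            f.write(b"Bonjour")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

with_seek = write_after_truncate(seek_first=True)
without_seek = write_after_truncate(seek_first=False)

print(repr(with_seek))
print(repr(without_seek))
```

With the seek, the file holds only the second write. Without it, the second write lands at the old offset 5, so the file ends up 12 bytes long.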
7

I think the easiest is to simply open the file in write mode and then close it. For example, if your file myfile.dat contains:

"This is the original content"

Then you can simply write:

f = open('myfile.dat', 'w')
f.close()

This would erase all the content. Then you can write the new content to the file:

f = open('myfile.dat', 'w')
f.write('This is the new content!')
f.close()
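
One caveat raised in a comment on another answer: opening in `'w'` mode creates the file if it does not exist. A hedged sketch of the alternative mentioned there, `os.truncate` (Python 3.3+), is shown below. The helper name `empty_existing_file` and the file names are placeholders for illustration.

```python
import os
import tempfile

def empty_existing_file(path):
    # Truncates in place; raises FileNotFoundError instead of creating the file.
    os.truncate(path, 0)

with tempfile.TemporaryDirectory() as workdir:
    existing = os.path.join(workdir, "myfile.dat")
    missing = os.path.join(workdir, "missing.dat")

    with open(existing, "w") as f:
        f.write("This is the original content")

    empty_existing_file(existing)
    size_after = os.path.getsize(existing)

    try:
        empty_existing_file(missing)
        raised = False
    except FileNotFoundError:
        raised = True
    created = os.path.exists(missing)

print(size_after, raised, created)
```

Which behavior is preferable depends on whether a missing file should be treated as an error in your script.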
Peaceful
2

What could be easier than something like this:

import tempfile

for i in range(400):
    # mode='w+' opens the file in text mode so str can be written on Python 3
    with tempfile.TemporaryFile(mode='w+') as tf:
        for j in range(1000):
            tf.write('Line {} of file {}\n'.format(j,i))

That creates 400 temp files and writes 1000 lines to each temp file. It executes in less than 1/2 second on my unremarkable machine. Each temp file is created when its `with` block is entered and deleted when the block exits. It is fast, secure, and cross-platform.

Using tempfile is a lot better than trying to reinvent it.
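
If you want to check the timing on your own machine, a rough sketch along these lines could compare fresh temp files against one recycled file. The function names (`fresh_files`, `reused_file`, `timed`) are invented for this example, and a single wall-clock run like this is only a crude measurement, not a proper benchmark.

```python
import tempfile
import time

def fresh_files(n_files, n_lines):
    # A new temporary file per cycle, cleaned up by the context manager.
    for i in range(n_files):
        with tempfile.TemporaryFile(mode="w+") as tf:
            for j in range(n_lines):
                tf.write("Line {} of file {}\n".format(j, i))

def reused_file(n_files, n_lines):
    # One temporary file, emptied with seek(0) + truncate() before each cycle.
    with tempfile.TemporaryFile(mode="w+") as tf:
        for i in range(n_files):
            tf.seek(0)
            tf.truncate()
            for j in range(n_lines):
                tf.write("Line {} of file {}\n".format(j, i))

def timed(func, *args):
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start

fresh_seconds = timed(fresh_files, 400, 1000)
reused_seconds = timed(reused_file, 400, 1000)
print("fresh files: {:.3f}s, reused file: {:.3f}s".format(fresh_seconds, reused_seconds))
```

Results will vary with the OS, file system and Python version, so treat the numbers as a local data point only.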

dawg
  • 1
    I think that `seek(0)` and `truncate()` without the for loop is actually easier, better, (maybe faster), and nicer to the OS/Python :) I was afraid that someone would get hung up on the reusing/recycling... Still, my question is the same, so this actually isn't the answer. – bartimar Jun 15 '13 at 18:46
  • 2
    Have you tested that assumption? Have you timed it to see? – dawg Jun 15 '13 at 19:38
2

You can do this:

def deleteContent(pfile):
    fn=pfile.name 
    pfile.close()
    return open(fn,'w')
the wolf
0
with open(Test_File, 'w') as f:
    f.truncate(0)

I found this way easy. You may try this.

Arti