41

How can one delete the very last line of a file with Python?

Input File example:

hello
world
foo
bar

Output File example:

hello
world
foo

I've created the following code to find the number of lines in the file - but I do not know how to delete the specific line number.

    try:
        # Count the lines only if the file could actually be opened
        with open("file") as f:
            countLines = len(f.readlines())
    except IOError:
        print("Failed to read file.")
jonrsharpe
torger
  • 2
    Are you trying to actually remove the line from the file, on disk? If so, make sure you understand that files don't have "lines" from the filesystem's point of view. Lines are a convention of programmers and programs. What you see as a "line" is a sequence of bytes somewhere in the middle of lots of other bytes. To remove the last "line", you could truncate the file at the byte corresponding to the first character in the line. That's not difficult (you just have to find it), but there's not much point if the files involved are not many megabytes in size. – Peter Hansen Dec 10 '09 at 01:15
  • What if the last line is an empty line? – FogleBird Dec 10 '09 at 01:33
  • Last line is not blank. I remove all blank lines with another python snippet (from google). – torger Dec 10 '09 at 01:39
  • ? The file contains no blank lines? The example above is what you should look at, nothing else. The last line is what I need to remove. Why the condescension? I've almost got it with Strawberry's answer. – torger Dec 10 '09 at 01:51
  • The file in question is not in memory - it is as is above. – torger Dec 10 '09 at 01:55
  • There was no condescension in my questions... just puzzlement, and maybe skepticism that you're doing this in a sensible manner. *You* wrote about the blank line removal. If the file is in memory, it's not a file, it's a list of strings. If you're already using Python on this "file" to remove blank lines, and this is an entirely separate step, then you're processing this data twice, inefficiently. These are all simple facts, but I'll stop now, if you don't like the help. – Peter Hansen Dec 10 '09 at 16:46

10 Answers

86

Because I routinely work with many-gigabyte files, looping through as mentioned in the answers didn't work for me. The solution I use:

import os
import sys

with open(sys.argv[1], "r+", encoding="utf-8") as file:

    # Move the pointer (similar to a cursor in a text editor) to the end of the file
    file.seek(0, os.SEEK_END)

    # Skip the very last character in the file: if the file ends with a newline
    # (an empty last line), that empty line and the line before it are both deleted.
    # Note: stepping positions by 1 assumes single-byte characters near the end
    # of the file (e.g. ASCII); binary mode ("rb+") avoids that assumption.
    pos = file.tell() - 1

    # Read one character at a time, moving backwards from the penultimate
    # character, until a newline is found or the start of the file is reached
    while pos > 0 and file.read(1) != "\n":
        pos -= 1
        file.seek(pos, os.SEEK_SET)

    # So long as we're not at the start of the file, delete everything from
    # this position onwards (including the newline that was found)
    if pos > 0:
        file.seek(pos, os.SEEK_SET)
        file.truncate()
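
A binary-mode variant of the same idea, along the lines JrtPec's comment suggests, might look like the sketch below. The function name `remove_last_line` is hypothetical, and the sketch assumes lines end in `b"\n"` (which also covers `\r\n`). Unlike the code above, it keeps the newline that ends the new last line.

```python
import os


def remove_last_line(path):
    """Truncate the file at `path` so that its final line is dropped.

    Hypothetical helper: assumes lines are terminated by b"\n".
    """
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        if end == 0:
            return  # empty file, nothing to remove

        # Ignore one trailing newline so it is not mistaken for the
        # boundary of the last line.
        pos = end - 1
        f.seek(pos)
        if f.read(1) == b"\n":
            pos -= 1

        # Scan backwards for the newline that ends the previous line.
        while pos >= 0:
            f.seek(pos)
            if f.read(1) == b"\n":
                break
            pos -= 1

        # Keep everything up to and including that newline
        # (pos is -1 if the file held a single line, giving truncate(0)).
        f.truncate(pos + 1)
```

Only the tail of the file is read, so the cost depends on the length of the last line rather than on the size of the file.
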
Andre Miras
Saqib
  • 4
    this is the best answer. use "with" statement to save a line :) – cppython Feb 25 '15 at 02:18
  • 6
    I ran into some compatibility issues (using Py3) when using this method on files that were used on both mac and windows, because internally Mac uses a different line terminator than Windows (which uses 2: cr and lf). The solution was to open the file in binary read mode ("rb+"), and search for the binary newline character b"\n". – JrtPec Oct 06 '16 at 15:49
  • If you open the file with `"a+"` instead of `"r+"`, can you skip the `file.seek(0, os.SEEK_END)`? – TheLizzard Jun 26 '23 at 18:09
23

You could use the above code and then:-

lines = file.readlines()
lines = lines[:-1]

This would give you an array of lines containing all lines but the last one.
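
To actually change the file on disk, the shortened list still has to be written back. A minimal sketch, assuming the file fits comfortably in memory (the function name `drop_last_line_in_memory` is made up for illustration):

```python
def drop_last_line_in_memory(path):
    """Rewrite `path` without its last line; reads the whole file into memory."""
    with open(path) as f:
        lines = f.readlines()
    # lines[:-1] is an empty list for an empty or one-line file,
    # so those cases simply produce an empty file.
    with open(path, "w") as f:
        f.writelines(lines[:-1])
```
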

Martin
  • 7
    Will this work well for large files? E.g. thousands of lines? – torger Dec 10 '09 at 01:03
  • 3
    It might not work well for files bigger than a megabyte or two. Depends on your definition of "well". It should be perfectly fine for any desktop use for a few thousand lines. – Paul McMillan Dec 10 '09 at 01:04
  • Well - Within a second or two. – torger Dec 10 '09 at 01:07
  • Is there no other way to directly delete a specific line? Or is an array the way to go? – torger Dec 10 '09 at 01:08
  • Nazarius: There isn't any way to delete a specific line. You can however truncate a file or append to it. Since you want to delete the last line, you can just truncate. – Laurence Gonsalves Dec 10 '09 at 01:17
  • @torger an option could be to use `os.system("sed '$d' file")` to run `sed`, since a compiled binary will generally be faster on big files and for processing in general. Truncating the file seems to be the fastest way. Anyway, this question has many useful options :) +1 for this question. – m3nda Dec 07 '15 at 21:42
  • Would this read the complete file from start to end? – alper Jun 02 '21 at 20:47
  • @alper Yes, in this example it would read all the lines into an array in memory. – Martin Jun 04 '21 at 14:28
11

This doesn't use Python, but Python's the wrong tool for the job if this is the only task you want. You can use the standard *nix utility head (the negative line count is a GNU coreutils extension, so it may not work with other implementations), and run

head -n -1 filename > newfile

which will copy all but the last line of filename to newfile.

Peter
7

Assuming you have to do this in Python and that you have a large enough file that list slicing isn't sufficient, you can do it in a single pass over the file:

last_line = None
with open("file") as f:
    for line in f:
        if last_line is not None:
            print(last_line, end="")  # or write to a file, call a function, etc.
        last_line = line

Not the most elegant code in the world but it gets the job done.

Basically, it buffers each line of the file in the last_line variable; each iteration outputs the previous iteration's line, so the final line is never output.
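
To update the file itself with this one-line-behind approach, one option is to stream into a temporary file and then swap it in. This is only a sketch: the name `drop_last_line_streaming` is hypothetical, and it assumes the directory containing the file is writable.

```python
import os
import tempfile


def drop_last_line_streaming(path):
    """Copy all but the last line of `path` to a temp file, then replace `path`."""
    dir_name = os.path.dirname(os.path.abspath(path))
    with open(path, "rb") as src, tempfile.NamedTemporaryFile(
        "wb", dir=dir_name, delete=False
    ) as dst:
        previous = None
        for line in src:
            # Write the line from the previous iteration; the final
            # line is held in `previous` and never written.
            if previous is not None:
                dst.write(previous)
            previous = line
    # Both files are closed here; swap the new file into place.
    os.replace(dst.name, path)
```

Memory use stays at roughly one line regardless of file size, at the cost of rewriting the whole file. Note that the replacement file gets the temporary file's permissions, not the original's.
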

Dan Head
5

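Here is my solution for Linux users:

import subprocess
file_path = 'test.txt'
# Passing an argument list avoids shell quoting problems with unusual file names
subprocess.run(['sed', '-i', '$ d', file_path], check=True)

There is no need to read and iterate through the file in Python.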

Moj
3

On systems where file.truncate() works, you could do something like this:

pos = next_pos = 0
with open('file.txt', 'rb') as f:
    for line in f:
        pos = next_pos  # position of the beginning of this line
        next_pos += len(line)  # position of the beginning of the next line
with open('file.txt', 'rb+') as f:
    f.truncate(pos)  # cut the file where the last line begins

According to my tests, file.tell() doesn't work when reading by line, presumably due to buffering confusing it. That's why this adds up the lengths of the lines to figure out positions. Note that this only works on systems where the line delimiter ends with '\n'.

Laurence Gonsalves
1

Here's a more general memory-efficient solution allowing the last 'n' lines to be skipped (like the head command):

import collections
import fileinput

def head(filename, lines_to_delete=1):
    queue = collections.deque()
    lines_to_delete = max(0, lines_to_delete)
    for line in fileinput.input(filename, inplace=True, backup='.bak'):
        queue.append(line)
        if lines_to_delete == 0:
            # With inplace=True, standard output is redirected into the file
            print(queue.popleft(), end='')
        else:
            lines_to_delete -= 1
    queue.clear()
Ned Deily
1

Inspired by previous posts, I propose this:

import os

# Binary mode: Python 3 text files reject nonzero seeks relative to the current position
with open('file_name', 'rb+') as f:
  f.seek(0, os.SEEK_END)
  while f.tell() and f.read(1) != b'\n':
    f.seek(-2, os.SEEK_CUR)
  f.truncate()
0

Though I have not tested it (please, no hate for that), I believe that there's a faster way of doing it. It's more of a C solution, but quite possible in Python. It's not Pythonic, either. It's a theory, I'd say.

First, you need to know the encoding of the file. Set a variable, say CHARsize, to the number of bytes a character uses in that encoding (1 byte for an ASCII file).

Then grab the size of the file, set FILEsize to it.

Assume you have the address of the file (in memory) in FILEadd.

Add FILEsize to FILEadd.

Move backwards (increment by -1 * CHARsize), testing each CHARsize bytes for a \n (or whatever newline your system uses). When you reach the first \n, you now have the position of the beginning of the last line of the file. Replace \n with \x1a (26, the ASCII for EOF, or whatever that is on your system/with the encoding).

Clean up however you need to (change the filesize, touch the file).

If this works as I suspect it would, you're going to save a lot of time, as you don't need to read through the whole file from the beginning, you read from the end.
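
The "address of the file in memory" idea maps fairly naturally onto Python's `mmap` module. The sketch below is one possible reading of the steps above, not the answer's own code: it assumes a single-byte newline `b"\n"` (CHARsize of 1), uses the hypothetical name `truncate_last_line_mmap`, and changes the file size with `truncate()` rather than writing an EOF marker.

```python
import mmap
import os


def truncate_last_line_mmap(path):
    """Drop the last line of `path` by searching backwards through a memory map."""
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size == 0:
            return  # an empty file cannot be mapped, and has no last line

        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Ignore one trailing newline (byte value 0x0A) if present.
            end = size - 1 if mm[size - 1] == 0x0A else size
            # rfind returns -1 when no newline is found, giving cut == 0.
            cut = mm.rfind(b"\n", 0, end) + 1

        # The map is closed before the file size is changed.
        f.truncate(cut)
```

As the answer expects, only the end of the file is touched: the search starts at the last byte and stops at the first newline it meets.
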

Isaac
  • Note that the whole \x1a (aka ^Z aka CTRL-Z aka EOF, which is actually SUB in ASCII) thing is totally last century... very few text files are terminated with an actual SUB character any more, and even those are pretty much limited to Windows/DOS systems. And CPM I think. – Peter Hansen Dec 10 '09 at 01:47
  • Ah good point - I wasn't sure if it was still in widespread use... can something else be used to salvage this technique? – Isaac Dec 10 '09 at 02:19
0

Here's another way, without slurping the whole file into memory:

p = None
with open("file") as f:
    for line in f:
        line = line.rstrip("\n")
        if p is not None:
            print(p)  # print the previous line, so the last one is never printed
        p = line
ghostdog74