I recently came across an answer that uses the code below to remove the last line from a big file in Python. It is very fast and efficient, but I cannot adapt it to delete the first line of a file. Can anyone please help? Here is that answer: https://stackoverflow.com/a/10289740/9311781 The code is below:

import os
import sys

with open(sys.argv[1], "r+", encoding="utf-8") as file:

    # Move the pointer (similar to a cursor in a text editor) to the end of the file
    file.seek(0, os.SEEK_END)

    # Start one position before the end, so the search skips the very last
    # character in the file - i.e. if the file ends with an empty line,
    # we delete that empty line and the penultimate one
    pos = file.tell() - 1

    # Read each character in the file one at a time from the penultimate
    # character going backwards, searching for a newline character
    # If we find a new line, exit the search
    while pos > 0 and file.read(1) != "\n":
        pos -= 1
        file.seek(pos, os.SEEK_SET)

    # So long as we're not at the start of the file, delete all the characters ahead
    # of this position
    if pos > 0:
        file.seek(pos, os.SEEK_SET)
        file.truncate()

  • Removing the last line is efficient because the `truncate` operation is efficient. Removing the first line requires rewriting the whole file, from the second line up to the end. – Wojciech Kaczmarek Jun 22 '22 at 12:08
  • Deleting the first line is fundamentally different from deleting the last. The code you have starts at the end of the file, searches backwards for a line ending, then truncates the file at that point. It will never work for the first line. – Mark Setchell Jun 22 '22 at 12:08
  • Duplicate of https://stackoverflow.com/questions/54625409/how-to-effectively-truncate-the-head-of-the-file – Wouter Jun 22 '22 at 12:12

1 Answer

As the comments already mentioned, removing the last line like that is easy because you can "truncate" the file at a given position, and afaik that works across multiple operating systems and filesystems.

However, similarly truncating from the start of the file is not a standard operation in many filesystems. Linux does support this on new enough kernels (3.15 and later, afaik) and Macs might have something similar too.

You could try the fallocate package from PyPI[0], or implement something similar using the underlying syscall[1], if your OS/filesystem is compatible.
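
If neither of those is an option, the portable fallback is what the comments describe: shift everything after the first newline towards the start of the file, then truncate the leftover tail. Below is a minimal sketch of that idea, not something taken from the linked answer. The function name `remove_first_line` and the `chunk_size` parameter are made up for illustration. It opens the file in binary mode, which should be safe for UTF-8 because the newline byte never occurs inside a multi-byte character. It still reads and rewrites the whole file, so it will be much slower than the truncate trick on big files.

```python
def remove_first_line(path, chunk_size=1024 * 1024):
    """Remove the first line of the file at `path` in place (illustrative helper)."""
    with open(path, "r+b") as f:
        # Reading one line in binary mode leaves the offset just past the first b"\n"
        f.readline()
        read_pos = f.tell()
        write_pos = 0

        while True:
            # Read the next chunk of the remaining content
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            read_pos += len(chunk)

            # Write it back earlier in the file; write_pos always trails read_pos,
            # so unread data is never overwritten
            f.seek(write_pos)
            f.write(chunk)
            write_pos += len(chunk)

        # Everything past write_pos is a stale copy of the tail, so cut it off
        f.truncate(write_pos)
```

Calling `remove_first_line("big.txt")` (the file name is just a placeholder) modifies the file in place using only one chunk of memory at a time. If the process is interrupted halfway, the file is left partially shifted, so for data you care about, writing to a temporary file and renaming it over the original may be the safer choice.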

rasjani