How to implement something like the 'head' and 'tail' commands in Python and read a text file backward, line by line?
10
-
possible duplicate of [Read a file in reverse order using python](http://stackoverflow.com/questions/2301789/read-a-file-in-reverse-order-using-python) – Greg Hewgill May 05 '11 at 10:23
-
I need to read a big log file backward – user739650 May 05 '11 at 10:29
-
I'm guessing you're not familiar with [tac](http://www.gnu.org/software/coreutils/manual/html_node/tac-invocation.html) then, because your question would just be "Implement tac in python". – MattH May 05 '11 at 10:33
-
possible duplicate of [Get last n lines of a file with Python, similar to tail](http://stackoverflow.com/questions/136168/get-last-n-lines-of-a-file-with-python-similar-to-tail) – S.Lott May 05 '11 at 10:46
3 Answers
27
This is my personal file class ;-)
class File(file):
    """ A helper class for file reading """
    def __init__(self, *args, **kwargs):
        super(File, self).__init__(*args, **kwargs)
        self.BLOCKSIZE = 4096
    def head(self, lines_2find=1):
        self.seek(0) #Rewind file
        return [super(File, self).next() for x in xrange(lines_2find)]
    def tail(self, lines_2find=1):
        self.seek(0, 2) #Go to end of file
        bytes_in_file = self.tell()
        lines_found, total_bytes_scanned = 0, 0
        while (lines_2find + 1 > lines_found and
               bytes_in_file > total_bytes_scanned):
            byte_block = min(
                self.BLOCKSIZE,
                bytes_in_file - total_bytes_scanned)
            self.seek( -(byte_block + total_bytes_scanned), 2)
            total_bytes_scanned += byte_block
            lines_found += self.read(self.BLOCKSIZE).count('\n')
        self.seek(-total_bytes_scanned, 2)
        line_list = list(self.readlines())
        return line_list[-lines_2find:]
    def backward(self):
        self.seek(0, 2) #Go to end of file
        blocksize = self.BLOCKSIZE
        last_row = ''
        while self.tell() != 0:
            try:
                self.seek(-blocksize, 1)
            except IOError:
                blocksize = self.tell()
                self.seek(-blocksize, 1)
            block = self.read(blocksize)
            self.seek(-blocksize, 1)
            rows = block.split('\n')
            rows[-1] = rows[-1] + last_row
            while rows:
                last_row = rows.pop(-1)
                if rows and last_row:
                    yield last_row
        yield last_row
Example usage:
with File('file.name') as f:
    print f.head(5)
    print f.tail(5)
    for row in f.backward():
        print row

fdb
-
Does anyone have a Python 3 version of this? I'm getting: NameError: name 'file' is not defined – Reddspark May 10 '17 at 10:54
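One possible Python 3 take on the same idea, offered as a minimal sketch and not as a port by the answer's author. Python 3 has no built-in `file` class to subclass, so this version uses plain functions instead. The names `read_backward`, `head` and `tail`, and the `blocksize` and `encoding` parameters, are hypothetical choices for illustration. It assumes `\n` line endings and an encoding such as UTF-8 in which the newline byte never occurs inside a multi-byte character. Unlike the class above, it keeps blank lines and returns lines without their trailing newline.

```python
import os
from contextlib import closing
from itertools import islice


def read_backward(path, blocksize=4096, encoding="utf-8"):
    """Yield the lines of a file from last to first, without newlines."""
    with open(path, "rb") as f:  # binary mode allows arbitrary seeks
        f.seek(0, os.SEEK_END)
        position = f.tell()
        if position == 0:
            return  # empty file: nothing to yield
        carry = b""  # start of a line whose beginning is in an earlier block
        at_end = True
        while position > 0:
            step = min(blocksize, position)
            position -= step
            f.seek(position)
            block = f.read(step)
            if at_end and block.endswith(b"\n"):
                block = block[:-1]  # ignore the file's final newline
            at_end = False
            rows = (block + carry).split(b"\n")
            carry = rows.pop(0)  # possibly incomplete, keep for next round
            for row in reversed(rows):
                yield row.decode(encoding)
        yield carry.decode(encoding)  # the first line of the file


def tail(path, n=1, **kwargs):
    """Return the last n lines in file order."""
    with closing(read_backward(path, **kwargs)) as rows:
        last = list(islice(rows, n))
    last.reverse()
    return last


def head(path, n=1, encoding="utf-8"):
    """Return the first n lines."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in islice(f, n)]
```

Because `tail` stops pulling from the generator after `n` lines, only the last few blocks of a large log file are read.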
6
head is easy:
from itertools import islice
with open("file") as f:
    for line in islice(f, n):
        print line
tail is harder if you don't want to keep the whole file in memory. If the input is a file, you could start reading blocks beginning at the end of the file. The original tail also works if the input is a pipe, so a more general solution is to read and discard the whole input, except for the last few lines. An easy way to do this is collections.deque:
from collections import deque
with open("file") as f:
    for line in deque(f, maxlen=n):
        print line
In both these code snippets, n is the number of lines to print.
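For readers on Python 3, the two snippets can be wrapped into small functions that accept any iterable of lines, such as an open file or `sys.stdin` when reading from a pipe. This is a sketch only; the function names `head_lines` and `tail_lines` are made up for illustration, and an in-memory `io.StringIO` stands in for a real file.

```python
import io
from collections import deque
from itertools import islice


def head_lines(lines, n):
    """Return the first n items of an iterable of lines."""
    return list(islice(lines, n))


def tail_lines(lines, n):
    """Return the last n items; reads everything but keeps only n in memory."""
    return list(deque(lines, maxlen=n))


# io.StringIO stands in for an open text file or sys.stdin.
log = io.StringIO("a\nb\nc\nd\n")
print(head_lines(log, 2))  # ['a\n', 'b\n']
log.seek(0)
print(tail_lines(log, 2))  # ['c\n', 'd\n']
```

`tail_lines` still has to consume the whole input, so its running time grows with the size of the file even though its memory use does not.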

Sven Marnach
-
Very elegant, but tail using deque with huge log files (hundreds of MB) is too slow – user739650 May 05 '11 at 17:03
0
Tail:
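import os

def tail(fname, lines):
    """Read last N lines from file fname."""
    # Python 2 code: Python 3 text-mode files reject nonzero end-relative seeks.
    BUFSIZ = 1024
    with open(fname, 'r') as f:  # close the file when done
        f.seek(0, os.SEEK_END)
        fsize = f.tell()
        block = -1
        data = ""
        done = False  # renamed from `exit`, which shadows the builtin
        while not done:
            step = block * BUFSIZ
            if abs(step) >= fsize:
                # The next step would pass the start: read the whole file.
                f.seek(0)
                done = True
            else:
                f.seek(step, os.SEEK_END)
            data = f.read().strip()
            if data.count('\n') >= lines:
                break
            block -= 1
    return data.splitlines()[-lines:]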

Giampaolo Rodolà