I want to read a CSV file the way tail -f follows an error log, i.e. picking up new rows as they are appended.

I can already do this for a plain text file with this code:

while True:
    where = self.file.tell()
    line = self.file.readline()
    if not line:
        # Nothing new yet: wait, then retry from the same offset.
        print("No line waiting, waiting for one second")
        time.sleep(1)
        self.file.seek(where)
        continue
    if re.search('[a-zA-Z]', line) is None:
        # re.search returns None (never False) when nothing matches;
        # skip lines that contain no letters.
        continue
    response = self.naturalLanguageProcessing(line)
    if response is not None:
        response["id"] = self.id
        self.id += 1
        response["tweet"] = line
        self.saveResults(response)

How do I perform the same task for a CSV file? I have gone through a link that shows how to get the last 8 rows, but that is not what I need. The CSV file is being updated while I read it, and I need to get the newly appended rows.

1 Answer

Connecting A File Tailer To A csv.reader

To plug your code, which watches a file for newly appended content, into a csv.reader, you need to put it into the form of an iterator.
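As a minimal illustration of why the iterator form is enough: csv.reader accepts any object whose iteration yields strings, not only an open file. In this sketch, lines() is a hypothetical stand-in for a source that produces lines over time.

```python
import csv


def lines():
    """Hypothetical stand-in source; a real tailer would yield lines as they arrive."""
    yield "first,line\n"
    yield 'second,"quoted, field"\n'


# csv.reader pulls lines from the generator as it needs them and parses each into a row.
rows = list(csv.reader(lines()))
```

Because the reader only asks for the next line when a row is requested, a generator that blocks until new data arrives makes the reader block in the same way.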

I'm not intending to showcase correct code, but specifically to show how to adapt your existing code into this form, without making assertions about its correctness. In particular, the sleep() would be better replaced with a mechanism such as inotify to let the operating system assertively inform you when the file has changed; and the seek() and tell() would be better replaced with storing partial lines in memory rather than backing up and rereading them from the beginning over and over.

import csv
import time

class FileTailer(object):
    def __init__(self, file, delay=0.1):
        self.file = file
        self.delay = delay
    def __iter__(self):
        while True:
            where = self.file.tell()
            line = self.file.readline()
            if line and line.endswith('\n'): # only emit full lines
                yield line
            else:                            # for a partial line, pause and back up
                time.sleep(self.delay)       # ...not actually a recommended approach.
                self.file.seek(where)

csv_reader = csv.reader(FileTailer(open('myfile.csv')))
for row in csv_reader:
    print("Read row: %r" % (row,))

If you create an empty myfile.csv, start python csvtailer.py, and then echo "first,line" >>myfile.csv from a different window, you'll see the output of Read row: ['first', 'line'] immediately appear.
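The suggestion of storing partial lines in memory, instead of calling seek() and tell(), could look roughly like the sketch below. It assumes Python 3 and a file opened in text mode. It still polls with sleep() rather than using inotify, and the names follow, follow_csv and max_idle are hypothetical, introduced only for illustration.

```python
import csv
import time


def follow(f, delay=0.1, max_idle=None):
    """Yield complete lines as they are appended to the open text file f.

    An unterminated trailing line is held in memory until its newline
    arrives, so the file position never has to be rewound.
    max_idle is a hypothetical convenience: give up after roughly that many
    seconds without new data (None waits forever, like tail -f).
    """
    pending = ""   # partial line read so far, not yet emitted
    idle = 0.0     # seconds slept since data last arrived
    while max_idle is None or idle < max_idle:
        chunk = f.readline()
        if not chunk:
            # At end of file: wait, then poll again.
            time.sleep(delay)
            idle += delay
            continue
        idle = 0.0
        pending += chunk
        if pending.endswith("\n"):
            yield pending
            pending = ""


def follow_csv(f, **kwargs):
    """Parse CSV rows from the lines produced by follow()."""
    return csv.reader(follow(f, **kwargs))
```

Under these assumptions, `for row in follow_csv(open('myfile.csv')): print(row)` would behave like the example above while reading each byte of the file only once.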


Finding A Correct File Tailer In Python

For a correctly implemented iterator that waits for new lines to be available, consider referring to one of the existing Stack Overflow questions on the topic.

Charles Duffy