
I store thousands of time series in .csv files on a network drive. Before I update the files, I first get the last line of the file to see the timestamp and then I update with data after that timestamp. How can I quickly get the last line of a .csv file over a network drive so that I don't have to load the entire huge .csv file only to use the last line?

user1367204

1 Answer


There is a nifty `reversed` built-in for this, assuming you are using the standard-library csv module:

how to read a csv file in reverse order in python

In short:

import csv

# Note: this still reads the whole file into memory before reversing.
# The csv docs recommend opening with newline='' for csv.reader.
with open('some_file.csv', 'r', newline='') as f:
    for row in reversed(list(csv.reader(f))):
        print(', '.join(row))

In my test file of:

test, 1
test, 2
test, 3

This outputs:

test, 3
test, 2
test, 1
Douglas
  • I feel like this will still read the entire file into memory, while what I need is just to read the last row into memory, so that it doesn't waste time reading unnecessary stuff. – user1367204 Jun 21 '17 at 17:25
  • You could try being more explicit, without iterating over each row, by indexing: `print(', '.join(reversed(list(csv.reader(f))[-1])))`. I am sure there is a better way to do it, but that is what comes to mind... – Douglas Jun 21 '17 at 17:29
  • I just tested that and it was the same amount of time =( – user1367204 Jun 21 '17 at 17:37
  • There was another thread you may want to check out... haven't tried it, but it might be worth while... uses `seek`: https://stackoverflow.com/questions/260273/most-efficient-way-to-search-the-last-x-lines-of-a-file-in-python – Douglas Jun 21 '17 at 17:40
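The `seek`-based idea from that linked thread can be sketched roughly like this — a minimal sketch, assuming a UTF-8 text file, where `last_row` is just an illustrative helper name: seek to the end, read backwards in growing blocks until a complete final line is buffered, then parse only that line with `csv`. The network drive then only has to serve the last few kilobytes instead of the whole file.

```python
import csv
import io
import os

def last_row(path, encoding='utf-8'):
    """Return the last CSV row of `path` without reading the whole file."""
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        block = 1024
        data = b''
        # Read backwards in blocks until the buffer holds at least two
        # newlines (so the final line is known to be complete), or until
        # we have read the whole file.
        while size > 0 and data.count(b'\n') < 2:
            step = min(block, size)
            size -= step
            f.seek(size)
            data = f.read(step) + data
            block *= 2  # grow the block in case lines are long
    lines = [ln for ln in data.decode(encoding).splitlines() if ln.strip()]
    return next(csv.reader(io.StringIO(lines[-1])))
```

One caveat: this assumes rows don't contain embedded newlines inside quoted fields; for such files you would need a real reverse CSV parser rather than a line-based tail read.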