How can I chop a large and growing CSV file in Python?

Question

I have a growing.csv file that looks like this:

...
20211213 20:49:01,61826.0,61925.0,61928.0,1014.41
20211213 20:50:01,61839.0,62122.0,61928.0,1014.41
20211213 20:51:01,61901.0,62026.0,62035.0,1015.03
...

I'd like to keep only the latest ~10,000 rows in this file, since the program only uses the last ~7,500 rows. Is there a smart way to do this?

Hahaha... Sorry, I forgot how these editors work. – Dec 13 '21 at 19:16

score 0 · Answer 1 · answered Dec 13 '21 at 19:15

You can get the last 10,000 lines of a file with the tail command on your command line, e.g.

tail -n 10000 growing.csv > truncated.csv

If you have headers or top-level comments you would like to keep, you can similarly use the head command to get those too:

# write the first line of growing.csv to truncated.csv
# truncated.csv will be overwritten
head -n1 growing.csv > truncated.csv

# Now append the last 10k lines to the newly
# created/truncated file
tail -n 10000 growing.csv >> truncated.csv
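
Since the question asks about Python, the same head/tail idea can be sketched there as well. This is only a minimal sketch, not a definitive implementation: the function name truncate_csv and its parameters keep and has_header are made up for illustration, and it assumes that every row sits on one physical line (no quoted fields containing newlines) and that nothing is appending to the file while it is being rewritten.

```python
import os
import tempfile
from collections import deque


def truncate_csv(path, keep=10_000, has_header=False):
    """Rewrite `path` so it holds only the last `keep` data lines.

    Hypothetical helper: assumes one row per line and no concurrent writer.
    """
    with open(path, "r", newline="") as src:
        # Optionally set the first line aside so it survives the trim.
        header = src.readline() if has_header else ""
        # A bounded deque drops the oldest line once it is full, so
        # memory use is proportional to `keep`, not to the file size.
        last_lines = deque(src, maxlen=keep)

    # Write to a temporary file in the same directory, then swap it in,
    # so a crash does not leave a half-written CSV behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as dst:
        dst.write(header)
        dst.writelines(last_lines)
    os.replace(tmp_path, path)
```

Calling truncate_csv("growing.csv", keep=10_000) overwrites the file in place, unlike the shell version above, which writes to a separate truncated.csv. If the program that produces the data keeps growing.csv open, it may keep writing to the old file after the swap, so this is probably best run while that program is not writing.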

answered Dec 13 '21 at 19:15

Holger Just


Thanks Holger. That sounds simple enough. Should this be set up as a maintenance cycle? I don't think it should run every minute. – Dec 13 '21 at 19:22
That depends on when, how, and how much you add data to the file. Depending on how you actually generate the data, there might even be more appropriate ways to reduce the size. – Holger Just Dec 13 '21 at 19:41
