
I have a growing.csv file that looks like this:

...
20211213 20:49:01,61826.0,61925.0,61928.0,1014.41
20211213 20:50:01,61839.0,62122.0,61928.0,1014.41
20211213 20:51:01,61901.0,62026.0,62035.0,1015.03
...

I'd like to trim this file to its latest ~10,000 lines/rows, since the program only uses the last ~7,500 rows. Is there a smart way to do this?

Holger Just

1 Answer


You can get the last 10,000 lines of a file with the tail command on your command line, e.g.

tail -n 10000 growing.csv > truncated.csv
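The command above leaves growing.csv itself untouched. To replace the original file with its trimmed version, one option is to write to a temporary file and then rename it over the original. The following is only a minimal sketch: the function name `trim_csv` and the temporary file name are made up for illustration, and it assumes the program that appends to the file reopens it for each write. A writer that keeps the file open would continue writing to the old, now unlinked, file after the rename, and rows appended between the `tail` and the `mv` would be lost.

```shell
#!/bin/sh
# trim_csv FILE [KEEP]  (hypothetical helper, not part of any tool)
# Keep only the last KEEP lines of FILE (default: 10000).
trim_csv() {
    file=$1
    keep=${2:-10000}
    tmp="$file.tmp.$$"   # temp file next to FILE so mv is a plain rename

    # Write the tail to the temp file first; only replace the original
    # if tail succeeded. Redirecting straight back into "$file" would
    # truncate it before tail gets to read it.
    tail -n "$keep" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Usage would be `trim_csv growing.csv 10000`.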

If you have headers or top-level comments you would like to keep, you can similarly use the head command to get those too:

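# write the first line (the header) of growing.csv to truncated.csv
# truncated.csv will be overwritten
head -n1 growing.csv > truncated.csv

# Now append the last 10k data lines to the newly
# created/truncated file. tail -n +2 skips the header, so it is not
# written twice when growing.csv has 10,000 lines or fewer.
tail -n +2 growing.csv | tail -n 10000 >> truncated.csv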
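Since the question says only the last ~7,500 rows are actually read, the file does not have to be trimmed on every append. A possible approach, sketched here under assumptions, is to let the file grow past the target and only rewrite it once it exceeds a higher limit. The function name `trim_if_needed` and the default limit of 12,500 lines are illustrative choices, not something from the question, and the sketch assumes a file without a header line and a writer that reopens the file for each append.

```shell
#!/bin/sh
# trim_if_needed FILE [KEEP] [LIMIT]  (hypothetical helper)
# Cut FILE down to its last KEEP lines, but only once it has grown
# beyond LIMIT lines. Defaults: KEEP=10000, LIMIT=12500 (assumed values).
trim_if_needed() {
    file=$1
    keep=${2:-10000}
    limit=${3:-12500}

    # Count lines; the arithmetic expansion strips any padding that
    # some wc implementations put around the number.
    lines=$(($(wc -l < "$file")))

    # Leave the file alone while it is still at or below the limit.
    [ "$lines" -gt "$limit" ] || return 0

    tmp="$file.tmp.$$"
    tail -n "$keep" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Called as `trim_if_needed growing.csv`, this could be run from a scheduler such as cron at whatever interval suits the rate at which rows are added. Most runs then only count lines and exit.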
Holger Just
  • Thanks Holger. That sounds simple enough. Should this be set up as a maintenance cycle? I don't think this should run every minute. –  Dec 13 '21 at 19:22
  • That depends on when, how, and how much you add data to the file. Depending on how you actually generate the data, there might even be more appropriate ways to reduce the size. – Holger Just Dec 13 '21 at 19:41