I have a 4 GB CSV file and I need to extract only the rows with a specific date (e.g. 31.12.2020) in the 'Date' column. I would rather not load the entire file into Jupyter, but instead import only the matching rows directly from local disk. Is this possible? Thank you.

Domec
  • Did you consider using a different format like hdf5 instead? – fr_andres Mar 03 '21 at 21:52
  • It may be possible but I wouldn't do this. Use a command line program like grep to get what you need into another smaller file. You can execute bash commands in Jupyter with %%bash in the cell. – wbg Mar 03 '21 at 21:56
  • A `csv` is a text file. The `numpy` and `pandas` loaders are designed to read the whole thing, though you can also specify line ranges (start, number, etc.). To get the lines with a specific value, you'd have to read the file line by line and check each one. In other words, such a file doesn't have an index that would quickly tell you which lines to load. – hpaulj Mar 03 '21 at 21:57
  • There's a method here for loading with chunks and concatenating. You could modify it to your needs. https://stackoverflow.com/questions/13651117/how-can-i-filter-lines-on-load-in-pandas-read-csv-function – wbg Mar 03 '21 at 22:09
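
The line-by-line idea from the comments could look roughly like the sketch below. It uses only the standard-library `csv` module, so only one row is held in memory at a time. The function name `rows_matching_date`, the comma delimiter, the UTF-8 encoding, and the assumption that the file has a header row containing a `Date` column are all illustrative guesses, not details given in the question.

```python
import csv


def rows_matching_date(path, target, column="Date", delimiter=","):
    """Yield each row (as a dict) whose `column` value equals `target`.

    Hypothetical helper: assumes the first line of the file is a header
    row and that dates are stored as plain strings such as "31.12.2020".
    """
    # newline="" is what the csv module recommends when opening files.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        for row in reader:
            # Plain string comparison; no date parsing is attempted.
            if row[column] == target:
                yield row
```

Calling `list(rows_matching_date("data.csv", "31.12.2020"))` (with `data.csv` standing in for the real path) would collect just the matching rows, and that list can then be passed to `pandas.DataFrame` if a DataFrame is wanted. The chunked approach from the linked question is the pandas counterpart: `pandas.read_csv` accepts a `chunksize` argument, so each chunk can be filtered before the pieces are concatenated.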

0 Answers