
I have a large parquet file that I can read into a pandas DataFrame with read_parquet(). However, I want to process the file chunk by chunk and then build the processed DataFrame from the chunks. Is there any way I can achieve this? read_csv with chunksize is not an option in my case. Thanks in advance.

  • This might help you - https://stackoverflow.com/questions/59098785/is-it-possible-to-read-parquet-files-in-chunks – YoungTim Nov 03 '21 at 16:56
  • Look at the Iteration section of the fastparquet Python library: https://fastparquet.readthedocs.io/en/latest/details.html#iteration – darked89 Nov 03 '21 at 22:29
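
Along the lines of the comments above, here is a minimal sketch of chunked processing using pyarrow's ParquetFile.iter_batches. The file name data.parquet, the batch size, and the filtering step are placeholders for illustration:

```python
import pandas as pd
import pyarrow.parquet as pq

# Open the parquet file lazily; nothing is read into memory yet.
pf = pq.ParquetFile("data.parquet")  # hypothetical file name

processed_chunks = []
# iter_batches yields pyarrow RecordBatches of roughly batch_size rows each.
for batch in pf.iter_batches(batch_size=100_000):
    chunk = batch.to_pandas()          # one chunk as a pandas DataFrame
    chunk = chunk[chunk["value"] > 0]  # placeholder step; "value" is a hypothetical column
    processed_chunks.append(chunk)

# Stitch the processed chunks back into a single DataFrame.
result = pd.concat(processed_chunks, ignore_index=True)
```

The fastparquet iteration mentioned in the second comment works similarly, except that it yields one pandas DataFrame per row group, so the chunk sizes are fixed by how the file was written:

```python
from fastparquet import ParquetFile

pf = ParquetFile("data.parquet")  # same hypothetical file name
for df in pf.iter_row_groups():
    pass  # process each row-group DataFrame here
```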

0 Answers