I have a large parquet file that I can read into a pandas DataFrame with read_parquet(). However, I want to process the file chunk by chunk and then build the processed DataFrame from the pieces. Is there any way I can achieve this? read_csv() with chunksize is not an option in my case. Thanks in advance.
- This might help you: https://stackoverflow.com/questions/59098785/is-it-possible-to-read-parquet-files-in-chunks – YoungTim Nov 03 '21 at 16:56
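The linked question points toward pyarrow, whose ParquetFile.iter_batches() yields record batches of a chosen size. A minimal sketch of that approach; the file name data.parquet and the batch size are placeholders:

```python
import pyarrow.parquet as pq
import pandas as pd

pf = pq.ParquetFile("data.parquet")  # placeholder path

processed = []
for batch in pf.iter_batches(batch_size=100_000):  # one RecordBatch per chunk
    chunk = batch.to_pandas()   # convert the chunk to a pandas DataFrame
    processed.append(chunk)     # replace with your per-chunk processing

result = pd.concat(processed, ignore_index=True)
```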
- Look at the Iteration section of the fastparquet Python library: https://fastparquet.readthedocs.io/en/latest/details.html#iteration – darked89 Nov 03 '21 at 22:29
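With fastparquet, iter_row_groups() yields one pandas DataFrame per row group of the file. A minimal sketch, again with data.parquet as a placeholder path:

```python
from fastparquet import ParquetFile
import pandas as pd

pf = ParquetFile("data.parquet")  # placeholder path

processed = []
for chunk in pf.iter_row_groups():  # one DataFrame per row group
    processed.append(chunk)         # replace with your per-chunk processing

result = pd.concat(processed, ignore_index=True)
```

Note that with this approach the chunk size is fixed by the row groups written into the file, so unlike pyarrow's iter_batches() it cannot be chosen freely at read time.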