For example, pandas's read_csv has a chunksize argument, which makes read_csv return an iterator over the CSV file so we can read it in chunks.
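To make the CSV case concrete, here is a minimal sketch of the behaviour I mean. The in-memory data and the chunk_lengths helper are made up for illustration; only read_csv and its chunksize parameter come from pandas.

```python
import io

import pandas as pd

# Hypothetical sample data: a header plus 10 rows, held in memory
# so the example does not depend on a file on disk.
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))


def chunk_lengths(text, chunksize):
    """Return the row count of each chunk yielded by read_csv."""
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of loading the whole file into one DataFrame.
    reader = pd.read_csv(io.StringIO(text), chunksize=chunksize)
    return [len(chunk) for chunk in reader]


print(chunk_lengths(csv_text, 4))  # -> [4, 4, 2]
```

Each chunk is an ordinary DataFrame, so only chunksize rows need to be held in memory at a time.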
The Parquet format stores the data in chunks (row groups), but there isn't a documented way to read it in chunks like read_csv.
Is there a way to read parquet files in chunks?