I'm working with a huge .7z file that I need to process line by line.
First I tried py7zr, but it only works by first decompressing the whole file into an object. This runs out of memory.
Then I tried libarchive, which is able to read block by block, but there's no straightforward way of splitting these binary blocks into lines.
What can I do?
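To make the goal concrete, here is a minimal sketch of the kind of helper I'm after. It assumes the blocks arrive as an iterable of `bytes` objects of arbitrary size and that the text is UTF-8 with `\n` or `\r\n` line endings. The name `iter_lines` is hypothetical, not part of py7zr or libarchive.

```python
from typing import Iterable, Iterator


def iter_lines(blocks: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Yield decoded lines from a stream of arbitrary-sized binary blocks.

    Hypothetical helper: only the unfinished tail of the current line is
    kept in memory between blocks, never the whole file.
    """
    pending = b""  # bytes of a line whose newline has not been seen yet
    for block in blocks:
        pending += block
        # Every piece except the last one ended with a newline, so it is a
        # complete line; the last piece is carried over to the next block.
        *complete, pending = pending.split(b"\n")
        for raw in complete:
            # Decode per line so a multi-byte character split across two
            # blocks is only decoded once it is complete.
            yield raw.rstrip(b"\r").decode(encoding)
    if pending:
        # The final line may not be terminated by a newline.
        yield pending.decode(encoding)


if __name__ == "__main__":
    # Simulated blocks whose boundaries fall in the middle of lines.
    for line in iter_lines([b"first li", b"ne\nsecond\r\nthi", b"rd"]):
        print(line)
```

With the libarchive-c bindings, I assume the blocks would come from something like `entry.get_blocks()` while iterating the entries of `libarchive.file_reader(path)`, but I have not verified that this is the best way to feed it.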
Related questions I researched first:
- How to read contents of 7z file using python: The answers only decompress the whole file.
- How to read from a text file compressed with 7z?: Seeks Python 2.7 answers.
- Python: How can I read a line from a compressed 7z file in Python?: Focuses on a single line and has no accepted answer; the only answer was posted 7 years ago.
I'm looking for ways to improve the temporary solution I built myself, which I posted as an answer here. Thanks!