To download the question-and-answer data, I am following the instructions in the facebook/ELI5 repository, which say to run:
python download_reddit_qalist.py -Q
Running this command fails at line number 70 of 'download_reddit_qalist.py', where the ZstdDecompressor stream is enumerated. The error log says:
zstd.ZstdError: Zstd decompress error: Frame requires too much memory for decoding
Thinking it was a memory issue, I allocated 32 GB of memory and 8 CPUs to the container, but the error persists.
When I replaced the enumerate() call with ElementTree.iterparse(), an additional traceback appeared alongside the same error:
for i, l in ET.iterparse(f):
File "/anaconda3/lib/python3.8/xml/etree/ElementTree.py", line 1229, in iterator
data = source.read(100 * 2048)
zstd.ZstdError: zstd decompress error: Frame requires too much memory for decoding
Has anyone faced a similar error? The Docker container is running on a Slurm cluster. Let me know if you need more information.