
To download the question and answer data, I am following the script in the facebook/ELI5 repository.

It says to run the command python download_reddit_qalist.py -Q. When I run this command, I get an error on line 70 of download_reddit_qalist.py, where the zstandard decompressor object is enumerated. The error log says:

zstd.ZstdError: Zstd decompress error: Frame requires too much memory for decoding

Thinking it was a memory issue, I allocated 32 GB of memory and 8 CPUs to the container, but the error persists.

When I replace the enumerate() call with ElementTree.iterparse(), the same error appears with additional traceback lines:

    for i, l in ET.iterparse(f):
  File "/anaconda3/lib/python3.8/xml/etree/ElementTree.py", line 1229, in iterator
    data = source.read(100 * 2048)
zstd.ZstdError: zstd decompress error: Frame requires too much memory for decoding

Has anyone faced a similar error? I have the Docker container running on a Slurm cluster. Let me know if you need more information.

Just because you got the code from a Facebook page does not mean this question is worth tagging `facebook` (removed). – CBroe Sep 21 '21 at 15:04

1 Answer


In case anyone faces this error in the future, the fix is to give the decompressor a larger maximum window. In download_reddit_qalist.py, on line 66, change the decompressor construction to:

ZstdDecompressor(max_window_size=2147483648)
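For context, here is a minimal sketch (not the actual download_reddit_qalist.py code) of reading a zstd-compressed Reddit dump with the zstandard package, passing max_window_size so frames with a large window do not raise "Frame requires too much memory for decoding". The file name and the JSON-lines format are assumptions based on how the Reddit dumps are typically distributed.

import io
import json
import zstandard as zstd

def read_zst_lines(path):
    # max_window_size=2147483648 (2 GiB) accepts frames whose window exceeds
    # the library's default limit, which is what triggers the
    # "Frame requires too much memory for decoding" error.
    dctx = zstd.ZstdDecompressor(max_window_size=2147483648)
    with open(path, "rb") as fh:
        reader = dctx.stream_reader(fh)
        text = io.TextIOWrapper(reader, encoding="utf-8")
        for line in text:
            yield json.loads(line)

# Example usage (hypothetical dump file name):
# for post in read_zst_lines("RS_2019-04.zst"):
#     print(post.get("id"))

Note that allocating more RAM to the container does not help here: the limit is enforced by the zstandard library itself, not by the amount of memory available.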
