
The Billion Laughs DoS attack seems preventable by simply stopping entities in XML files from being expanded. Is there a way to do this in Python's xlrd library (i.e. a flag of some sort)? If not, is there a recommended way to avoid the attack?

Cisplatin
    I haven't tried, but what happens if you load the example XML file (from the Wikipedia article) with xlrd? Does it handle it? –  Feb 18 '16 at 02:01
    [relevant link](https://pypi.python.org/pypi/defusedxml/) – roippi Feb 18 '16 at 02:16
    @LegoStormtroopr It stalls for a very, very long time. I haven't waited it out all the way but I'd expect it eventually results in an overflow. – Cisplatin Feb 18 '16 at 02:27

1 Answer


Not with xlrd by itself

There is currently no option in xlrd for preventing any sort of XML bomb. In the source code, the xlsx data is passed to Python's built-in xml.etree for parsing without any validation:

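import xml.etree.ElementTree as ET

# Excerpt: a method of the class in xlrd's source that reads the xlsx XML.
def process_stream(self, stream, heading=None):
    if self.verbosity >= 2 and heading is not None:
        fprintf(self.logfile, "\n=== %s ===\n", heading)
    # The stream goes straight to ElementTree; entities are not checked.
    self.tree = ET.parse(stream)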

However, it may be possible to patch ElementTree using defusedxml

As noted in the comments, defusedxml is a package targeted directly at the problem of security against different types of XML bombs. From the docs:

Instead of:

from xml.etree.ElementTree import parse
et = parse(xmlfile)

alter code to:

from defusedxml.ElementTree import parse
et = parse(xmlfile)
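For intuition, the kind of check that defusedxml applies can be approximated with the standard library alone. The sketch below is not defusedxml's actual code; it only illustrates the idea of refusing any document that declares an entity, using the `EntityDeclHandler` hook of `xml.parsers.expat`. The names `EntityDeclarationFound`, `assert_no_entities` and `SMALL_BOMB` are made up for this example, and the payload is a deliberately tiny two-level variant of the Billion Laughs document.

```python
import xml.parsers.expat


class EntityDeclarationFound(ValueError):
    """Raised when a document declares an entity (hypothetical name)."""


def assert_no_entities(xml_bytes):
    """Parse xml_bytes with expat; raise if any entity is declared."""
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Called by expat for every <!ENTITY ...> declaration it meets,
        # which happens before any reference to the entity is expanded.
        raise EntityDeclarationFound("entity declared: %r" % name)

    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk


# A harmless, two-level miniature of the Billion Laughs payload.
SMALL_BOMB = b"""<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;">
]>
<lolz>&lol2;</lolz>"""

if __name__ == "__main__":
    assert_no_entities(b"<root><a>1</a></root>")  # passes silently
    try:
        assert_no_entities(SMALL_BOMB)
    except EntityDeclarationFound as exc:
        print("rejected:", exc)
```

Rejecting the declaration outright is stricter than limiting expansion depth, but ordinary Excel files should have no reason to declare custom entities, so the stricter rule is probably acceptable here.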

defusedxml can also patch the standard library. Since xlrd parses with the standard library's ElementTree, you should be able to combine xlrd and defusedxml to read Excel files while protecting yourself from XML bombs.

Additionally the package has an untested function to monkey patch all stdlib modules with defusedxml.defuse_stdlib().
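If you would rather not rely on monkey patching, one alternative is to screen the file before it ever reaches xlrd. An .xlsx file is a zip archive of XML parts, so each part can be scanned for entity declarations first. The following is a minimal sketch under that assumption, using only the standard library; `xlsx_declares_entities` is a hypothetical helper, not part of xlrd or defusedxml, and it does not guard against other problems such as oversized zip members.

```python
import io
import xml.parsers.expat
import zipfile


class _EntityFound(Exception):
    """Internal signal used to stop parsing early (hypothetical name)."""


def _on_entity_decl(*args):
    # expat calls this for each <!ENTITY ...> declaration in a part.
    raise _EntityFound()


def xlsx_declares_entities(xlsx_bytes):
    """Return True if any XML part inside the .xlsx archive declares an entity.

    Note: each member is read fully into memory, so very large members
    would need an extra size check that this sketch leaves out.
    """
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".xml", ".rels")):
                continue  # skip images and other non-XML parts
            parser = xml.parsers.expat.ParserCreate()
            parser.EntityDeclHandler = _on_entity_decl
            try:
                parser.Parse(archive.read(name), True)
            except _EntityFound:
                return True
    return False
```

A caller could then refuse the upload when the function returns True and only otherwise pass the same bytes on to xlrd. Malformed XML raises `xml.parsers.expat.ExpatError`, which this sketch lets propagate so the caller can treat it as a rejection too.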

Darrick Herwehe
  • I tried defusedxml and it does not handle XML bombs at all. Maybe I am not using it correctly, but when I tested its basic functionality it only blocked XML containing entities, DTDs, etc. There is another Python package called defusedexpat that is supposed to handle XML bombs, but unfortunately it didn't work well either. – yaniv israel Jun 14 '18 at 14:40