According to "Creating a simple XML file using python", one of the simplest ways to generate an XML file in Python is to use Python's built-in ElementTree XML API.

However, the Python 3 documentation includes the following warning:

Warning: The xml.etree.ElementTree module is not secure against maliciously constructed data. If you need to parse untrusted or unauthenticated data see XML vulnerabilities.

I had planned on using the ElementTree library to construct XML requests with user-inputted attribute values. However, I am now concerned about the security of my application.

For example, my application has a logon() function with arguments for a user-inputted username and password. These values are then used as XML attributes.

import xml.etree.ElementTree as ET

def logon(username, password):
    # Create XML logon request for external webservice
    root = ET.Element("xml")
    body = ET.SubElement(root, "Logon")
    body.set("Username", username)
    body.set("Password", password)

    return ET.tostring(root, encoding="UTF-8", method="xml")

Why is xml.etree.ElementTree considered insecure? Is it safe to use with user-defined XML attribute values?

Stevoisiak

1 Answer

According to section 20.4.1, "XML vulnerabilities", of the Python documentation, xml.etree.ElementTree is vulnerable to the Billion Laughs attack and to the quadratic blowup attack.

billion laughs / exponential entity expansion

The Billion Laughs attack – also known as exponential entity expansion – uses multiple levels of nested entities. Each entity refers to another entity several times, and the final entity definition contains a small string. The exponential expansion results in several gigabytes of text and consumes lots of memory and CPU time.

quadratic blowup entity expansion

A quadratic blowup attack is similar to a Billion Laughs attack; it abuses entity expansion, too. Instead of nested entities it repeats one large entity with a couple of thousand chars over and over again. The attack isn’t as efficient as the exponential case but it avoids triggering parser countermeasures that forbid deeply-nested entities.
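To make the nested expansion concrete, here is a deliberately tiny and harmless sketch. The helper build_nested_entity_doc and its parameters are hypothetical names chosen for illustration, not part of any library. It only shows that the parser expands internal entities, so output size grows as fanout to the power of levels while the document itself stays short.

```python
import xml.etree.ElementTree as ET

def build_nested_entity_doc(levels, fanout, payload="ha"):
    """Build a small XML document whose entities nest `levels` deep.

    Hypothetical helper for illustration only. Keep the numbers tiny:
    the expanded size is len(payload) * fanout ** levels.
    """
    decls = ['<!ENTITY e0 "%s">' % payload]
    for i in range(1, levels + 1):
        # Each entity refers to the previous one `fanout` times
        refs = ("&e%d;" % (i - 1)) * fanout
        decls.append('<!ENTITY e%d "%s">' % (i, refs))
    return "<!DOCTYPE r [%s]><r>&e%d;</r>" % ("".join(decls), levels)

doc = build_nested_entity_doc(levels=2, fanout=10)
expanded = ET.fromstring(doc).text

# The parsed text is much longer than the "ha" payload that was declared
print(len(doc), len(expanded))
```

With levels=2 and fanout=10 the two-character payload expands to 200 characters. A real attack uses the same shape with more levels, which is why the parser should not be pointed at untrusted input.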

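Both attacks target the parser. As long as you don't parse maliciously crafted XML, you are safe. Building a request with ET.Element and serializing it with ET.tostring, as your logon() function does, never parses the user's input, and ElementTree escapes special characters in attribute values when it serializes them.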
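As a quick check of the generation side, the sketch below reuses the logon() function from the question and feeds it a hypothetical hostile username that tries to close the attribute early. The input strings are made up for illustration. Parsing the result back is safe here because the document was generated locally.

```python
import xml.etree.ElementTree as ET

def logon(username, password):
    # Same construction as the logon() function in the question
    root = ET.Element("xml")
    body = ET.SubElement(root, "Logon")
    body.set("Username", username)
    body.set("Password", password)
    return ET.tostring(root, encoding="UTF-8", method="xml")

# Hypothetical hostile input that tries to inject an extra attribute
hostile = 'bob" Admin="true'
request = logon(hostile, "<secret & more>")
print(request.decode("utf-8"))

# Round-trip our own output to see what a receiver would read
parsed = ET.fromstring(request).find("Logon")
print(parsed.attrib)
```

The quote, angle brackets, and ampersand are written as character references, so the receiver sees the original strings as plain attribute values and no extra Admin attribute appears. This says nothing about how the external web service treats those values once it has them.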

Ortomala Lokni