48

Are there any known ways for ast.literal_eval(node_or_string)'s evaluation to not actually be safe?

If yes, are patches available for them?

(I already know about PyPy[sandbox], which is presumably more secure, but unless the answers are "yes" and then "no", my needs are minor enough that I won't be going that far.)

2 Answers

91

The documentation states it is safe, and there is no security-related bug filed against literal_eval in the bug tracker, so you can probably assume it is safe.

Also, according to the source, literal_eval parses the string into a Python AST (abstract syntax tree) and returns a value only if the tree is a literal. The code is never executed, only parsed, so there is no reason for it to be a security risk.
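A minimal sketch of that behaviour, assuming Python 3 and only the standard library. The input strings are made up for illustration: a literal is converted to a value, while an expression containing a call is rejected with `ValueError` rather than run.

```python
import ast

# A plain literal is parsed and converted to the corresponding Python value.
value = ast.literal_eval("{'a': [1, 2.5, None], 'b': (True, 'x')}")

# Anything beyond a literal (here, a function call) is rejected with
# ValueError; the call itself is never made.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(value)
print("call rejected:", rejected)
```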

madjar
  • 8
    +1 The reason there aren't more answers here is that nothing more needs to be said. – Petr Viktorin Oct 09 '11 at 18:52
  • 1
    Well, it's always difficult to prove that there is no risk, but the fact the code is never actually executed should help to convince that there is not much risk. – madjar Oct 10 '11 at 10:58
  • 2
    The risk is about the same as using Python itself. – Petr Viktorin Oct 10 '11 at 11:49
  • Unfortunately, I would like to use `ast.literal_eval()` to filter an input before passing it to `eval()` or `exec()`, which always represents a risk. But in fact, the source code seems to show that the input is filtered pretty strictly. I just hope that I did not miss an edge case... – Adrien Plisson Oct 11 '11 at 12:51
  • 11
    If the input is a literal, literal_eval() will return the value. If the input is more than a literal (it contains code), then literal_eval() will fail, and there would be a risk in executing the code. In both cases, literal_eval() does the job. Why do you want to use eval() or exec() after that? – madjar Oct 11 '11 at 12:57
  • @madjar: Oops, sorry, no, I do not call `exec()` or `eval()` anymore. I did before, until I discovered `ast.literal_eval()`; I just mixed it up in my head... – Adrien Plisson Oct 11 '11 at 16:30
14
>>> import ast
>>> code = '()' * 1000000
>>> ast.literal_eval(code)
[1]    3061 segmentation fault (core dumped)  python2

An input of this size, or possibly a smaller one, will crash Python 2 with SIGSEGV. It might be exploitable under some conditions. This particular bug has been fixed in Python 3, but other bugs may still exist in the AST parser.

  • You are using an operation in the argument to `literal_eval` (so it is not a string or node), and that has nothing to do with `literal_eval`. – Prodipta Ghosh Feb 19 '19 at 10:21
  • 3
    @ProdiptaGhosh it is a string. There is a very good reason why I didn't expand those **million** parentheses in this answer! – Antti Haapala -- Слава Україні Feb 19 '19 at 10:31
  • The point is, you are first evaluating an expression (the string multiplied a gazillion times is an expression, **not** a string) **before** you call `literal_eval`, and that string expansion has **nothing** to do with `literal_eval` whatsoever. If things go right, it gets the expanded string. If they go wrong, Python crashes even before `literal_eval` is called. – Prodipta Ghosh Feb 19 '19 at 10:49
  • 1
    OK, this makes things much clearer. This seems a valid point. It has less to do with `literal_eval` itself than with the underlying `parse` and then `compile` calls, which segfault on exceeding the maximum recursion limit. I have reversed my vote. This seems to be an [open issue](https://bugs.python.org/issue32758) for later versions as well – Prodipta Ghosh Feb 19 '19 at 11:33
  • Seems a fair point, but assuming you're using the `literal_eval(data)` function, can't you just put an `if len(data) < 10000:` check right before it to avoid this issue? – Welgriv Jan 20 '21 at 10:23
  • In 3.8, I find that even a small input can cause `MemoryError` (*not* `RecursionError`). The limit appears to be 100 nested parentheses; I assume it is limited explicitly and specially. – Karl Knechtel Sep 07 '22 at 03:35
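The length check and the exceptions mentioned in the comments above could be combined roughly as follows. This is only a sketch, assuming Python 3: `guarded_literal_eval` and `MAX_LEN` are hypothetical names, the 10,000-character cap is simply the figure from the comment and has not been verified to be low enough to prevent a crash on every interpreter version, and the exception list follows what the `ast` documentation says malformed input may raise.

```python
import ast

MAX_LEN = 10_000  # hypothetical cap taken from the comment above; tune as needed


def guarded_literal_eval(data, max_len=MAX_LEN):
    """Return (True, value) for an accepted literal, else (False, None)."""
    # Refuse oversized input before it ever reaches the parser.
    if len(data) >= max_len:
        return False, None
    try:
        return True, ast.literal_eval(data)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        # Non-literal, malformed, or too deeply nested input.
        return False, None


print(guarded_literal_eval("[1, 2, 3]"))
print(guarded_literal_eval("(" * 1000 + ")" * 1000))
```

A length cap only reduces the attack surface; it does not turn a crash inside the interpreter into a catchable exception, so on an affected version a short but deeply nested input could still be a problem.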