How can I make this code safer? It is a minimal reproducible example of more complex code in which internal users are allowed read access to a few dictionaries whose names are known in advance. The example works as intended with eval, and it blocks some malicious user input, such as a system call to rm -rf /. But I am looking for a safer method than eval.

import re

# In the minimal example, I have 2 dicts that the users need 
# read access to, and keys match this regex: ^\w+$
dct_a = {'foo': 1, 'bar': 2}
dct_b = {'baz': 3, 'bletch': 4}

# User input, e.g.:
lst = ["dct_a['foo']", "dct_b['baz']"]

for item in lst:
    # Make safer, prevent a few obvious hacks:
    if not re.findall(r"^[\w\]\[']+$", item):
        raise Exception(f'Unsafe item: {item}')
    # do something with item, e.g.:
    print(eval(item))

# Prints:
# 1
# 3

I am aware that eval is dangerous, no need to repeat the warnings.

RELATED:
Python: make eval safe

Timur Shtatland
  • Also related: https://stackoverflow.com/q/3068139/3001761 – jonrsharpe Mar 10 '21 at 15:23
  • If it's only about accessing keys in defined dicts, you could create your own mini-language which only allows dict key access. E.g. `d, k = 'dct_a.foo'.split('.'); {'dct_a': dct_a, ...}[d][k]`… – deceze Mar 10 '21 at 15:24
  • I think it would be better to alter the requirements of the program so `eval` isn't needed. Blacklists just lead to "arms races" with attackers. A simple parser that does specifically what you need would be cleaner in the long run, if your problem can't be reduced down to simply accessing dictionaries normally. – Carcigenicate Mar 10 '21 at 15:24
  • @deceze: In the simplest case, this is all that's needed. If you expand the comment into an answer, I will be happy to accept it! – Timur Shtatland Mar 10 '21 at 16:04
  • @Carcigenicate: That's my question precisely, how to replace the `eval` with the (hopefully not too complex) parser (or anything else safer than `eval`)? – Timur Shtatland Mar 10 '21 at 16:10

2 Answers

Instead of starting with the full power of Python and trying to rein it in (which is a fool's errand), start with only what you need, which doesn't allow the user to do anything else because it can't do anything else. You don't want your user to be able to run Python code, you want your user to be able to specify "variable values" in some very limited fashion. So, for example:

vals = {
    'a': {'foo': 1, 'bar': 2},
    'b': {'baz': 3, 'bletch': 4}
}

# User input, e.g.:
lst = ["a.foo", "b.baz"]

for item in lst:
    dct, k = item.split('.')
    print(vals[dct][k])

If that's all your users need to be able to do, that's all you need. From here you can of course start to write your own mini-language for more and more powerful, yet still very limited and restricted expressions. There are libraries like pyparsing for this purpose.
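As a minimal sketch of where such a mini-language could start, the lookup above can be wrapped in a function that checks the shape of the input before touching any dictionary. The names `lookup` and `PATTERN` are hypothetical, introduced only for illustration, and the `name.key` syntax is the one assumed in the example above:

```python
import re

vals = {
    'a': {'foo': 1, 'bar': 2},
    'b': {'baz': 3, 'bletch': 4},
}

# Hypothetical grammar: exactly "<dict name>.<key>", both made of word characters.
PATTERN = re.compile(r'(\w+)\.(\w+)')

def lookup(item, table=vals):
    """Resolve 'name.key' against table and reject anything else."""
    match = PATTERN.fullmatch(item)
    if match is None:
        raise ValueError(f'Malformed item: {item!r}')
    name, key = match.groups()
    try:
        return table[name][key]
    except KeyError:
        # Unknown dict name or unknown key: report it as bad input.
        raise ValueError(f'Unknown item: {item!r}') from None

print(lookup('a.foo'))  # 1
print(lookup('b.baz'))  # 3
```

Because the input is only ever used as dictionary keys, nothing the user types is executed; malformed or unknown items simply raise `ValueError`.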

deceze

To access variables more safely, you can use the vars() function, together with a regex to add more functionality to it:

import re

dct_a = {'foo': 1, 'bar': 2}
dct_b = {'baz': 3, 'bletch': 4}

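for _ in range(5):
    # Read the command outside the try block so cmd is always defined below.
    cmd = input(">>> ")
    try:
        # Match name['key'] or name["key"]; the quotes are optional.
        cmds = re.findall(r'(\w+)\[\'?\"?(\w+)\'?\"?\]', cmd)
        if cmds:
            name, key = cmds[0]
            print(vars()[name][key])
        else:
            # No subscript: look up the bare variable name.
            print(vars()[cmd])
    except Exception:
        print(cmd, 'Not in Scope')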

vars() returns a dictionary that maps each variable name in the current scope to its value. Here vars() is called without arguments, so it is equivalent to locals(), which could be used instead.

globals() is a similar function, but as the name suggests, it gives access only to global variables.

Output:

>>> dct_a
{'foo': 1, 'bar': 2}

>>> dct_a['bar']
2

>>> dct_b
{'baz': 3, 'bletch': 4}

>>> dct_b["baz"]
3

>>> dct_c
dct_c Not in Scope
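
Note that at module level vars() exposes every name in scope (for example re and cmd), not only the two dictionaries. A hedged variation, assuming only dct_a and dct_b should be readable, replaces vars() with an explicit allowlist. The names `ALLOWED`, `PATTERN`, and `read` are hypothetical and not part of the code above:

```python
import re

dct_a = {'foo': 1, 'bar': 2}
dct_b = {'baz': 3, 'bletch': 4}

# Only these names are visible to the user (assumed allowlist).
ALLOWED = {'dct_a': dct_a, 'dct_b': dct_b}

# Accepts name, name['key'] or name["key"]; the two quotes must match.
PATTERN = re.compile(r"""(\w+)(?:\[(['"])(\w+)\2\])?""")

def read(cmd):
    """Return the allowlisted dict, or one of its values, named by cmd."""
    match = PATTERN.fullmatch(cmd.strip())
    if match is None or match.group(1) not in ALLOWED:
        raise KeyError(f'{cmd} Not in Scope')
    name, _, key = match.groups()
    target = ALLOWED[name]
    # key is None when the user typed a bare dictionary name.
    return target if key is None else target[key]

print(read("dct_a['bar']"))  # 2
print(read('dct_b["baz"]'))  # 3
print(read('dct_b'))         # {'baz': 3, 'bletch': 4}
```

With this version, a request such as `re` or `dct_c` raises `KeyError` instead of returning whatever happens to be in scope.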
Rishabh Kumar