I am working on an interface for a simulator, aimed at people who prefer the command line to a GUI. To give the simulator the levels, the user types the information into a file, which is then parsed to generate design points that are sent to a main server.
I would like to implement some sort of "range" feature so that the user does not need to type out all of the individual levels. It needs to be more powerful than a simple additive sequence. Since the parser and related code are already in Python, this seems like a perfect use case for list comprehensions. However, the list comprehension is user input and is not guaranteed to be valid. Using eval seems too dangerous, and literal_eval does not support list comprehensions.
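To show the literal_eval limitation concretely, here is a minimal sketch. The helper name try_literal is only illustrative and not part of my existing parser:

```python
import ast

def try_literal(text):
    """Return (True, value) if text is a plain Python literal, else (False, None)."""
    try:
        return True, ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # literal_eval only accepts literals (numbers, strings, lists, dicts, ...),
        # so a list comprehension is rejected with ValueError.
        return False, None

print(try_literal("[1, 2, 3, 7, 8]"))
print(try_literal("[2**x for x in range(5,20) if (x % 3) == 0]"))
```

The first call succeeds because the text is a plain list literal; the second is rejected because a comprehension is not a literal.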
My current goal is for something like this to be valid and safe:
{"Factor 1": [1,2,3,7,8],
"Factor 2": "[2**x for x in range(5,20) if (x % 3) == 0]"}
The base format for the files that the user types is JSON. I am looking to extend the language with additional features (like range) to fill various user needs. "Factor 1" can already be parsed by the existing system. The list comprehension will be evaluated on the user's own machine, so simple attacks like 'x'*9**999999**99999 are self-destructive.
It seems relatively easy to sanitize the range function using a regex, but I'm not sure how to make sure that the other parts are safe. Are regexes sufficient for this task, or is there another approach I should be following?
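For reference, this is roughly what I mean by sanitizing range with a regex. It is only a sketch: the pattern and the name is_plain_range are hypothetical, and it assumes range is only ever called with one to three integer literals.

```python
import re

# Hypothetical pattern: range(int[, int[, int]]) with optional minus signs and whitespace.
RANGE_CALL = re.compile(r"range\(\s*-?\d+\s*(?:,\s*-?\d+\s*){0,2}\)")

def is_plain_range(text):
    """Return True if text is exactly a range(...) call on integer literals."""
    return RANGE_CALL.fullmatch(text.strip()) is not None

print(is_plain_range("range(5,20)"))              # plain integers only
print(is_plain_range("range(__import__('os'))"))  # anything else is rejected
```

This only covers the range call itself. The rest of the comprehension (the 2**x expression and the if clause) is not checked at all, which is the part I am unsure about.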