I realise this question has been asked before, however this case is slightly different.

I want to run a Python imageboard (using web.py) that will allow users to generate new images by submitting code. The code will take the form of a single function that takes the x,y coordinates of a pixel and returns the r,g,b values, e.g.:

def simpleGradient(xrel,yrel):
    r = xrel*256
    g = yrel*256
    b = 0
    return [r,g,b]

Only a very small syntax is required, and it doesn't necessarily have to be Python. Using exec with a limited scope seems to be too insecure, and using PyPy or a VM seems unnecessarily complex (I'm quite new to all this).

Rather than sandboxing it, is there a Pythonic way to execute the code in a much smaller language? Either a subset of Python (parsing and whitelisting?), or a math-oriented language that I can embed?

SudoNhim
  • I actually would use a PyPy sandbox. – Jonathan Leonard May 18 '12 at 05:51
  • Several other answers I read voted against it... So I haven't really looked into PyPy - I'll check it out thanks – SudoNhim May 18 '12 at 05:54
  • Great question, maybe PyPy is the answer. Was just talking today about how Python might be kinda' short here, compared to say lua. – Skylar Saveland May 18 '12 at 06:06
  • If you have the time, I think it would be fun to roll your own using python's internal compiler: http://stackoverflow.com/questions/594266/equation-parsing-in-python – arifwn May 18 '12 at 06:08
  • Wow... I was considering constructing my own language (currently writing a PL0 compiler for a uni assignment), but this way could be a lot more fun! – SudoNhim May 18 '12 at 06:13

2 Answers

This is the solution I went with. For a discussion of the security of this approach, see the follow-up question linked in the comments below.

Thanks to arifwn, I started exploring Python's ast (abstract syntax tree) module, which provides the class ast.NodeVisitor for traversing the tree. The code below subclasses NodeVisitor to create a syntax checker that whitelists the node types needed for basic math. Function calls and names get special handling: only certain functions may be called, and only names that are not already in use may be introduced.

import ast

allowed_functions = set([
    #math library
    'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh',
    'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf',
    'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod',
    'frexp', 'fsum', 'gamma', 'hypot', 'isinf', 'isnan', 'ldexp',
    'lgamma', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians',
    'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc',
    #builtins
    'abs', 'max', 'min', 'range', 'xrange'
    ])

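allowed_node_types = set([
    #Meta
    'Module', 'Assign', 'Expr',
    #Control (else branches live in the orelse field of For/If; there is no 'Else' node)
    'For', 'If',
    #Data ('Index' wraps simple subscripts in the Python 2 AST)
    'Store', 'Load', 'AugAssign', 'Subscript', 'Index',
    #Datatypes
    'Num', 'Tuple', 'List',
    #Operations
    'BinOp', 'Add', 'Sub', 'Mult', 'Div', 'Mod', 'Compare',
    #Comparison operators (children of Compare, so they must be whitelisted too)
    'Eq', 'NotEq', 'Lt', 'LtE', 'Gt', 'GtE'
    ])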

safe_names = set([
    'True', 'False', 'None'
    ])


class SyntaxChecker(ast.NodeVisitor):

    def check(self, syntax):
        tree = ast.parse(syntax)
        self.passed=True
        self.visit(tree)

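    def visit_Call(self, node):
        # Only direct calls to a plain name are allowed; something like
        # obj.method() has no func.id and is rejected outright.
        func_name = getattr(node.func, 'id', None)
        if func_name is None:
            raise SyntaxError("Only calls to plain function names are allowed!")
        if func_name not in allowed_functions:
            raise SyntaxError("%s is not an allowed function!"%func_name)
        else:
            ast.NodeVisitor.generic_visit(self, node)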

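    def visit_Name(self, node):
        # node.id is always a plain identifier, so eval() here only looks the
        # name up to see whether it is already defined in the checker's scope.
        try:
            eval(node.id)
        except NameError:
            # Unused name: the user may introduce it.
            ast.NodeVisitor.generic_visit(self, node)
        else:
            # Already-defined name (builtin or module global): whitelist only.
            if node.id not in safe_names and node.id not in allowed_functions:
                raise SyntaxError("%s is a reserved name!"%node.id)
            else:
                ast.NodeVisitor.generic_visit(self, node)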

    def generic_visit(self, node):
        if type(node).__name__ not in allowed_node_types:
            raise SyntaxError("%s is not allowed!"%type(node).__name__)
        else:
            ast.NodeVisitor.generic_visit(self, node)

if __name__ == '__main__':
    x = SyntaxChecker()
    while True:
        try:
            x.check(raw_input())
        except Exception as e:
            print e

Note that this is designed to accept only the mathematical part of the code; the function definition and return statement are provided separately.
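
As a rough illustration of how the provided function definition and return statement might be wrapped around a checked body, here is a minimal Python 3 sketch. The template, the name build_pixel_function, and the particular math names exposed are assumptions made for illustration and are not part of the checker above. An emptied __builtins__ is not a sandbox on its own, so the body is assumed to have passed the syntax check first.

```python
import math
import textwrap

# Hypothetical template: the site supplies the signature and the return statement.
TEMPLATE = "def pixel(xrel, yrel):\n{body}\n    return [r, g, b]\n"


def build_pixel_function(body):
    """Wrap already-checked user statements in the fixed definition and return the function."""
    indented = textwrap.indent(textwrap.dedent(body), "    ")
    source = TEMPLATE.format(body=indented)
    # Expose only a few whitelisted names to the generated code (illustrative selection).
    namespace = {name: getattr(math, name) for name in ("sin", "cos", "sqrt", "floor", "pi")}
    namespace.update({"abs": abs, "min": min, "max": max, "range": range})
    # Hide the real builtins so names such as open() do not resolve.
    namespace["__builtins__"] = {}
    exec(compile(source, "<user code>", "exec"), namespace)
    return namespace["pixel"]


# The body of the gradient example from the question.
body = """
r = xrel*256
g = yrel*256
b = 0
"""
pixel = build_pixel_function(body)
print(pixel(0.5, 0.25))  # [128.0, 64.0, 0]
```

Keeping the namespace in a fresh dictionary per submission also means one user's variables never leak into another's.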

This method of whitelisting all the required safe constructs, and specifically whitelisting the required unsafe ones, could be modified to produce many useful subsets of Python; excellent for user scripts!

Note that to execute this securely, the code should run in its own thread with a timeout, both to reduce name collisions and to time out if the user code enters an infinite loop or similar.
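
One possible shape for the thread-with-timeout idea is sketched below in Python 3. The helper name run_with_timeout and its return convention are assumptions for illustration. Python threads cannot be forcibly stopped, so this sketch only stops waiting for a runaway worker; actually terminating the code would need something like a separate process.

```python
import threading
import time


def run_with_timeout(func, args=(), timeout=1.0):
    """Run func(*args) in a worker thread; return (finished, result)."""
    outcome = {}

    def worker():
        try:
            outcome["result"] = func(*args)
        except Exception as exc:
            # Hand errors from the user code back to the caller.
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.daemon = True  # a stuck worker must not keep the interpreter alive
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # Timed out: the worker is abandoned, not killed.
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["result"]


# A fast pixel function finishes; time.sleep stands in for runaway user code.
print(run_with_timeout(lambda x, y: [x * 256, y * 256, 0], (0.5, 0.25)))  # (True, [128.0, 64.0, 0])
print(run_with_timeout(time.sleep, (5,), timeout=0.1))  # (False, None)
```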

SudoNhim
  • This should be a standalone question IMO. – TryPyPy May 18 '12 at 20:30
  • Sorry. I know someone who can hopefully check it over for me; if he finds it OK I'll reformat it to be more like an answer. (Else I'll delete it). – SudoNhim May 18 '12 at 22:56
  • What I meant was: you'll get more eyes (and probably new suggestions) if you convert this answer into a new question :) – TryPyPy May 18 '12 at 22:58
  • Moved to http://stackoverflow.com/questions/10661079/restricting-pythons-syntax-to-execute-user-code-safely-is-this-a-safe-approach – SudoNhim May 18 '12 at 23:54

There is a lot of great information on the pysandbox PyPI page.

Skylar Saveland