Over at this question there are some neat tricks for generating functions on the fly in Python.

In my use case, however, I need to make sure the generated function has a particular name and particularly-named arguments as well. I'll give an example.

Suppose I want to parse a YAML file that has a format like this:

Root:
    my_idea:
        type: conditional
        conditionals:
            - foo
            - bar

        query: >
            SELECT * FROM some_table

This needs to be interpreted as: create a function for me called "my_idea" which has arguments named "foo" and "bar" (in that order).

When the function is called, use some existing tool to connect to a database and prepare the query element from the YAML file. Then add a WHERE clause that matches each conditional name to the value passed in to the function for the corresponding argument.

So, after this happens, a user should be able to call:

my_idea(10, 20)

and it would be equivalent to executing the query

SELECT * FROM some_table WHERE foo = 10 AND bar = 20

If I used def to make the function, it might be something like this:

def my_idea(arg1, arg2):
    query = (query_retrieved_from_file + 
             " WHERE {}={} AND {}={}".format(arg1_name_from_file, 
                                             arg1, 
                                             arg2_name_from_file, 
                                             arg2))

    connection = ExistingLibraryConnectionMaker()
    return connection.execute(query).fetchall()

This is a really simplified example -- I'm not endorsing the specifics of this little function, just trying to illustrate the idea.
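
To make the WHERE-clause step concrete, here is a minimal sketch that pairs the conditional names with the values passed in. The helper name `build_query` is made up for illustration, and the `?` placeholders assume a driver that uses the "qmark" parameter style (sqlite3 does); another driver may need a different placeholder. Binding the values separately avoids formatting them straight into the SQL string.

```python
def build_query(base_query, names, values):
    """Append a WHERE clause with one placeholder per conditional name.

    Hypothetical helper: `names` come from the YAML file, `values` from
    the arguments of the generated function.
    """
    if len(names) != len(values):
        raise ValueError("expected one value per conditional name")
    # Column names are taken from the (trusted) file; the values are
    # returned separately so the database driver can bind them.
    clause = " AND ".join("{} = ?".format(name) for name in names)
    return "{} WHERE {}".format(base_query.strip(), clause), tuple(values)


sql, params = build_query("SELECT * FROM some_table\n", ["foo", "bar"], [10, 20])
print(sql)     # SELECT * FROM some_table WHERE foo = ? AND bar = ?
print(params)  # (10, 20)
```

The resulting pair could then be handed to something like `connection.execute(sql, params)`, assuming the existing library accepts bound parameters that way.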

The question is: how can I create this on the fly, with the name of the function and the names of the positional arguments extracted from a text file?

In the other question, there is some example code:

import types
def create_function(name, args):

    def y(): pass

    y_code = types.CodeType(args,
                            y.func_code.co_nlocals,
                            y.func_code.co_stacksize,
                            y.func_code.co_flags,
                            y.func_code.co_code,
                            y.func_code.co_consts,
                            y.func_code.co_names,
                            y.func_code.co_varnames,
                            y.func_code.co_filename,
                            name,
                            y.func_code.co_firstlineno,
                            y.func_code.co_lnotab)

    return types.FunctionType(y_code, y.func_globals, name)

but it's not clear how to give the positional arguments the names I want them to have.

The other solution I found was like this:

import types
import sys,imp

code = """def f(a,b,c):
    print a+b+c, "really WoW"
"""
module = imp.new_module('myfunctions')
exec code in module.__dict__
module.f('W', 'o', 'W')

Output:

WoW really WoW

This is much closer, but requires all of the code to be embedded in a string format. I'm looking to build up a function programmatically, and across a reasonably large set of options, so handling them all deep in strings is not doable.
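
One way to keep almost everything out of strings, sketched here for Python 3 under some assumptions: generate only a one-line wrapper with the desired name and argument names, and have it forward to an ordinary Python callable. The names `make_function`, `_impl`, and `run_query` are invented for this sketch and do not come from either linked solution.

```python
import keyword


def make_function(name, arg_names, impl):
    """Create a function called `name` with positional args `arg_names`.

    Only the thin wrapper is generated as text. `impl` is a regular
    callable that receives a dict mapping argument names to values.
    """
    for ident in [name] + list(arg_names):
        # Refuse anything that is not a plain identifier before exec'ing it.
        if not ident.isidentifier() or keyword.iskeyword(ident):
            raise ValueError("not a valid identifier: {!r}".format(ident))
    params = ", ".join(arg_names)
    pairs = ", ".join("{0!r}: {0}".format(a) for a in arg_names)
    source = "def {}({}):\n    return _impl({{{}}})\n".format(name, params, pairs)
    namespace = {"_impl": impl}
    exec(source, namespace)
    return namespace[name]


def run_query(bound):
    # Stand-in for the real database call (hypothetical).
    return " AND ".join("{} = {}".format(k, v) for k, v in bound.items())


my_idea = make_function("my_idea", ["foo", "bar"], run_query)
print(my_idea.__name__)     # my_idea
print(my_idea(10, bar=20))  # foo = 10 AND bar = 20
```

The wrapper is a real function, so its name and argument names show up in tracebacks and introspection, while the logic that varies across options stays in normal code.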

  • you're creating a basic compiler – Jules G.M. Mar 20 '14 at 21:46
  • I would look into https://wiki.python.org/moin/LanguageParsing to find a good parser/translator library. – Jules G.M. Mar 20 '14 at 21:55
  • Does it have to be a function? Why not make a callable class – wim Mar 20 '14 at 21:58
  • That's a good point. As long as the arguments can be handled appropriately, it would be fine. `type` is easier than all of this `types.FunctionType` stuff... but I still can't see how I could get the names of the arguments. – ely Mar 20 '14 at 22:07
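
A rough sketch of the callable-class idea from the comments, assuming Python 3: the argument names can be attached with `inspect.Signature`, and `Signature.bind` then maps positional and keyword arguments onto those names. The class name `NamedArgsCallable` and the `impl` callback are hypothetical.

```python
import inspect


class NamedArgsCallable(object):
    """Callable whose accepted argument names are decided at runtime."""

    def __init__(self, name, arg_names, impl):
        self.__name__ = name
        self._impl = impl
        # inspect.signature() reports this instead of (*args, **kwargs).
        self.__signature__ = inspect.Signature(
            [inspect.Parameter(a, inspect.Parameter.POSITIONAL_OR_KEYWORD)
             for a in arg_names])

    def __call__(self, *args, **kwargs):
        # bind() raises TypeError for missing, extra, or misnamed arguments.
        bound = self.__signature__.bind(*args, **kwargs)
        return self._impl(dict(bound.arguments))


my_idea = NamedArgsCallable("my_idea", ["foo", "bar"], lambda bound: bound)
print(my_idea(10, bar=20))         # {'foo': 10, 'bar': 20}
print(inspect.signature(my_idea))  # (foo, bar)
```

No source text is generated at all here, at the cost of the object being an instance rather than a true function.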

1 Answer

Here is an LL parser example, made for fun.

It generates this code:

def my_idea(arg0,arg1,atasdasd):
    query = "SELECT * FROM some_table WHERE foo==arg0 AND bar>arg1"
    connection = ExistingLibraryConnectionMaker()
    return connection.execute(query).fetchall()

def lool(hihi,ahah):
    query = "SELECT * FROM some_table WHERE foo<=hihi AND bar<ahah"
    connection = ExistingLibraryConnectionMaker()
    return connection.execute(query).fetchall()
###end

from this input:

Root:
    my_idea:
        args :
            -arg0
            -arg1
            -atasdasd

        type: conditional
        
        conditional:
            -foo == arg0
            -bar > arg1

        query: 
            SELECT * FROM some_table

    lool:
        args :
            -hihi
            -ahah

        type: conditional
    
        conditional:
            - foo <= hihi
            - bar < ahah

        query: 
            SELECT * FROM some_table

It can handle any number of functions.

Code:

from __future__ import print_function
import re
import traceback
import sys

glIndex = 0
code = ""

class Token(object):
    def __init__(self, pattern, name, value=None, transform=None):

        self.pattern = pattern
        self.name = name
        self.value = value

tokens = {
    Token(r"(\()","lpar"), 
    Token(r"(\))","rpar"), 
    Token(r"(\d+(?:\.\d*)?)","number"), 
    Token(r"(\+)", "plus"), 
    Token(r"(?!\-\-)(\-)","minus"), 
    Token(r"(?!\=\=)(\=)","egal"), 
    Token(r"(;)","colon"), 
    Token(r"([a-zA-Z][a-zA-Z0-9_]*)(?=[\s\:])","unique"), 
    Token(r"(\=)\=","=="), 
    Token(r"(\>\=)",">="), 
    Token(r"(\<\=)","<="), 
    Token(r"(?!\>\=)(\>)",">"), 
    Token(r"(?!\<\=)(\<)","<"), 
    Token(r"\:",":"),
    Token(r"\*","*")}

def peekComp(l):
    symbol = None
    if peekMatch(l,">=") :
        symbol = ">="
    elif peekMatch(l,"<=") :
        symbol = "<="
    elif peekMatch(l,">") :
        symbol = ">"
    elif peekMatch(l,"<") :
        symbol = "<"
    elif peekMatch(l,"==") :
        symbol = "=="
    return symbol

def parseListItem(l):
    match(l,"minus")
    u = match(l,"unique")
    return u



def parseTitle(l):
    val = match(l,"unique")
    match(l,":")
    return val

def parseComp(l):
    match(l,"minus")
    lvar = match(l,"unique")
    symbol = peekComp(l)
    if symbol is None:
        print("Homemade SyntaxError: needed a comp symbol")
        exit(1)
    symbolS = match(l,symbol)
    rvar = match(l,"unique")
    return (lvar,symbolS,rvar)

def tokenize(s):
    l=[]
    i=0
    while i < len(s):
        if re.match(r"\s",s[i]):
            i+=1
            continue
        foundAMatch = False
        for t in tokens:
            pat = "^(" + t.pattern + ").*"
            #print("trying with pat :'"+pat+"';")
            res = re.match(pat,s[i:])
            if res is not None:
                print("Match: -what : '" + res.group(1) + "' -to-token-named :'" + t.name + "'; \t\tTotal text : '" + res.group(0) + "';" )
                i += len(res.group(1))
                foundAMatch = True
                l.append(Token(t.pattern,t.name,res.group(1)))
                break
        if not foundAMatch:
            print("Homemade SyntaxError: No match for '" + s[i:] + "';")
            quit()
    return l

def syntaxError(l,fname):
    global glIndex
    print("Homemade SyntaxError: '"+l[glIndex].name+"'")
    print(fname)
    quit()

def match(tokens, wanted):
    # Consume the current token if it has the wanted name; otherwise abort.
    global glIndex

    if tokens[glIndex].name == wanted:
        glIndex+=1
        print("Matched '" + tokens[glIndex-1].value + "' as '" + wanted + "';")
        return tokens[glIndex-1].value
    else:
        print("Homemade Syntax Error : Match failed on token '" + tokens[glIndex].name + "' with wanted token '" + wanted + "' and text '" + tokens[glIndex].value + "';")
        exit(1)

def peekMatch(token, wanted):
    # Look at the current token without consuming it.
    global glIndex
    if glIndex < len(token) and token[glIndex].name == wanted:
        print("Matched "+wanted)
        return True
    else:
        return False
        
def parse(l):
    #root
    localCode = ""
    rootName = parseTitle(l)
    print("Root : " + rootName)
    #parse functions
    while peekMatch(l,"unique"):
        localCode += parseFunction(l)
                
    print("Done with the parsing.")
    return localCode

def parseFunction(l):
    print("CAME IN PARSE FUNCTION")
    #function name
    localCode = "\n\ndef " + parseTitle(l) +"("

    #args
    args = set()
    title = parseTitle(l)
    if title!="args":
        print("Homemade Syntax Error : title should be 'args', was instead '" + title + "';")
        exit(1)

    while(peekMatch(l,"minus")):
        lastArg = parseListItem(l)
        args.add(lastArg)
        localCode += lastArg
        if peekMatch(l,"minus") :
            localCode += ","
    localCode += "):\n"
    #type
    if parseTitle(l)!="type":
        print("Homemade Syntax Error : title should be 'type'")
        exit(1)

    #query
    ##query name
    queryTypeName = match(l, "unique")

    ##query args
    queryTypeArgs = []
    if parseTitle(l)!=queryTypeName:
        print("Homemade Syntax Error : title should be the same as the name of the query.")
        exit(1)

    while(peekMatch(l,"minus")):
        queryTypeArgs.append(parseComp(l))

    ##query sql code
    if parseTitle(l) != "query":
        print("Homemade Syntax Error : title should be 'query'.")
        exit(1)
    initialQuery = parseBasicSqlQuery(l)
    if queryTypeName == "conditional" and len(queryTypeArgs) == 0 : 
        print("Homemade Syntax error : Conditional query needs at least one arg.")
        exit(1)

    ##query codegen
    localCode += "\tquery = \"" + initialQuery + " WHERE "
    first = True
    if queryTypeName == "conditional":
        for lArg, cmpSign, rArg in queryTypeArgs:
            if not first:
                localCode += " AND "
            if rArg in args:
                first = False
                localCode += lArg + cmpSign + rArg
            else:
                print("queryTypeArgs : " + str(queryTypeArgs))
                print("Homemade Logic Error: Query arg '" + rArg + "' is not in the function args " + str(args) + ".")
                exit(1)

    localCode += "\"\n\tconnection = ExistingLibraryConnectionMaker()\n\treturn connection.execute(query).fetchall()"
    return localCode

def parseBasicSqlQuery(l):
    selectS = match(l,"unique")
    whatS = match(l,"*")
    fromS = match(l,"unique")
    tableNameS = match(l,"unique")
    if selectS.upper() != "SELECT" or fromS.upper() != "FROM":
        print("Homemade Syntax error: bad basic sql.")
        exit(1)
    return selectS + " " + whatS + " " + fromS + " " + tableNameS

def parseVal(l):
    # Not used by parse(); grammar: val := "(" val ")" | number [op val]
    # Branch with peekMatch(), because match() exits on a mismatch.
    if peekMatch(l, "lpar"):
        match(l, "lpar")
        parseVal(l)
        match(l, "rpar")
    elif peekMatch(l, "number"):
        match(l, "number")
        if peekMatch(l, "plus") or peekMatch(l, "minus") or peekMatch(l, "egal"):
            parseOp(l)
            parseVal(l)
    else:
        syntaxError(l, "parseVal")
    print("** Parsed val.")

def parseOp(l):
    # Consume one operator token: plus, minus, or egal ("=").
    for name in ("plus", "minus", "egal"):
        if peekMatch(l, name):
            match(l, name)
            break
    else:
        syntaxError(l, "parseOp")
    print("** Parsed op.")


if __name__ == "__main__":
    # Open read-only; "rw" is not a valid mode in Python 3.
    with open("sqlGenTest.SQLGEN", "r") as sourceFile:
        print("File:\n'")
        text = sourceFile.read()
        print(text + "'\n")
        tokenList = tokenize(text)
        # Loop explicitly: map() is lazy in Python 3, so map(print, ...) prints nothing.
        for t in tokenList:
            print("'" + t.name + "' : '" + t.value + "'")
        code = parse(tokenList)
        print("")

        print("###Generated Code:\n" + code)
    print("###end")
    print()
    
  • I'm not sure about this. My goal is to be able to cause a function to exist at any point in the code. I may be in the body of a class (so there should be a facility to tell it that `self` needs to be the first argument) or I may be arbitrarily far down in some nested conditional blocks. Wherever I am as I'm writing code, I want some piece of code that I can invoke to bring one of these created functions into existence *at that place* in the code. I don't want something that generates static code and inserts it into a file containing code. – ely Mar 21 '14 at 13:52
  • I understand. You could always just eval the generated code. The more likely option would be to create an IR from the parsed code, e.g. create a tree of functionality types, and go through it accordingly. I don't have time right now to code that for fun though. – Jules G.M. Mar 21 '14 at 18:18
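
For the "just eval the generated code" suggestion, a minimal Python 3 sketch might look like the following. It uses `exec` rather than `eval`, since the generated text is a `def` statement and not an expression. The `load_functions` helper and the stub `ExistingLibraryConnectionMaker` class are stand-ins invented here so the example runs on its own; the `generated` string mirrors the shape of the parser's output shown in the answer.

```python
generated = '''
def my_idea(arg0, arg1):
    query = "SELECT * FROM some_table WHERE foo==arg0 AND bar>arg1"
    connection = ExistingLibraryConnectionMaker()
    return connection.execute(query).fetchall()
'''


class ExistingLibraryConnectionMaker(object):
    """Hypothetical stub that echoes the query instead of running it."""

    def execute(self, query):
        self._query = query
        return self

    def fetchall(self):
        return [self._query]


def load_functions(source, extra_globals):
    """Exec `source` in a fresh namespace and return the callables it defines."""
    namespace = dict(extra_globals)
    exec(source, namespace)
    return {k: v for k, v in namespace.items()
            if callable(v) and k not in extra_globals}


funcs = load_functions(
    generated, {"ExistingLibraryConnectionMaker": ExistingLibraryConnectionMaker})
my_idea = funcs["my_idea"]
print(my_idea(10, 20))
# ['SELECT * FROM some_table WHERE foo==arg0 AND bar>arg1']
```

Note that the query text contains the argument names literally rather than their values, so the generator would still need to interpolate or bind them before this is useful against a real database.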