I'd like to use a Python regex to add a parameter to all calls to a function in a C file:

ret = func(a) should be replaced with: ret = func(a, newparam)

I also need to be able to handle a case like this: ret = func(func2(b)) should be replaced with: ret = func(func2(b), newparam)

Thanks in advance

simont
user1247196
  • Why not `func(func2(b,newparam))` or `func(func2(b,newparam), newparam)`? Do you want it only in "func"? – jerrymouse Mar 03 '12 at 20:43
  • 5
    No regex can do that, at least not if you want it soon, truly reliable and without making the author go insane. There's a reason *real* refactoring tools work with ASTs. –  Mar 03 '12 at 20:43
  • @NiklasB. "regex" [are not regular](http://stackoverflow.com/questions/7434272/match-an-bn-cn-e-g-aaabbbccc-using-regular-expressions-pcre). – Qtax Mar 03 '12 at 20:46
  • @Qtax: That's why I initially thought and commented that it would at least be possible to do it with regexes. However, it's *not* in Python (which does not support recursive matching). So in fact, Python regexes *are* regular. – Niklas B. Mar 03 '12 at 20:48
  • 2
    @NiklasB., you don't need recursion. "Regex" have not been regular at least since backreferences have been introduced (and they've been there forever). – Qtax Mar 03 '12 at 20:50
  • 1
    By the way, there already exists a [C parser for Python](http://code.google.com/p/pycparser/). – Niklas B. Mar 03 '12 at 20:50
  • 3
    Even if some "regex" implementations *can* do it in theory (there probably are abominations that do), it doesn't matter. You need a really clever bastard to come up with a regex that's anything near correct, but it won't be maintainable, it's a horrible waste of time, probably needs to be re-done for another full week to work for all intended use cases, etc. There are tools for that. It's a solved problem, and the solution does not involve regexes (except perhaps in the lexer). Any regex for it would be an academic exercise (and insanity to boot). And what's your point with the C parser? –  Mar 03 '12 at 20:51
  • @Qtax: My mistake. Still, I don't think it would be possible to match a complex grammar like C with Python regular expressions. And if it were possible, it surely wouldn't be desirable. – Niklas B. Mar 03 '12 at 20:52
  • @NiklasB., I don't know the details of C grammar, but I think Python's regex could probably do it. Not that it's a good idea or anything. ;) [Here is a solution to part of the previously linked question without use of recursion](http://stackoverflow.com/questions/3644266/how-can-we-match-an-bn-with-java-regex), which should work in Python too. – Qtax Mar 03 '12 at 20:55
  • @Qtax: Nice work by that guy :) I can't imagine how complex it would be to match a "real" context-free language that way... Well, at least it seems to be possible. – Niklas B. Mar 03 '12 at 21:03
  • @Qtax: Does `re` support atomic grouping or possessive quantifiers ? It seems they are required for the solution you've linked. ([`regex` module](http://pypi.python.org/pypi/regex) supports it). – jfs Mar 03 '12 at 21:24
  • You could possibly get by with a [decorator](http://wiki.python.org/moin/PythonDecorators)... – Mark Rushakoff Mar 03 '12 at 21:52
  • @J.F.Sebastian, atomic grouping is not required for polygenelubricants' solution that I linked to (afaics). A quick look at the `re` docs seems to indicate that it does not directly support such features. Note that you can practically get an atomic group using expressions like `(?=(foo+))\1`. – Qtax Mar 04 '12 at 02:37
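
The `(?=(foo+))\1` trick from the last comment can be checked with a small experiment. This is only a sketch: the sample string and the shortened pattern `fo+` are chosen here for illustration and do not come from the thread. It relies on lookaheads in `re` not being re-entered by backtracking once they have matched.

```python
import re

text = "fooo"

# Ordinary group: "o+" can give back one "o", so the trailing "o" still matches.
plain = re.match(r"(fo+)o", text)

# Emulated atomic group: the lookahead captures "fooo" greedily, \1 consumes it,
# and the engine does not backtrack into the lookahead, so the trailing "o" fails.
atomic = re.match(r"(?=(fo+))\1o", text)

print(plain.group(1))  # foo
print(atomic)          # None
```

The second pattern fails precisely because the group refuses to give characters back, which is the behaviour an atomic group would have.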

2 Answers

0

Regular expressions are a language inside a language. So while it would certainly be possible to do what you want with a regular expression, it is not worth it: if nothing else, you lose readability and maintainability, and you have to think a lot harder just to get it working in the first place.

It can safely be done in Python without resorting to regex, in a way that gives you full control over where the new parameter is added.

The code below can do that, provided no call to func is split across more than one line of source code:

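add_parm_func = "func"
parm_to_add = "newparam"

def change(line):
    # Copy everything up to and including the function name unchanged.
    start = line.find(add_parm_func) + len(add_parm_func)
    res = line[:start]
    open_paren = 0
    for i, ch in enumerate(line[start:]):
        if ch == "(":
            open_paren += 1
        elif ch == ")":
            open_paren -= 1
            if open_paren == 0:
                # Closing paren of the call itself: append the new argument.
                res += ", %s)" % parm_to_add
                break
        res += ch
    else:
        # No complete call found on this line; leave it untouched.
        return line
    # Copy whatever follows the closing paren.
    res += line[start + i + 1:]
    return res

# The destination file must be opened for writing.
with open("sourcefile.c") as src, open("destfile.c", "w") as dst:
    for line in src:
        # Plain substring test: this also matches names that merely contain "func".
        if add_parm_func in line:
            line = change(line)

        dst.write(line)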
jsbueno
  • 1
    This fails on string literals and comments (among others): `smile(":-(" /* ( */)` - there is no good reason to try and parse C with a simple regex, but the same can be said for this snippet... – Kobi Mar 03 '12 at 22:44
  • However, the point here is not to provide a universal C code updater, just a script that fills the OP's needs. If those needs include skipping string literals (the question doesn't mention them), adding state to keep track of that would be trivial. Supporting line breaks would need more changes, which are also left aside. – jsbueno Mar 07 '12 at 14:19
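
The extra state discussed in these two comments could look roughly like the sketch below. It is not the answer's code: `add_argument` and its parameters are hypothetical names introduced here. It assumes the whole file is processed as one string (so calls may span lines), copies comments and string/char literals verbatim, and matches the function name as a whole identifier. It still knows nothing about macros, and it would also rewrite declarations and definitions of the function.

```python
def add_argument(source, func_name, new_arg):
    """Append new_arg to every call of func_name in C source text (sketch)."""
    out = []
    # One entry per open "(": index into `out` for a func_name call, else None.
    stack = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if source.startswith("/*", i):            # block comment: copy verbatim
            end = source.find("*/", i + 2)
            end = n if end == -1 else end + 2
            out.append(source[i:end])
            i = end
            continue
        if source.startswith("//", i):            # line comment: copy verbatim
            end = source.find("\n", i)
            end = n if end == -1 else end
            out.append(source[i:end])
            i = end
            continue
        if ch in "\"'":                           # string or char literal
            j = i + 1
            while j < n and source[j] != ch:
                j += 2 if source[j] == "\\" else 1  # skip escaped characters
            j = min(j + 1, n)
            out.append(source[i:j])
            i = j
            continue
        if ch.isalpha() or ch == "_":             # identifier
            j = i
            while j < n and (source[j].isalnum() or source[j] == "_"):
                j += 1
            word = source[i:j]
            out.append(word)
            k = j
            while k < n and source[k].isspace():
                k += 1
            if word == func_name and k < n and source[k] == "(":
                out.append(source[j:k + 1])       # whitespace plus "("
                stack.append(len(out))            # remember where the args start
                i = k + 1
            else:
                i = j
            continue
        if ch == "(":
            stack.append(None)
        elif ch == ")" and stack:
            pos = stack.pop()
            if pos is not None:
                # Only add a comma if the call already has arguments.
                has_args = "".join(out[pos:]).strip() != ""
                out.append((", " if has_args else "") + new_arg)
        out.append(ch)
        i += 1
    return "".join(out)


print(add_argument("ret = func(func2(b));", "func", "newparam"))
```

With this state in place, the `smile(":-(" /* ( */)` example from the comment above no longer confuses the parenthesis count, because the literal and the comment are skipped before any parenthesis is counted.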
0

Why don't you refactor your code to rename func to func_new, doing something like:

def func_new(foo):
    return func(foo, newparam)

Make newparam globally available. It's far cleaner and doesn't need much effort.

jerrymouse
  • Remember, the source code to be changed is in C, not Python. If it were Python, adding a second parameter with a default value would be trivial. – jsbueno Mar 07 '12 at 14:20