Context: Python 3.4.3
I'm not very good with regular expressions, and I can't seem to figure out a robust solution to this using re.
Suppose we have a long patsy formula and somewhere in the middle is an expression like:
... + xvar + np.log(xvar)+xvar**2 + xvar2+ z...
Patsy formulas are just strings that follow well-behaved rules, so I'm wondering whether anyone has written, or can easily write, a robust method for dropping specific terms from a given formula. For example:
>>> remove_term(long_formula, 'xvar')
... + np.log(xvar)+xvar**2 + xvar2+ z...
and
>>> remove_term(long_formula, 'xvar2')
... + xvar + np.log(xvar)+xvar**2 + z...
and so on. This would also need to be robust to the variable appearing at the beginning or end of the right-hand side of the formula.
My limited regex-fu only produces things like:
re.sub('[^(]\s*xvar\s*',' FOUND IT ', 'y ~ xvar + np.log(xvar)')
Maybe a semi-complicated if/else re.sub situation?
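For reference, here is a rough sketch of the behaviour I'm after, written without re. It is only an illustration under some assumptions: terms are separated by top-level '+' signs only (no '-', and no quoted strings containing '+' or brackets), a term is dropped only when it matches the given name exactly, and the helper name split_top_level is made up for this example. It also normalizes the spacing around '+', which the outputs above do not.

```python
def split_top_level(rhs):
    """Split on '+' signs that are not nested inside (), [] or {}."""
    terms, current, depth = [], [], 0
    for ch in rhs:
        if ch in '([{':
            depth += 1
        elif ch in ')]}':
            depth -= 1
        if ch == '+' and depth == 0:
            # A top-level '+' ends the current term.
            terms.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    terms.append(''.join(current).strip())
    return terms


def remove_term(formula, term):
    """Drop every top-level term that is exactly equal to `term`."""
    # rpartition leaves lhs and sep empty when the formula has no '~'.
    lhs, sep, rhs = formula.rpartition('~')
    kept = [t for t in split_top_level(rhs) if t and t != term]
    new_rhs = ' + '.join(kept)
    if not sep:
        return new_rhs
    return (lhs.strip() + ' ~ ' + new_rhs).strip()


formula = 'y ~ xvar + np.log(xvar)+xvar**2 + xvar2+ z'
print(remove_term(formula, 'xvar'))
print(remove_term(formula, 'xvar2'))
```

Because it compares whole terms rather than searching for substrings, np.log(xvar), xvar**2 and xvar2 survive when 'xvar' is removed, and the position of the term (first, middle or last) does not matter.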