I'm working on a piece of Python code using regular expressions and I'm trying to achieve the following:
If I have a piece of SQL as a string that contains some code in []
and a WHERE
clause in front of it, I want to remove the complete WHERE
clause. For example:
where this condition and [this = 1] group by 1,2,3
becomes
group by 1,2,3
The code I'm using is:
txt = """where [this = 1] group by"""
txt = re.sub("where.*\[.*\].*group" , 'group', txt, flags = re.S|re.I)
However, if I have another WHERE
clause before this one the whole regex doesn't work as expected, e.g.:
txt = """where that and that do that where [this = 1] group by 1,2,3"""
txt = re.sub("where.*\[.*\].*group" , 'group', txt, flags = re.S|re.I)
produces
group by 1,2,3
instead of
where that and that do that group by 1,2,3
Edit: the solution should also work for a scenario like this:
txt = """where that and that do that where [this = 1] and [that = 1] group by 1,2,3"""
outputs:
"""where that and that do that group by 1,2,3"""
So it removes the inner (closest to []
) WHERE
clause and all code containing at least one []
until the next GROUP
, ORDER
or end of string
.
Finally, the solution needs to takes care of cases in which there are multiple such where .. [...]
snippets in the string.
txt = """where [Include_Fakes|TRUE] group by 1 order by 1,3 ) where signed_up_date >= dateadd('[aggregation]', -[Last_N_Periods|12], CURRENT_DATE)"""
expected output:
group by 1 order by 1,3 )
Can somebody point me in the right direction here? Any help would be appreciated.