0

I'm looking for a regular expression, implemented in Python, that will match on this text

WHERE PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' 

but will not match on this text

WHERE AsPolicy.PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' 

I'm doing this to find places in a large piece of SQL where the developer did not explicitly reference the table name. All I want to do is print the offending lines (the first WHERE clause above). I have all of the code done except for the regex.

Mike
  • What have you tried? `WHERE PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753'` would do it. – nmichaels Aug 04 '11 at 16:15
  • Why not just use [the in operator](http://stackoverflow.com/questions/3437059/does-python-have-a-string-contains-method)? – Michael Todd Aug 04 '11 at 16:17
  • The lines will be in a similar format but will not all be related to PolicyGuid. They will be in the form of a WHERE clause where the term before the equals sign does not contain the reference to a table. So, it will not contain a period. – Mike Aug 04 '11 at 16:19

4 Answers

2
re.compile('''WHERE [^.]+ =''')

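Here, the [] defines a character class ("match one character from this set"), a leading ^ inside the brackets means "not," and the dot inside the brackets is a literal period, so [^.] matches any single character except a period. The + means "one or more."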
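To make that concrete, here is a minimal sketch that runs the pattern against the two lines from the question. The variable names are my own, and it uses re.search so the WHERE does not have to sit at the very start of the string.

```python
import re

# Pattern from this answer: "WHERE ", then one or more non-period
# characters, then " =".
pattern = re.compile(r"WHERE [^.]+ =")

# The two sample lines from the question.
unqualified = "WHERE PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' "
qualified = "WHERE AsPolicy.PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' "

unqualified_hit = pattern.search(unqualified) is not None
qualified_hit = pattern.search(qualified) is not None
print(unqualified_hit, qualified_hit)
```

The period in AsPolicy.PolicyGUID stops [^.]+ before it can reach the " =", so the qualified line is rejected.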

Was that what you were looking for?

Eli Stevens
0

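Something like

WHERE .*\..* = .*

Note that this matches the table-qualified form (a period before the equals sign), so the offending lines are the WHERE lines it does not match. I'm not sure how accurate it will be; it depends on what your data looks like. If you provide a bigger sample, it can be refined.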
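Since this pattern hits the lines that do have a period before the equals sign, one way to use it for the question's purpose is to invert the test. A rough sketch, assuming each WHERE clause sits on its own line (the list and variable names are made up for illustration):

```python
import re

# Pattern from this answer: a WHERE clause with a period somewhere
# before the " = ".
qualified_pattern = re.compile(r"WHERE .*\..* = .*")

# The two sample lines from the question.
lines = [
    "WHERE PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' ",
    "WHERE AsPolicy.PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' ",
]

# Keep WHERE lines that the "qualified" pattern does NOT match.
offending = [
    line for line in lines
    if "WHERE" in line and not qualified_pattern.search(line)
]
for line in offending:
    print(line)
```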

ksn
0

Something like this would work in Java, C#, and JavaScript; I suppose you can adapt it to Python:

/WHERE +[^\.]+ *\=/
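A possible Python adaptation, as a sketch only: Python has no /.../ regex literals, so the pattern goes into a raw string, and the backslashes before the period (inside the brackets) and the equals sign can be dropped. The sample strings are shortened stand-ins, not real data.

```python
import re

# Same pattern as above, written as a Python raw string.
# " +" and " *" tolerate extra or missing spaces around the column name.
pattern = re.compile(r"WHERE +[^.]+ *=")

# Shortened, made-up sample clauses for illustration.
samples = [
    "WHERE PolicyGUID = 'abc'",
    "WHERE   PolicyGUID='abc'",
    "WHERE AsPolicy.PolicyGUID = 'abc'",
]

results = [pattern.search(s) is not None for s in samples]
print(results)
```

Compared with a fixed single space, this version also catches clauses written with no space before the equals sign or with several spaces after WHERE.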

scrat.squirrel
0
>>> import re
>>> l
["WHERE PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' ", "WHERE AsPolicy.PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' "]
>>> [line for line in l if re.match('WHERE [^.]+ =', line)]
["WHERE PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753' "]
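One caveat: re.match only matches at the start of the string, so an indented WHERE would be missed. If the SQL is one big string, something along these lines might be closer to what the question needs. It is a sketch that assumes each WHERE clause sits on a single line; the SQL text is invented for illustration.

```python
import re

pattern = re.compile(r"WHERE [^.]+ =")

# Hypothetical SQL text, made up for illustration.
sql = """\
SELECT * FROM AsPolicy
    WHERE PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753'
SELECT * FROM AsPolicy
    WHERE AsPolicy.PolicyGUID = '531B2310-403A-13DA-5964-E2EFA56B0753'
"""

# re.search scans the whole line, so leading indentation is fine.
offending = [
    (number, line.strip())
    for number, line in enumerate(sql.splitlines(), start=1)
    if pattern.search(line)
]
for number, text in offending:
    print(f"{number}: {text}")
```

Printing the line number alongside the text makes the offending clauses easier to find in a large file.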
Dogbert