0

I'm trying to analyze some SQL using python. I've got SQL that might contain multiple "on" clauses specifying table joins.

e.g

"on tableA.user_id = tableB.id"

I'd like to parse out all of the join relationships.

I tried the following but am having difficulty detecting multiple ones. It just finds the first one and stops there. Can someone help?

pattern = re.compile(r'on\s([^"\s]*)\s*=\s*([^"\s]*)')
results = re.search(pattern, sql)
for result in results.groups():
     print result
user491880
  • 4,709
  • 4
  • 28
  • 49

2 Answers2

0

If you want to stick to python, look into the (very good) python regex docs here. It specifically says that re.search only matched one occurrance.

You probably want to use re.match, which will return none if it did not match, and a match object containing all occurrances otherwise.

cleros
  • 4,005
  • 1
  • 20
  • 30
0

I would use the re.findall() function.

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.

fizzyh2o
  • 1,237
  • 9
  • 18