0

I wrote a regex to capture table name(s) from a sql query on AWS Athena

(?<=from|join|table|into|exists|update) (?!IF)(\w+\.*)+

but it fails when there are multiple table names separated by comma like so

Select distinct W.WORKER_ID, W.FIRST_NAME, W.Salary 
from Table1 W, Table2 W1 
where W.Salary = W1.Salary 
and W.WORKER_ID != W1.WORKER_ID;

in this case it catches only Table1. I can't seem to find any way to capture Table2 (or in some case Table3)

I tried

(?<=from|join|table|into|exists|update) (?!IF)(\w+\.*)+|(?:\s+\w*\,\s*(\w+\.*)+)

but this captures column names too, which means conditional part (?<=from...) is not working on it. How can i fix this? I need to use the captured group 1 somehow to capture other tables.

Here are some of the strings i am testing my regex against https://regex101.com/r/0dh3XV/2

PS: I read get table name from query using a regex and get table name from query but the solutions aren't working for me.

EDIT: No I do not need any sql parser so its not a duplicate

Ashvin Sharma
  • 563
  • 1
  • 5
  • 24
  • 3
    Use an SQL parser. Here are some examples: https://stackoverflow.com/questions/660609/sql-parser-library-for-java – Stephen C Jun 27 '19 at 11:22
  • Try also `(?:\G(?!\A),\s*|(?:from|join|table|into|exists|update))\s+\b(?!IF\s)(\w+)\s+(\w+)`. See https://regex101.com/r/1I8oZj/1 – Wiktor Stribiżew Jun 27 '19 at 11:22
  • @WiktorStribiżew nope, doesn't work :( – Ashvin Sharma Jun 27 '19 at 11:24
  • @StephenC I can't add any library, work rules :( – Ashvin Sharma Jun 27 '19 at 11:25
  • @AshvinSharma It does, you are using it wrong. Post the code. Explain what you need to get in the end. – Wiktor Stribiżew Jun 27 '19 at 11:26
  • @WiktorStribiżew edited in answer – Ashvin Sharma Jun 27 '19 at 11:28
  • But you have not added the code, nor the exact result you need. Again, https://regex101.com/r/1I8oZj/1 regex works because it captures `Table1` and `Table2` in Group 1. You can't say it does not work. – Wiktor Stribiżew Jun 27 '19 at 11:28
  • it works for one string but doesn't cover all cases. – Ashvin Sharma Jun 27 '19 at 11:30
  • 3
    There is nothing (apart from common sense!) stopping you from designing, coding, testing an SQL parser from scratch. If your company insists on placing unreasonable constraints on how you do your job, then they should be prepared to wear the cost of you spending lots of time on "reinventing the wheel" projects. You can't reliably parse a language as complex as SQL using regexes and string bashing. Talk to your project manager ... – Stephen C Jun 27 '19 at 11:40
  • it is soo hard to parse an sql query correctly that in my opinion, trying to do it with a regexp is like trying to be a surgeon with a swiss knife... It might work, but you are pretty sure it won't... Use a parser! Make your firm accept external libraries (who the hell refuse to use open source libraries??)! – spi Jun 27 '19 at 11:41

0 Answers0