I wrote a regex to capture table name(s) from a sql query on AWS Athena
(?<=from|join|table|into|exists|update) (?!IF)(\w+\.*)+
but it fails when there are multiple table names separated by comma like so
Select distinct W.WORKER_ID, W.FIRST_NAME, W.Salary
from Table1 W, Table2 W1
where W.Salary = W1.Salary
and W.WORKER_ID != W1.WORKER_ID;
in this case it catches only Table1
. I can't seem to find any way to capture Table2
(or in some case Table3
)
I tried
(?<=from|join|table|into|exists|update) (?!IF)(\w+\.*)+|(?:\s+\w*\,\s*(\w+\.*)+)
but this captures column names too, which means conditional part (?<=from...)
is not working on it. How can i fix this? I need to use the captured group 1 somehow to capture other tables.
Here are some of the strings i am testing my regex against https://regex101.com/r/0dh3XV/2
PS: I read get table name from query using a regex and get table name from query but the solutions aren't working for me.
EDIT: No I do not need any sql parser so its not a duplicate