
I want to parse a complex SQL statement that contains joins (inner join, outer join) and get the names of the tables it uses.

I can get the table names when the statement is a simple select, but when it contains an inner join or left join, as in the example below, the result contains only the first table.

select * from xyz  inner join dhf  on df = hfj  where z > 100 

I am using a program similar to the one by Paul at the link below.

http://pyparsing.wikispaces.com/file/view/select_parser.py/158651233/select_parser.py

Can someone tell me how to get all the tables used in a SQL statement like the one below?

select * from xyz  inner join dhf  on df = hfj  where z > 100
Jonathan Allard
  • This may be a duplicate of http://stackoverflow.com/q/35295458/409172. That solution requires a live database and a PL/SQL stored procedure to do most of the work, and I'm not sure if that's feasible for you. But that's probably the only way to correctly parse *complex* SQL. Even non-trivial Oracle SQL is almost impossible to parse. With 2175 keywords, most of them not reserved, parsing Oracle SQL is a huge task. That's why you need a shortcut, like using the `EXPLAIN PLAN` method in that answer. – Jon Heller Aug 30 '16 at 06:22
  • Pyparsing is no longer hosted on wikispaces.com. Go to https://github.com/pyparsing/pyparsing – PaulMcG Aug 27 '18 at 12:43

2 Answers

1

This parser was written a long time ago, and handling multiple values in a results name did not come along until later.

Change this line in the parser you cited:

single_source = ( (Group(database_name("database") + "." + table_name("table")) | table_name("table")) + 

to

single_source = ( (Group(database_name("database") + "." + table_name("table*")) | table_name("table*")) + 

When I run your sample statement through the select_stmt parser, I now get this:

select * from xyz  inner join dhf  on df = hfj  where z > 100
['SELECT', ['*'], 'FROM', 'xyz', 'INNER', 'JOIN', 'dhf', 'ON', ['df', '=', 'hfj'], 'WHERE', ['z', '>', '100']]
- columns: ['*']
- table: [['xyz'], ['dhf']]
  [0]:
    ['xyz']
  [1]:
    ['dhf']
- where_expr: ['z', '>', '100']
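
To illustrate what the trailing `*` in a results name does, here is a minimal sketch that is independent of select_parser.py. The tiny FROM-clause grammar and the helper `from_clause` are made up for this example and are far simpler than the real parser; only the `expr("name*")` call syntax is taken from pyparsing itself.

```python
import pyparsing as pp

# Hypothetical, heavily simplified grammar: just enough to parse
# "from <table> [inner|left] join <table> ..." for this illustration.
identifier = pp.Word(pp.alphas, pp.alphanums + "_")
FROM, JOIN, INNER, LEFT = map(pp.CaselessKeyword, "from join inner left".split())

def from_clause(results_name):
    """Build a FROM clause whose table names are tagged with results_name."""
    table = identifier(results_name)
    join = pp.Optional(INNER | LEFT) + JOIN + table
    return FROM + table + pp.ZeroOrMore(join)

sql = "from xyz inner join dhf on df = hfj"

# Plain results name: each new match replaces the previous one,
# so a single table name is all that can be read back.
single = from_clause("table").parseString(sql)["table"]

# Trailing '*': shorthand for listAllMatches=True, so every match is kept.
multiple = list(from_clause("table*").parseString(sql)["table"])

print(single)    # dhf
print(multiple)  # ['xyz', 'dhf']
```

In this sketch the table names come back as plain strings; in select_parser.py each one is wrapped in its own list, as the dump above shows, because of how that grammar defines its identifiers.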
PaulMcG
  • Thanks Paul for the reply, it is working exactly as expected. I am getting all the table names from the SQL. – user6771430 Aug 30 '16 at 17:02
  • Is it possible to get all the join columns also from the SQL query? – user6771430 Aug 30 '16 at 17:02
  • Why "table*"? I can sort of tell it allows more than one, but I can't find any docs on it. – dfrankow Nov 27 '17 at 15:09
  • The behavior is described in the docs for `setResultsName` (https://pythonhosted.org/pyparsing/pyparsing.ParserElement-class.html#setResultsName) under the `listAllMatches` argument, and then the '*' is explained in the `__call__` docstring (https://pythonhosted.org/pyparsing/pyparsing.ParserElement-class.html#__call__) – PaulMcG Nov 27 '17 at 18:53
-1

The answer depends on which SQL platform you are using.

I will answer assuming you are using MsSql. The same logic should work on all SQL platforms, though the syntax changes.

Tables are unique by a combination of Owner and Table. In a Python script that I wrote to extract all data in all tables to text files, I do a select that returns #Owner#TableName#. Assuming you do not have multiple tables of the same name with different owners, the basic form of this is:

Select name from SysObjects where xtype = 'U' order by name

This gives you a list of all tables. Then you take that list and run "Select * from [table name from other query]" in a loop until you have covered all the tables that the SysObjects query returned.

The same approach is practical on all SQL platforms, assuming you have access to the system tables.

M T Head
  • Running select * from syscolumns can give you the column names. – M T Head Aug 29 '16 at 19:47
  • You misread the question. The OP does not want to query the db for the table names, they want to extract the table names from the posted SQL statement. – PaulMcG Aug 29 '16 at 23:45