1

I want to parse a Teradata ANSI sql and get the join columns and tables used in the query.

For example below is the query

select * from
TABLEA A INNER JOIN TABLEB b ON A.COL1 =B.COL2 INNER JOIN TABLEc C ON A.COL1 =C.COL3

I want to get the result like below

Tables :tableA ,tableb, tablec columns = a.col1,b.col2,c.col3

I already have the code to get all the tablenames from a SQL. Iam using the below script from Paul to parse the tablename I want to modify the below script to get the join column names from the SQL

join_constraint = Optional(ON + expr | USING + LPAR + Group(delimitedList(column_name)) + RPAR)

join_op = COMMA | (Optional(NATURAL) + Optional(INNER | CROSS | LEFT + OUTER | LEFT | OUTER) + JOIN)

join_source = Forward()
single_source = ( (Group(database_name("database") + "." + table_name("table*")) | table_name("table*")) +
                    Optional(Optional(AS) + table_alias("table_alias")) +
                    Optional(INDEXED + BY + index_name("name") | NOT + INDEXED)("index") |
                  (LPAR + select_stmt + RPAR + Optional(Optional(AS) + table_alias)) |
                  (LPAR + join_source + RPAR) )

join_source << single_source + ZeroOrMore(join_op + single_source + join_constraint)

result_column = "*" | table_name + "." + "*" | (expr + Optional(Optional(AS) + column_alias))
select_core = (SELECT + Optional(DISTINCT | ALL) + Group(delimitedList(result_column))("columns") +
                Optional(FROM + join_source) +
                Optional(WHERE + expr("where_expr")) +
                Optional(GROUP + BY + Group(delimitedList(ordering_term)("group_by_terms")) +
                        Optional(HAVING + expr("having_expr"))))

select_stmt << (select_core + ZeroOrMore(compound_operator + select_core) +
                Optional(ORDER + BY + Group(delimitedList(ordering_term))("order_by_terms")) +
                Optional(LIMIT + (integer + OFFSET + integer | integer + COMMA + integer)))

tests = """\
    select * from xyzzy inner join dhfjfjfj on df = hfj  inner join abc on dg = hj inner join cde on cv = nj  where z > 100
    select * from xyzzy,hfhfhf  where z > 100 order by zz
    select * from xyzzy""".splitlines()
for t in tests:
    print t
    try:
        print select_stmt.parseString(t).dump()
    except ParseException, pe:
        print pe.msg
    print
  • Locate the elements in the grammar that are the parts you want to retrieve, and give them a name like "join_columns*". The trailing '*' will make sure that you get all the values, not just the last one. You should be able to take it from here. – PaulMcG Aug 31 '16 at 23:28

0 Answers0