1

Consider I have a SQL query like this:

SELECT c_mandant, 
    hist_datum, 
    parkey1, 
    parkey2, 
    funktionscode, 
    ma_parkey, 
    me_parkey , 
    CASE 
        WHEN EXISTS 
            ( 
           SELECT 1 
           FROM   cds_h_gruppe GRP1 
           WHERE  grp1.c_mandant = c_mandant 
           AND    grp1.hist_datum = Add_months(Last_day(Trunc(sysdate)), -1)
           AND    grp1.funktionscode = 'H' 
           AND    grp1.parkey1 = ma_parkey
       ) THEN 1 
        ELSE NULL 
    END ma_me , 
    CASE 
        WHEN EXISTS 
            ( 
           SELECT 1 
           FROM   cds_h_gruppe GRP2 
           WHERE  grp2.c_mandant = c_mandant 
           AND    grp2.hist_datum = Add_months(Last_day(Trunc(sysdate)), -1)
           AND    grp2.funktionscode = 'U' 
           AND    grp2.parkey1 = me_parkey
       ) THEN 1 
        ELSE NULL 
    END me_ma,
    Row_number() OVER (partition BY c_mandant, ma_parkey, me_parkey ORDER BY c_mandant, ma_parkey, me_parkey) anz_ma
FROM     
( 
     SELECT c_mandant, 
     hist_datum, 
     parkey1, 
     parkey2, 
     funktionscode , 
     CASE 
       WHEN funktionscode = 'U' THEN parkey1 
       ELSE parkey2 
     END ma_parkey , 
     CASE 
       WHEN funktionscode = 'U' THEN NULL 
       ELSE parkey1 
     END me_parkey 
     FROM   cds_h_gruppe 
     WHERE  funktionscode IN ('U', 
             'H') 
     AND    hist_datum = Add_months(Last_day(Trunc(sysdate)), -1)

I want to extract table names and column names from the query. The result should look some thing like

Tables:

CDS_H_GRUPPE

Columns:

CDS_H_GRUPPE.c_mandant    
CDS_H_GRUPPE.funktionscode    
CDS_H_GRUPPE.hist_datum    
CDS_H_GRUPPE.parkey1 
CDS_H_GRUPPE.parkey2

Can I extract column names and table names from any complex query string, which is valid for Oracle, SQL server or DB2?

Dale K
  • 25,246
  • 15
  • 42
  • 71
dumbdragon
  • 11
  • 5
  • I would recommend working backwards by Keying off words like `SELECT`, `UPDATE`, and `DELETE`, then delimiting off commas and the `FROM` keyword. Good Luck – Oxymoron Nov 08 '16 at 05:51
  • I don't see any relationship between that SQL and the output you're looking for. Why is `ma_me` not included? – Rob Nov 08 '16 at 06:08

2 Answers2

0

Don't try to parse it yourself, use an SQL Parser. Apparently, Entity Framework used to have one, but no longer does. There seem to be a few more, such as this one from the Irony Project.

There's a commercial parser that you can get for $900 ($1,900 if you want to distribute your code apparently).

Dale K
  • 25,246
  • 15
  • 42
  • 71
zmbq
  • 38,013
  • 14
  • 101
  • 171
0

Lexing partial SQL in C#

I think you can probably do it using SqlParser 'Carbunql'.

https://github.com/mk3008/Carbunql

For example, select clause information can be obtained as follows:

using Carbunql;

var sq = new SelectQuery(@"SELECT c_mandant, hist_datum, ...");

Console.WriteLine("columns");
sq.SelectClause!.Items.ForEach(item =>
{
    Console.Write("    Alias : " + item.Alias);
    Console.Write(", Type : " +  item.Value.GetType().Name);  

    if (item.Value is ColumnValue c)
    {
        Console.Write(", Table : " + c.TableAlias);
        Console.Write(", Column : " + c.Column);
    }
    Console.WriteLine();
});

The execution result looks like this.

columns
    Alias : c_mandant, Type : ColumnValue, Table : , Column : c_mandant
    Alias : hist_datum, Type : ColumnValue, Table : , Column : hist_datum
    Alias : parkey1, Type : ColumnValue, Table : , Column : parkey1
    Alias : parkey2, Type : ColumnValue, Table : , Column : parkey2
    Alias : funktionscode, Type : ColumnValue, Table : , Column : funktionscode
    Alias : ma_parkey, Type : ColumnValue, Table : , Column : ma_parkey
    Alias : me_parkey, Type : ColumnValue, Table : , Column : me_parkey
    Alias : ma_me, Type : CaseExpression
    Alias : me_ma, Type : CaseExpression
    Alias : anz_ma, Type : FunctionValue

Although rough information can be obtained easily, it is necessary to elaborate to obtain the referenced table and column names.

Here are some areas that need to be worked on.

・Specify the table name in the SELECT clause of the SQL statement It may be possible to infer from the FROM clause, but it would be safer to explicitly state

example

select t.c_mandant from CDS_H_GRUPPE as t

・Calculation formulas require recursive analysis "c_mandant" is parsed as a ColumnValue class, so it's easy to get the column name, but you need to find out what columns the other classes refer to internally.

・Requires reverse lookup from table alias to table The ColumnValue class can get the alias name of the table, not the table name. A reverse lookup from the alias name to the table is required. However, if a subquery is specified in the FROM clause, the subquery must be parsed recursively to find the physical table name.