1

I am trying to parse a SQL query using:

SqlDataReader reader = command.ExecuteReader(CommandBehavior.KeyInfo);
DataTable schemaTable = reader.GetSchemaTable();

I get base table names, but I also need to find the table alias names.

Example Query:

select AuthorId, a.Name as [AuthorName], c.Name as City, s.Name as [State] from Author a
inner join Zipcode zc on zc.ZipCodeId = a.ZipCodeId
inner join City c on c.CityId = zc.CityId
inner join [State] s on s.StateId = c.StateId

I have checked regex solution, but not able to figure out how to extract "Author a", "ZipCode cd", "City c", "[State] s"

  • Changing thoughts here, I am taking a different route, got idea from here http://www.andriescu.nl/sql/sql-how-to-parse-microsoft-transact-sql-statements-in-c_sharp_view_column_binding/ – Debjyoti Das Oct 24 '17 at 13:42
  • And here is my project https://github.com/eyedia/aarbac – Debjyoti Das Dec 07 '17 at 15:46

1 Answers1

0

Definitions that you must take into account to work with Regex type syntax.

Metacharacters

Metacharacters specify characters users can enter at the corresponding positions.

Character Descriptión
. Any single character.
[aeiou] Any single character from the specified character set.
[^aeiou] Any single character not from the specified character set.
[0-9a-fA-F] Any single character from the specified character range.
\w Any single alphanumeric character. The same as [a-zA-Z_0-9]
\W Any single non-alphanumeric character. The same as [^a-zA-Z_0-9]
\d Any single numeric character. The same as [0-9]
\D Any single non-numeric character. The same as [^0-9]
Quantifiers

Quantifiers follow a metacharacter and specify how many times the character should be repeated. The table below lists the available qualifiers.

Quantifier Description Examples
* Zero or more matches. The same as {0,} [a-zA-Z], \w
+ One or more matches. The same as {1,} [a-zA-Z]+, \w+
? Zero or one matches. The same as {0,1} [a-zA-Z]?, \w?
{n} Exactly n matches. [0-9]{2}
{n,} At least n matches. [0-9]{2,}
{n,m} At least n, but not more than m matches. [0-9]{2,7}

Solutions that can be of help.

1. Extract table names from an SQL statement with Regex

Regular expression

/(from|join|into)\s+([`]*\w*.*\w[`]|(\[)*\w*.*(\])|\w*\.*\w*)/g
2. Extract table names with alias from an SQL statement with Regex

Regular expression

(from|join|into)\s+([`]*\w*.*\w[`] *\w*|(\[)*\w*.*(\]) *\w*|\w*\.*\w* *\w*)
3. Extract column names from an SQL statement with Regex

Regular expression

/(\w*\.*\w+|`\w*.*\w`|(\[)\w*.*(\]))+(,|\s+,|\s+FROM|\s+from)/g

Generated Code for C#

public static class QueryExtension
    {
        public static List<string> GetTables(this string query)
        {
            List<string> tables = new List<string>();
            string pattern = @"(from|join|into)\s+([`]*\w*.*\w[`]|(\[)*\w*.*(\])|\w*\.*\w*)";            
            
            foreach (Match m in Regex.Matches(query, pattern))
            {                
                string name = m.Groups[2].Value;                                
                tables.Add(name);
            }

            return tables;
        }
        public static List<string> GetTablesWithAliases(this string query)
        {
            List<string> tables = new List<string>();
            string pattern = @"(from|join|into)\s+([`]*\w*.*\w[`] *\w*|(\[)*\w*.*(\]) *\w*|\w*\.*\w* *\w*)";

            foreach (Match m in Regex.Matches(query, pattern))
            {
                string name = m.Groups[2].Value;
                tables.Add(name);
            }

            return tables;
        }
        public static List<string> GetColumns(this string query)
        {
            List<string> columns = new List<string>();
            string pattern = @"(\w*\.*\w+|`\w*.*\w`|(\[)\w*.*(\]))+(,|\s+,|\s+FROM|\s+from)";

            foreach (Match m in Regex.Matches(query, pattern))
            {
                string name = m.Groups[1].Value;
                columns.Add(name);
            }

            return columns;
        }
        public static string Join(this IEnumerable<string> values, string separator) {
            return string.Join(separator, values);
        }
    }

Test string

-------------------------------------------------------
select AuthorId, a.Name as [AuthorName], c.Name as City, s.Name as [State] from Author a
inner join `dbo`.`otherTable` ot on ot.col1 = a.Name
inner join Zipcode zc on zc.ZipCodeId = a.ZipCodeId
inner join City c on c.CityId = zc.CityId
inner join [State] s on s.StateId = c.StateId
-------------------------------------------------------

Output

//-------GetTables------
Author
`dbo`.`otherTable`
Zipcode
City
[State]
//-------GetTablesWithAliases------
Author a
`dbo`.`otherTable` ot
Zipcode zc
City c
[State] s

More code sample for C# (Here)
https://stackoverflow.com/a/68889908/16731336

References

Extract table names from an SQL statement with Regex
Regular Expression Language - Quick Reference
Simplified Regular Expressions