OK, this might seem too tough to be posted here so I beg your pardon. Been working on this for almost a week.
I need to extract all selected columns in a given Oracle SQL String. It should pass the following test cases:
// single column test
select col1 from dual
// ^ should match "col1"
// multiple column test
select col1,col2 from dual
// ^ should match "col1", "col2"
// multiple space test
select col1 , col2 from dual
// ^ should match "col1", "col2"
// "distinct" tests
select distinct col1 from dual
// ^ should match "col1"
select distinct col1, col2 from dual
// ^ should match "col1", "col2"
// "distinct" with whitespaces tests
select distinct col1 from dual
// ^ should match "col1"
select distinct col1 , col2 from dual
// ^ should match "col1", "col2"
// "as" tests
select col1 from dual
// ^ should match "col1"
select colA as col1 from dual
// ^ should match "col1"
select colA as col1, col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select col1, colB as col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select col1, col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// "as" tests with whitespaces tests
select colA as col1, colB as col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// "distinct" with "as" tests
select distinct colA as col1 from dual
// ^ should match "col1"
select distinct colA as col1, colB as col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select distinct colA as col1, col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// function test
select funct('1','2') as col1 from dual
// ^ should match "col1"
select col1, funct('1','2') as col2 from dual
// ^ should match "col1", "col2"
select col1, colB as col2, funct('1','2') as col3 from dual
// ^ should match "col1", "col2", "col3"
I tried the following RegEx in Java
((?<=select\ )(?!distinct\ ).*?(?=,|from))
((?<=select\ distinct\ ).*?(?=,|from))
((?<=as\ ).*?(?=,|from))
((?<=,\ ).*?(?=,|from))(?!.*\ as\ ) // <- Right, I'm guessing here
OR-ed them together but I can't simply pass all the test cases above. (I'm using this tool to validate my Regex).
I tried searching for SQL evaluator but can't find any that extracts all columns without executing it against a real database and that assumes all referenced tables and functions exist.
A Java ReGex, a free SQL Evaluator (that doesn't need a real database) that can pass the tests, or anything better that those two are the acceptable answers. The assumption is that the SQL is always in Oracle 11g format.