What pattern can I use to split a string like this:

f.id AS id, CONCAT(a1.id, a2.id, a3.id) AS cnp, SUM(A3.nr) AS sum

in such a way that the result is an array of 3 groups like this:

  1. f.id AS id
  2. CONCAT(a1.id, a2.id, a3.id) AS cnp
  3. SUM(A3.nr) AS sum

Can I match a comma that is not enclosed by parentheses?

DaoWen
Victorqedu

5 Answers

The input appears to always take the format ... AS ..., so you can just use a regular expression to match that:

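import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Lazily match "<expression> AS <alias>" up to the next comma or the end of the input.
Pattern p = Pattern.compile("(.*? as .*?)(,|$)", Pattern.CASE_INSENSITIVE);
String query = "f.id AS id, CONCAT(a1.id, a2.id, a3.id) AS cnp, SUM(A3.nr) AS sum";
Matcher m = p.matcher(query);
while (m.find()) {
    System.out.println(m.group(1).trim());  // trim() drops the space after each comma
}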

As long as you do not expect correlated sub-queries nested in your select values, or other edge cases such as string literals that contain the separator text (for example ' as error,' AS id, ...), this ought to work for inputs similar to your format.
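
To make the caveat concrete, here is a minimal sketch that wraps the same pattern in a helper and runs it on the original input and on an input whose string literal contains " as " and a comma. The class name AsPatternDemo and the method name items are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AsPatternDemo {
    // Same pattern as in the answer: "<expression> AS <alias>" up to a comma or end of input.
    private static final Pattern SELECT_ITEM =
            Pattern.compile("(.*? as .*?)(,|$)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects group 1 of every match, trimmed.
    static List<String> items(String selectList) {
        List<String> result = new ArrayList<>();
        Matcher m = SELECT_ITEM.matcher(selectList);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Input from the question: three items, as intended.
        System.out.println(items(
                "f.id AS id, CONCAT(a1.id, a2.id, a3.id) AS cnp, SUM(A3.nr) AS sum"));
        // Edge case: the literal ' as error,' is cut at its inner comma.
        System.out.println(items("' as error,' AS id, f.id AS fid"));
    }
}
```

The second call returns three pieces (' as error, then ' AS id, then f.id AS fid) where two select items were intended, because the pattern has no notion of quoted text.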

MT0

There is probably a killer regular expression for this, but a more maintainable approach could be to:

  1. Temporarily set placeholders on blocks between parentheses
  2. Split the result on the desired separator
  3. Replace the placeholders with their original values

To make step 1 more general, insert placeholders in every section where the separator should not apply. As long as you can accurately determine what those sections are, you can apply this recipe.
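
The three steps could be sketched as follows. This is one possible reading of the recipe, not a definitive implementation: the class name PlaceholderSplitter, the helper token, and the choice of the NUL character as placeholder delimiter are all assumptions, and the sketch assumes balanced parentheses and no NUL characters in the input. It does not treat quoted strings specially.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSplitter {
    // An innermost block: "(" ... ")" with no other parenthesis in between.
    private static final Pattern INNERMOST = Pattern.compile("\\([^()]*\\)");
    // Assumption: NUL never occurs in the input, so it can delimit placeholders.
    private static final char MARK = '\0';

    private static String token(int index) {
        return "" + MARK + index + MARK;
    }

    static List<String> split(String input) {
        List<String> saved = new ArrayList<>();
        String masked = input;

        // Step 1: replace innermost blocks with placeholders until none remain.
        Matcher m = INNERMOST.matcher(masked);
        while (m.find()) {
            saved.add(m.group());
            masked = masked.substring(0, m.start()) + token(saved.size() - 1)
                    + masked.substring(m.end());
            m = INNERMOST.matcher(masked);
        }

        // Step 2: with balanced parentheses, every remaining comma is top-level.
        List<String> parts = new ArrayList<>();
        for (String piece : masked.split(",")) {
            // Step 3: restore placeholders, newest first, because an outer block
            // may itself contain placeholders of inner blocks.
            for (int i = saved.size() - 1; i >= 0; i--) {
                piece = piece.replace(token(i), saved.get(i));
            }
            parts.add(piece.trim());
        }
        return parts;
    }

    public static void main(String[] args) {
        String query = "f.id AS id, CONCAT(a1.id, a2.id, a3.id) AS cnp, SUM(A3.nr) AS sum";
        for (String part : split(query)) {
            System.out.println(part);
        }
    }
}
```

Masking innermost blocks first lets the same simple pattern cope with nested calls such as COALESCE(MAX(a, b), 0), at the cost of one pass over the string per parenthesized block.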

Boris van Katwijk

Using an actual SQL Parser, as suggested by @KevinEsche, is probably the most robust choice.

However, if you don't require parsing of all SQL expressions, I would just use plain old char matching: go through the string a character at a time, counting how deeply nested in the brackets you are:

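import java.util.ArrayList;
import java.util.List;

// str is the select list to split.
List<String> parts = new ArrayList<>();
int i = 0;
int depth = 0;  // Current nesting level of parentheses.
while (i < str.length()) {
  int start = i;
  while (i < str.length()) {
    char ch = str.charAt(i);
    if (ch == '(') {
      depth++;
    } else if (ch == ')') {
      depth--;
    } else if (ch == ',' && depth == 0) {
      break;  // A top-level comma ends the current part.
    }
    i++;
  }
  // Maybe check that depth == 0 here.
  parts.add(str.substring(start, i).trim());  // trim() drops the space after the comma.
  i++;  // To skip the comma.
}

Andy Turner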

Thank you for your answers. I tried to vote, but I can't yet. I used a negative lookahead pattern to solve the problem:

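// Split on a comma unless a ")" follows it before any "(".
String pattern = ",(?!([^(]*\\)))";
String str = "f.id AS id, CONCAT(a1.id, a2.id, a3.id) AS cnp, SUM(A3.nr) AS sum";
String[] strg = str.split(pattern);
for (int i = 0; i < strg.length; i++) {
    // trim() drops the space that follows each separating comma.
    System.err.println("Group " + i + " is " + strg[i].trim());
}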

And the result is:

Group 0 is f.id AS id
Group 1 is CONCAT(a1.id, a2.id, a3.id) AS cnp
Group 2 is SUM(A3.nr) AS sum
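
One limitation worth knowing: the lookahead only checks whether a ")" follows the comma before any "(", so a comma inside parentheses that is followed by a nested call is still treated as a separator. The sketch below shows this with the same pattern; the class name LookaheadSplitDemo and the second input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LookaheadSplitDemo {
    // Same pattern as above: a comma not followed by ")" before any "(".
    private static final String TOP_LEVEL_COMMA = ",(?!([^(]*\\)))";

    // Hypothetical helper: splits on the pattern and trims every piece.
    static List<String> split(String selectList) {
        List<String> parts = new ArrayList<>();
        for (String piece : selectList.split(TOP_LEVEL_COMMA)) {
            parts.add(piece.trim());
        }
        return parts;
    }

    public static void main(String[] args) {
        // Flat argument lists are handled: three items.
        System.out.println(split(
                "f.id AS id, CONCAT(a1.id, a2.id, a3.id) AS cnp, SUM(A3.nr) AS sum"));
        // A nested call after the inner comma defeats the lookahead.
        System.out.println(split("CONCAT(a1.id, UPPER(a2.id)) AS cnp, f.id AS id"));
    }
}
```

For the second input the result has three pieces, CONCAT(a1.id and UPPER(a2.id)) AS cnp and f.id AS id, because the text after the first comma reaches the "(" of UPPER before any ")". For inputs with nested calls, counting parenthesis depth or using a real parser is the safer route.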

Victorqedu

In the end, writing a SQL parser myself was too complicated, so I decided to use ANTLR4.

I used the example from https://github.com/bkiers/sqlite-parser and it works fine.

But I don't know how to extract only some parts of the query (select, joins, order...), and I can't find any examples online. Can someone show how this is done?

Thank you.

Victorqedu