1

I want to split my query string into separate statements, but none of the answers I found fits my requirement exactly.

My string looks like this:

select 1;select \\2; select 3\\;copy customer from 's3://mybucket/mydata' credentials 'aws_access_key_id=access_key\\;aws_secret_access_key=secret_key\\;master_symmetric_key=master_key'

Desired output:

select 1

select \\2

select 3\\

copy customer from 's3://mybucket/mydata' credentials 'aws_access_key_id=access_key\\;aws_secret_access_key=secret_key\\;master_symmetric_key=master_key'

I found a solution based on an escape character, but it doesn't fit my requirement:

(?<!\\);

Handling delimiter with escape characters in Java String.split() method

How can I ignore an escaped semicolon (`\\;`) inside quotes?
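To make the problem concrete, here is a minimal sketch of what that lookbehind does. It assumes the escape character is a single backslash at run time, and it uses a shortened, made-up input; the class and method names are only illustrative.

```java
class LookbehindSplitDemo {
    // Splits on every semicolon that is NOT immediately preceded by a backslash.
    // The regex is (?<!\\); written as a Java string literal.
    static String[] splitOnUnescaped(String input) {
        return input.split("(?<!\\\\);");
    }

    public static void main(String[] args) {
        // At run time this is: select 1;select \2; select 3\;copy t from 'k1\;k2'
        String input = "select 1;select \\2; select 3\\;copy t from 'k1\\;k2'";
        for (String part : splitOnUnescaped(input)) {
            System.out.println("> " + part);
        }
        // Prints three parts, not the four I want:
        // > select 1
        // > select \2
        // >  select 3\;copy t from 'k1\;k2'
    }
}
```

The `\;` after `select 3` is treated as escaped, so the `copy` statement is never split off. The lookbehind has no notion of whether the semicolon is inside quotes.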

Any help would be appreciated.

sproutee
  • You also have a `\\;` before the `copy customer`. There is no difference between that and `access_key\\;`. No matter what you use, a computer cannot distinguish these two `\\;`. – RealSkeptic Oct 26 '16 at 07:10

4 Answers

1

I think this is a solution:

String line = "select 1;select \\2; select 3\\;copy customer from 's3://mybucket/mydata' credentials 'aws_access_key_id=access_key\\;aws_secret_access_key=secret_key\\;master_symmetric_key=master_key'";
line = line.replace("\\","\\\\"); // Double each backslash so that it is not lost
String[] tokens = line.split(";(?=([^']*'[^']*')*[^']*$)"); // Split on semicolons, but not those inside quotes
for(String t : tokens) {
    System.out.println("> "+t);
}

You can test it here http://rextester.com/MLTA75734
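As a rough illustration of how the lookahead behaves, the sketch below applies the same regex to a shorter, made-up input. A semicolon matches only when the rest of the string contains an even number of single quotes, which means the semicolon is outside any quoted section. The class and method names are hypothetical.

```java
class QuoteAwareSplitDemo {
    // Splits on semicolons followed by an even number of single quotes,
    // i.e. semicolons that are not inside a '...' section.
    static String[] splitOutsideQuotes(String input) {
        return input.split(";(?=([^']*'[^']*')*[^']*$)");
    }

    public static void main(String[] args) {
        String input = "select 1;copy t from 'a;b' credentials 'k=v;x=y';select 2";
        for (String part : splitOutsideQuotes(input)) {
            System.out.println("> " + part);
        }
        // Prints:
        // > select 1
        // > copy t from 'a;b' credentials 'k=v;x=y'
        // > select 2
    }
}
```

Every semicolon inside quotes is kept, escaped or not, and the regex assumes the quotes are balanced.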

Theodore K.
  • Thank you for the help. But when a semicolon inside quotes is not escaped, I don't want it to be ignored. For example: `copy credentials 'a_key;b_key;c_key';` Desired output: `copy credentials 'a_key(\n)b_key(\n)c_key'(\n)`, where (\n) marks a split. – sproutee Oct 27 '16 at 07:15
0

You can use an external JAR such as commons-lang-2.6.jar:

String str = "select 1;select \\2; select 3\\;copy customer from 's3://mybucket/mydata' credentials 'aws_access_key_id=access_key\\;"
        + "aws_secret_access_key=secret_key\\;"
        + "master_symmetric_key=master_key'";
str = StringEscapeUtils.escapeJavaScript(str); // org.apache.commons.lang.StringEscapeUtils, from the external JAR
String[] st = str.split(";");
for (int i = 0; i < st.length; i++) {
    System.out.println(st[i]);
}

Hope it helps you...

PSabuwala
0

I tried another solution, with no regex this time. I checked it against as many weird strings as I could think of and it worked as I expected (hopefully it will work as you expect too this time). Please check it out.

       String s ="select 1;r;select \\2; select 3\\;copy customer from 's3://mybucket/mydata' credentials 'aws_access_key_id=<access-key-id>\\;aws_secret_access_key=<secret-access-key>\\;master_symmetric_key=<master-key>'";
               //"select 1;r;select \\2; select 3\\;copy customer from 'r;s3://mybucket/mydata;r' credentials 'a_key;b_key;c_key\\;r' 'aws_access_key_id=access_key\\;aws_secret_access_key=secret_key\\;master_symmetric_key=master_key'";
       s = s.replace("\\","\\\\");
       List<String> tokens = new ArrayList<String>();               
       int i = 0;    
       int j = 0;
       String backup = s;
       while (i < s.length()){
        char c  = s.charAt(i);      
          if(c==';'){
            String previous = s.substring(0,i);
            int quotesBefore = StringUtils.countMatches(backup.substring(0,j), "'");
            if(i<2 || quotesBefore==0 || (i>1 && (quotesBefore & 1) == 0 || ((quotesBefore & 1) != 0) && !(s.charAt(i-1)=='\\' && s.charAt(i-2)=='\\'))){//Even quotes before OR (odd quotes AND not \\ right before)                 
                tokens.add(previous);
                if(i>0)s=s.substring(i+1);
                i=0;
            }
          }
          i++;j++;
        }
        tokens.add(s);
        for(String t : tokens) {
            System.out.println("> "+t);
        }

Basic steps:

  1. Iterate over the string's characters.

  2. For each one, check whether it is a semicolon.

  3. If it is, take the characters before it, count the quotes, and add those characters to the list only if the count is even (the semicolon is outside quotes), or if the count is odd (inside quotes) but the semicolon is not escaped with "\\".
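The same steps can also be sketched as a single pass that toggles an "inside quotes" flag instead of recounting quotes at every semicolon. This is a simplified variation, not the code above: it assumes the escape is a single backslash at run time, only single quotes delimit quoted sections, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SemicolonSplitter {
    // Splits on ';' unless the semicolon is inside single quotes
    // AND immediately preceded by a backslash.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\'') {
                inQuotes = !inQuotes; // entering or leaving a quoted section
            }
            boolean escaped = i > 0 && input.charAt(i - 1) == '\\';
            if (c == ';' && !(inQuotes && escaped)) {
                tokens.add(current.toString()); // close the current token
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString()); // whatever follows the last split
        return tokens;
    }

    public static void main(String[] args) {
        // At run time: select 1;select \2; select 3\;copy t from 'a;b' credentials 'k1\;k2'
        String input = "select 1;select \\2; select 3\\;copy t from 'a;b' credentials 'k1\\;k2'";
        for (String token : split(input)) {
            System.out.println("> " + token);
        }
        // Prints:
        // > select 1
        // > select \2
        // >  select 3\
        // > copy t from 'a
        // > b' credentials 'k1\;k2'
    }
}
```

An unescaped semicolon inside quotes still splits, while `\;` inside quotes is kept, which is the behaviour asked for in the comment on my first answer.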
Theodore K.
0

I have used the solution below as a generic string splitter that handles quotes (' and ") and the escape character (\).

public static List<String> split(String str, final char splitChar) {
    List<String> queries = new ArrayList<>();
    int length = str.length();
    int start = 0, current = 0;
    char ch, quoteChar;
    
    while (current < length) {
        ch=str.charAt(current);
        // Handle escape char by skipping next char
        if(ch == '\\') {
            current++;
        }else if(ch == '\'' || ch=='"'){ // Handle quoted values
            quoteChar = ch;
            current++;
            while(current < length) {
                ch = str.charAt(current);
                // Handle escape char by skipping next char
                if (ch == '\\') {
                    current++;
                } else if (ch == quoteChar) {
                    break;
                }
                current++;
            }
        }else if(ch == splitChar) { // Split string
            queries.add(str.substring(start, current + 1));
            start = current + 1;
        }
        current++;
    }
    // Add last value
    if (start < current) {
        queries.add(str.substring(start));
    }
    return queries;
}

public static void main(String[] args) {

    String str = "select 1;select \\\\2; select 3\\\\;copy customer from 's3://mybucket/mydata' credentials 'aws_access_key_id=access_key\\\\;aws_secret_access_key=secret_key\\\\;master_symmetric_key=master_key'";
    List<String> queries = split(str, ';');
    System.out.println("Size: "+queries.size());
    for (String query : queries) {
        System.out.println(query);
    }
}

This produces the following result:

Size: 4
select 1;
select \\2;
 select 3\\;
copy customer from 's3://mybucket/mydata' credentials 'aws_access_key_id=access_key\\;aws_secret_access_key=secret_key\\;master_symmetric_key=master_key'
Nikunj