I am trying to split a command line into separate commands on semicolons in Java. Semicolons inside single or double quotes should be ignored; only semicolons outside quotes should act as separators.

Example command: echo 'he ; llo' ; echo 'hello;;'

I tried the following code but the command doesn't get split correctly:

String[] tokens = cmdline.split(";(?=(?:[^\"\']*\"\'[^\"\']*\"\')*[^\"\']*$)", -1);

Current incorrect splitting:

echo 'he ; llo' ; echo 'hello;;'

Expected to split into:

1) echo 'he ; llo'
2) echo 'hello;;'
stackyyflow
  • You can use the OpenCSV library like this: `CSVParser parser = new CSVParser(';', '\'');` `String[] strings = parser.parseLine(cmd);`. It will give you the desired result. – eatSleepCode Feb 21 '17 at 04:06
  • As hinted to by @eatSleepCode, this is a job for a parser, not for a regular expression. @eatSleepCode I don't think CSVParser would honour quoting of part of a field as in `echo 'hello;;'`. – Amadan Feb 21 '17 at 04:09
  • @Amadan Do you mean CSVParser won't consider `echo 'hello;;'` as one String for splitting? – eatSleepCode Feb 21 '17 at 04:27
  • Without trying I don't really know; but my intuition is that a CSV field can either be quoted (quote at start and end) or not; and if it is not, then `;` inside would be taken as a field separator. Have you tried it? – Amadan Feb 21 '17 at 04:40

5 Answers

If you only need to handle the basic case, try this:

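String str = "echo 'he;llo' ; echo 'hello;;'";

// Split on " ; " only where it directly follows a closing quote. The
// lookbehind keeps that quote in the token; a plain "' ; " would consume it.
String[] splitStrArr = str.split("(?<=') ; ");

for (int i = 0; i < splitStrArr.length; i++) {
    System.out.println((i + 1) + ")" + splitStrArr[i]);
}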

Output:

1)echo 'he;llo'
2)echo 'hello;;'

Abhishek
  • Okay but what if there is also space within the quotes? `echo 'he ; llo'` Apologies for not stating in the question – stackyyflow Feb 21 '17 at 03:59
  • Do you have some pattern like echo 'he ; llo' ; echo 'he ; llo;;' echo "blah blah ; "? – Abhishek Feb 21 '17 at 04:09
  • In that case, a really simple solution is to split the string with the regex "\' ; ", as in my edited answer. If you need more complex logic, I will have to try something else. – Abhishek Feb 21 '17 at 04:10

Try using the OpenCSV library, as below:

String cmd = "echo 'he ; llo' ; echo 'hello ; ; '";
CSVParser parser = new CSVParser(';', '\''); //params are separator, quoteChar
String[] strings = parser.parseLine(cmd);
eatSleepCode

I believe (without much evidence, I am afraid) that a split is too hard. However, it is possible to write a regexp to identify elements between semicolons, so you can find them all. A non-parser solution is this:

(?:(?:"(?:[^"\\]|\\[\\"])+")|(?:'(?:[^'\\]|\\[\\'])+')|[^'";])*

(Remember to double every backslash and escape every double quote if you put it into a Java string literal, so \\ becomes "\\\\" and " becomes \".)
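As a rough sketch of how that pattern could be driven from Java: the class and method names below (`RegexCommandFinder`, `findCommands`) are made up for illustration. Because the pattern ends in `*`, it also matches the empty string at each separating semicolon and at the end of the input, so the loop skips empty matches and trims the surrounding whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
class RegexCommandFinder {
    // The pattern from the answer, escaped for a Java string literal.
    private static final Pattern ELEMENT = Pattern.compile(
        "(?:(?:\"(?:[^\"\\\\]|\\\\[\\\\\"])+\")|(?:'(?:[^'\\\\]|\\\\[\\\\'])+')|[^'\";])*");

    static List<String> findCommands(String cmdline) {
        List<String> commands = new ArrayList<>();
        Matcher m = ELEMENT.matcher(cmdline);
        while (m.find()) {
            String command = m.group().trim();
            // The trailing * yields empty matches at separators; drop them
            // (this also drops whitespace-only commands).
            if (!command.isEmpty()) {
                commands.add(command);
            }
        }
        return commands;
    }

    public static void main(String[] args) {
        for (String command : findCommands("echo 'he ; llo' ; echo 'hello;;'")) {
            System.out.println(command);
        }
    }
}
```

Note that the quoted alternatives use `+`, so an empty pair of quotes (`''` or `""`) is not recognised as a quoted string by this pattern.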

However, I still maintain that for complex languages like this, a parser would be easier.
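To illustrate what a parser-style approach might look like, here is a minimal sketch of a hand-written scanner that tracks whether it is inside a quote. The names `QuoteAwareSplitter` and `split` are hypothetical, and the sketch makes simplifying assumptions: backslash escapes are not handled, and an unterminated quote simply runs to the end of the input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not from the question or any library.
class QuoteAwareSplitter {
    static List<String> split(String cmdline) {
        List<String> commands = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 = outside quotes, otherwise the quote character that is open

        for (int i = 0; i < cmdline.length(); i++) {
            char c = cmdline.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;               // matching quote closes the quoted run
                }
                current.append(c);           // everything inside quotes is kept verbatim
            } else if (c == '\'' || c == '"') {
                quote = c;                   // opening quote
                current.append(c);
            } else if (c == ';') {
                commands.add(current.toString().trim()); // unquoted ';' ends a command
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        commands.add(current.toString().trim()); // whatever follows the last separator
        return commands;
    }

    public static void main(String[] args) {
        for (String command : split("echo 'he ; llo' ; echo 'hello;;'")) {
            System.out.println(command);
        }
    }
}
```

Unlike the regex, this makes it easy to add rules later (escapes, error reporting for unclosed quotes) without rewriting the whole expression. A trailing semicolon produces a final empty command, much like `split` with a limit of -1.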

Amadan

Use the following regex to do the splitting:

;(?=([^']*'[^']*')*[^']*$)

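It ignores any ; that lies between a pair of ' characters, because the lookahead only accepts a semicolon that is followed by an even number of single quotes. The resulting array will be: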

[echo 'he;llo' ,  echo 'hello;;']

Thanks to -> https://stackoverflow.com/a/6464500/3641067
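A minimal sketch of how this regex might be wired into `String.split`, using the question's input. The class and method names (`LookaheadSplitDemo`, `splitOutsideSingleQuotes`) are invented for illustration, and the trimming step is an addition, since the split itself leaves the spaces around the semicolon in place. This version only understands single quotes; double quotes, or an apostrophe inside a double-quoted string, are not handled.

```java
import java.util.Arrays;

// Hypothetical demo class for illustration.
class LookaheadSplitDemo {
    // Splits on each ';' that is followed by an even number of single quotes,
    // i.e. a ';' that is not inside a single-quoted string.
    static String[] splitOutsideSingleQuotes(String cmdline) {
        String[] tokens = cmdline.split(";(?=([^']*'[^']*')*[^']*$)", -1);
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = tokens[i].trim(); // drop the spaces left around the separator
        }
        return tokens;
    }

    public static void main(String[] args) {
        String cmdline = "echo 'he ; llo' ; echo 'hello;;'";
        System.out.println(Arrays.toString(splitOutsideSingleQuotes(cmdline)));
    }
}
```

The lookahead rescans the rest of the string for every semicolon, so this is probably best kept to short command lines.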

buræquete
  • @stackyyflow if this answer was useful, and if you think it is the correct solution, please accept as the correct answer, thanks! – buræquete Feb 27 '17 at 05:51
This regex matches each command: a run of unquoted text and single-quoted strings that stops at a semicolon outside single quotes.

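import java.util.regex.Matcher;
import java.util.regex.Pattern;

String stringToSearch = "echo 'he ; llo' ; echo 'hello;;'";

// Each match is a run of unquoted non-semicolon text and/or quoted strings;
// '[^']*' also accepts an empty pair of quotes.
Pattern p1 = Pattern.compile("(?:[^';]+|'[^']*')+");
Matcher m = p1.matcher(stringToSearch);
while (m.find())
{
    // trim() drops the spaces left around the separating semicolon
    System.out.println(m.group().trim());
}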

Output:

echo 'he ; llo'
echo 'hello;;'
Avaneesh