
How can I use a regular expression with String.split() to separate values by spaces while ignoring spaces inside double-quoted text?

For example, this input:

hello "Luis Anderson" your age is 30 and u will get $30

should become this list of strings:

'hello', '"Luis Anderson"', 'your', 'age', 'is', '30', 'and', 'u', 'will', 'get', '$30'

The problem is that String.split() also splits on the space inside the quoted phrase "Luis Anderson", breaking it into 2 strings.

If you have any other ideas that do not involve regular expressions, please explain them. Thanks.

SIMILAR QUESTION: how to split string by space but escape spaces inside quotes (in java)?

user1768615
  • I'm not very good with regular expressions, but I'm trying to read a line like the one in the example and split it into tokens. The problem is that splitting by space takes every space, and I don't want to count the spaces inside a quoted "phrase". My regex attempt: "/\"[\w]+\"\" – user1768615 Jun 01 '13 at 16:34
  • See http://stackoverflow.com/questions/8945113/how-to-split-string-by-space-but-escape-spaces-inside-quotes-in-java – Thihara Jun 01 '13 at 16:36
  • @Thihara, thank you. I did search for it but didn't find it anywhere, maybe because of how the question was worded. Thanks. – user1768615 Jun 01 '13 at 16:39
  • @user1768615 In the question linked by Thihara, the OP was trying to split on spaces surrounded only by quotation marks, like `" "`. Your question seems to be different. – Pshemo Jun 01 '13 at 17:01
  • @Pshemo, but it is very similar and I don't want to spam Stack Overflow with a similar question. If you consider it different I will edit it; just confirm, please. – user1768615 Jun 01 '13 at 17:09
  • @user1768615 In my opinion your question is different, but if you are able to create a solution using that question then I don't mind leaving your edit as it is :) – Pshemo Jun 01 '13 at 17:16
  • @Pshemo, alright, so I'll take the edited part off; maybe it will be useful for somebody. – user1768615 Jun 01 '13 at 17:24
  • You could use a csv parser using space as a separator. – assylias Jun 01 '13 at 17:40

2 Answers


If it doesn't have to be a regex, you can do it in a single pass over the string's characters.

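import java.util.ArrayList;
import java.util.List;

String data = "hello \"Luis Anderson\" your age is 30 and u will get $30";

List<String> tokens = new ArrayList<String>();
StringBuilder sb = new StringBuilder();
boolean insideQuote = false;

for (char c : data.toCharArray()) {
    // A double quote toggles the "inside quoted text" state.
    if (c == '"') {
        insideQuote = !insideQuote;
    }
    if (c == ' ' && !insideQuote) {
        // A space outside quotes ends the current token.
        tokens.add(sb.toString());
        sb.setLength(0);
    } else {
        // Everything else, including quotes and quoted spaces, is kept.
        sb.append(c);
    }
}
tokens.add(sb.toString()); // last word

System.out.println(tokens);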

output: [hello, "Luis Anderson", your, age, is, 30, and, u, will, get, $30]
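
The same loop can be wrapped in a reusable method. The following is only a sketch of one possible variation, not the answer's original code: the names QuoteAwareSplitter and splitOutsideQuotes are hypothetical, and it assumes that runs of consecutive spaces should not produce empty tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteAwareSplitter {

    // Hypothetical helper: splits on spaces that are outside double quotes.
    public static List<String> splitOutsideQuotes(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean insideQuote = false;

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                insideQuote = !insideQuote; // toggle on every quote character
            }
            if (c == ' ' && !insideQuote) {
                // Assumption: consecutive spaces should not yield empty tokens.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // last token
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Note the two spaces before "your".
        String data = "hello \"Luis Anderson\"  your age is 30";
        System.out.println(splitOutsideQuotes(data));
    }
}
```

Like the original loop, this treats every double quote as a toggle, so it does not handle escaped or nested quotes.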

Pshemo
  • This is what I would have done, but only because my regex-fu is not as strong as Kent's. – Edward Falk Jun 01 '13 at 17:28
  • Ehm, this is a nice piece of code, but I'm wondering if it's possible to do it with a regular expression, to save space and make it more elegant. – user1768615 Jun 01 '13 at 18:02
  • @user1768615 If there are no nested quotes like `a "b "c d" e" f` then it is possible. You can try this way `split("\\s(?!(\\S+\\s+)*\\S+\")|\\s(?=\")")`. If nested quotes are possible (at any level) then regex is not the best tool for this. – Pshemo Jun 01 '13 at 18:24
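
For comparison, a split pattern often used for this kind of task accepts a run of whitespace only if an even number of double quotes follows it up to the end of the input. This is a different pattern from the one in the comment above, shown as a sketch under the assumption that quotes are balanced and never nested or escaped. The class name SplitOutsideQuotesDemo and the method name tokenize are hypothetical.

```java
import java.util.Arrays;

public class SplitOutsideQuotesDemo {

    // Whitespace followed by an even number of quotes up to the end of input,
    // i.e. whitespace that is not inside a quoted phrase.
    static final String OUTSIDE_QUOTES = "\\s+(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    public static String[] tokenize(String input) {
        return input.split(OUTSIDE_QUOTES);
    }

    public static void main(String[] args) {
        String data = "hello \"Luis Anderson\" your age is 30 and u will get $30";
        // prints [hello, "Luis Anderson", your, age, is, 30, and, u, will, get, $30]
        System.out.println(Arrays.toString(tokenize(data)));
    }
}
```

The lookahead rescans the rest of the input at every whitespace run, so on long strings the single-pass loop is likely the cheaper option.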
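import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "hello \"Luis Anderson\" your age is 30 and u will get $30";
// A token is either a double-quoted phrase or a run of non-space characters,
// delimited by whitespace or by the start/end of the input.
Pattern p = Pattern.compile("(?<=\\s|^)(\".*?\"|\\S*)(?=$|\\s)");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1));
}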

outputs:

hello
"Luis Anderson"
your
age
is
30
and
u
will
get
$30

You can collect the matched tokens into an array, a List, or whatever you need.
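
As a rough illustration of collecting the matches instead of printing them, the sketch below reuses the pattern from this answer and stores each token in a List. The names MatchCollector and collectTokens are hypothetical, and skipping empty matches is an added assumption (the `\S*` branch can match an empty string between consecutive spaces).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchCollector {

    // Same pattern as in the answer above.
    private static final Pattern TOKEN =
            Pattern.compile("(?<=\\s|^)(\".*?\"|\\S*)(?=$|\\s)");

    // Hypothetical helper: returns every token found in the input, in order.
    public static List<String> collectTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String token = m.group(1);
            // Assumption: empty matches are not wanted as tokens.
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String s = "hello \"Luis Anderson\" your age is 30 and u will get $30";
        List<String> tokens = collectTokens(s);
        System.out.println(tokens.size()); // prints 11
        System.out.println(tokens.get(1)); // prints "Luis Anderson" (with the quotes)
    }
}
```

If an array is needed, `tokens.toArray(new String[0])` converts the result.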

Kent