I can't get this to work.

I have a String which I want to split on spaces. However, I do not want to split inside string literals, that is, text enclosed in double or single quotes.

Example

Splitting the following string:

private String words = " Hello, today is nice " ;

should produce the following tokens:

 private
 String
 words
 =
 " Hello, today is nice "
 ;

What kind of regex can I use for this?

jpaw
  • Shouldn't this work? "[^\\s\"']+|\"[^\"]*\"|'[^']*'" – jpaw Apr 03 '12 at 14:02
  • Duplicate of [this](http://stackoverflow.com/questions/366202/regex-for-splitting-a-string-using-space-when-not-surrounded-by-single-or-double) – Boosty Apr 03 '12 at 14:09
  • I was looking at it but thought it was different. Now I realize it's the same question. Sorry! – jpaw Apr 03 '12 at 15:23

2 Answers

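The regex ([^ "]+)|("[^"]*") should match all the tokens. The + (rather than *) keeps the first alternative from matching the empty string, which would otherwise make find() return empty matches. Drawing on my limited knowledge of Java and http://www.regular-expressions.info/java.html, you should be able to do something like this:

// Please excuse any syntax errors, I'm used to C#
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Either a run of characters that are neither spaces nor double quotes, or a double-quoted string
Pattern pattern = Pattern.compile("([^ \"]+)|(\"[^\"]*\")");
Matcher matcher = pattern.matcher(theString);
while (matcher.find())
{
    // do something with matcher.group();
}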
Kendall Frey

Have you tried this?

((['"]).*?\2|\S+)

Here is what it does:

(         <= Group everything
  (['"])  <= Find a single or double quote
  .*?     <= Match everything after the quote (non-greedy)
  \2      <= Find the same kind of quote that opened the string (backreference to group 2)
  |       <= Or
  \S+     <= Non space characters (one at least)
)
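
As a rough sketch of how this pattern could be applied in Java, the example below iterates over matches with Matcher.find() instead of calling String.split(), since split() treats the regex as the delimiter and would throw the matched tokens away. The class name QuoteAwareTokenizer and the method tokenize are hypothetical names chosen for illustration; only the regex comes from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
public class QuoteAwareTokenizer {
    // Group 2 captures the opening quote; \2 requires the same quote to close it.
    // The second alternative matches any run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("((['\"]).*?\\2|\\S+)");

    // Collects every match of TOKEN in the input, in order of appearance.
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "private String words = \" Hello, today is nice \" ;";
        for (String token : tokenize(line)) {
            // Brackets make leading and trailing spaces inside quotes visible.
            System.out.println("[" + token + "]");
        }
    }
}
```

For the example string from the question this yields six tokens, with the quoted text kept as a single token including its quotes. The sketch does not handle escaped quotes inside a literal, and a quote that appears in the middle of a non-space run (such as words="a b") is swallowed by the \S+ alternative, so it is only suitable for input shaped like the example.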

On another note, if you want to create a parser, write a real parser instead of using regexes.

Colin Hebert
  • Tried this, but it doesn't extract any tokens at all for some reason. Perhaps it's not suitable for the split method? String[] tokens = myString.get(x).split("((['\"]).*?\\2|\\S+)"); – jpaw Apr 04 '12 at 08:13