I am trying to split a string in JavaScript on whitespace while ignoring whitespace enclosed in quotes. I found this regular expression: (/\w+|"[^"]+"/g)
The problem is that it doesn't work with accented characters such as á. How should I change the regular expression to make it work?

m3div0
- Can the string include quotes nested within quotes? If so, regex may not be the way to go. See this previous answer: http://stackoverflow.com/questions/133601/can-regular-expressions-be-used-to-match-nested-patterns – Tim Goodman Sep 23 '12 at 14:26
- No, the quotes are only used to mark words that shouldn't be split; the only problem is with accented chars. – m3div0 Sep 23 '12 at 14:29
- @david, are you using `split` or `exec`? If you're using the former, that regular expression is not what you want; in that case you should use the latter. – Alexander Sep 23 '12 at 14:38
3 Answers
That's because \w only matches [A-Za-z0-9_]. To match accented characters, add the range \x81-\xFF, which covers Latin-1 characters such as à and ã (accented letters outside Latin-1, such as č, are not covered):
(/[\w\x81-\xFF]+|"[^"]+"/g)
There's also this site, which is very helpful to build the required unicode block range.
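
To make the difference concrete, here is a minimal sketch that runs the original pattern and the extended one against the same string. The sample input is made up for illustration. The third pattern is not from this answer: it swaps in Unicode property escapes (`\p{L}`, `\p{N}` with the `u` flag), which assumes an engine with ES2018 support.

```javascript
// Hypothetical sample input: málaga "são paulo" čaj
// (written with escapes so the code points are unambiguous)
const input = 'm\u00e1laga "s\u00e3o paulo" \u010daj';

// \w is ASCII-only in JavaScript, so accented words get cut apart.
const asciiOnly = input.match(/\w+|"[^"]+"/g);
// → ['m', 'laga', '"são paulo"', 'aj']

// Adding \x81-\xFF keeps Latin-1 letters such as á and ã,
// but č (U+010D) lies outside that range and is still dropped.
const latin1 = input.match(/[\w\x81-\xFF]+|"[^"]+"/g);
// → ['málaga', '"são paulo"', 'aj']

// Alternative, not from the answer: Unicode property escapes.
// Assumes ES2018 support; \p{L} is any letter, \p{N} any number.
const unicodeAware = input.match(/[\p{L}\p{N}_]+|"[^"]+"/gu);
// → ['málaga', '"são paulo"', 'čaj']

console.log(asciiOnly, latin1, unicodeAware);
```

If the input only ever contains Western European text, the Latin-1 range may be enough; the property-escape version is one option when other scripts can appear.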

João Silva
This matches non-spaces that don't contain quotes, and matches text between quotes:
/[^\s"]+|"[^"]+"/g
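
As a rough usage sketch, the pattern can be applied with `String.prototype.match`. The `tokenize` helper name and the quote-stripping step are assumptions added for illustration, not part of this answer.

```javascript
// Hypothetical helper; the name and the quote stripping are illustrative only.
function tokenize(text) {
  // Either a run of characters that are neither whitespace nor quotes,
  // or a double-quoted phrase. match() returns null when nothing matches.
  const tokens = text.match(/[^\s"]+|"[^"]+"/g) || [];
  // Drop the surrounding quotes from quoted tokens.
  return tokens.map((t) => t.replace(/^"|"$/g, ''));
}

// Sample input: café "foo bar" čaj
console.log(tokenize('caf\u00e9 "foo bar" \u010daj'));
// → ['café', 'foo bar', 'čaj']
```

Because the character class is defined by what it excludes, accented letters need no special handling here. Note that an empty pair of quotes (`""`) is skipped, since `"[^"]+"` requires at least one character between the quotes.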

Tim Goodman
If you want to match all non-whitespace characters instead of only alphanumeric ones, replace \w with \S.

Bergi
- If the string contains `"foo bar"`, this will separately match `"foo` and `bar"`, whereas I think he'd want to match `"foo bar"`. I used `[^\s"]` in my answer to avoid this. – Tim Goodman Sep 23 '12 at 14:49
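
The behaviour described in that comment can be checked with a short sketch. The sample string is made up, and the third pattern (quoted alternative listed first) is an extra variation not proposed anywhere in the thread.

```javascript
// Hypothetical sample input containing one quoted phrase.
const sample = 'one "foo bar" two';

// With \S first, the opening quote is consumed as part of a non-space run,
// so the quoted alternative never gets a chance to match.
const withS = sample.match(/\S+|"[^"]+"/g);
// → ['one', '"foo', 'bar"', 'two']

// Excluding quotes from the first alternative keeps the phrase together.
const withNegatedClass = sample.match(/[^\s"]+|"[^"]+"/g);
// → ['one', '"foo bar"', 'two']

// Variation (an assumption, not from the thread): try the quoted form first.
const quotedFirst = sample.match(/"[^"]+"|\S+/g);
// → ['one', '"foo bar"', 'two']

console.log(withS, withNegatedClass, quotedFirst);
```

Alternation in JavaScript regular expressions tries the branches left to right, which is why the order of the two alternatives matters for the `\S` version.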