
I have the following string:

Hello world!!!

or

Hello world:)

I want to split this string into an array of strings containing Hello, world, !, !, ! or Hello, world, :)

The problem is that if there were spaces between all the parts I could use split(" "), but here the !!! or :) is attached directly to the word.

I also tried this code:

String text = "But I know. For example, the word \"can\'t\" should";

// Splits on runs of punctuation or whitespace, so the delimiters themselves are discarded.
String[] res = text.split("[\\p{Punct}\\s]+");
System.out.println(res.length);
for (String s : res) {
    System.out.println(s);
}

I found it here, but it does not really help in my case: Splitting strings through regular expressions by punctuation and whitespace etc in java

Can anyone help?

HMdeveloper

1 Answer


It seems to me that you do not want to split but rather to capture certain tokens. The thing about String.split is that it discards the parts you split on (if you split on spaces, there are no spaces in the output array), so if you split on "!" you won't get the "!" characters in your output. Something like this should capture the things you are interested in:

/(\w+)|(!)|(:\))/g

regex101
Mind that you don't use String.split with it; instead, run the regex against your string repeatedly in whatever engine/language you are using. In Java it would be something like:

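import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "Hello world!!!:)";

// Backslashes must be doubled inside a Java string literal.
Pattern p = Pattern.compile("(\\w+)|(!)|(:\\))");
Matcher m = p.matcher(input);

// Collect each successive match; anything the pattern does not match (e.g. spaces) is skipped.
List<String> matches = new ArrayList<>();
while (m.find()) {
    matches.add(m.group());
}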

Your matches list will contain:

["Hello", "world", "!", "!", "!", ":)"]
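
For reference, the snippet above can be wrapped in a small self-contained class and run against the two inputs from the question. This is only a sketch: the class name PunctuationTokenizer and the method tokenize are made up for illustration, and the pattern only knows about words, "!" and ":)", so any other punctuation would be silently skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not part of the original answer.
final class PunctuationTokenizer {

    // A word, a single "!", or the ":)" smiley. Backslashes are doubled for the Java literal.
    private static final Pattern TOKEN = Pattern.compile("(\\w+)|(!)|(:\\))");

    // Returns every token the pattern finds, in order; unmatched characters such as spaces are skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello world!!!")); // [Hello, world, !, !, !]
        System.out.println(tokenize("Hello world:)"));  // [Hello, world, :)]
    }
}
```

If more punctuation needs to be kept, one option would be to add further alternatives to the pattern, listing longer tokens such as ":)" before any shorter ones that share a prefix.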
Daniel Gruszczyk
  • Perfect, just one question: where in the above code did you split on spaces? It works well with spaces too. – HMdeveloper Sep 02 '15 at 15:11
  • You don't have to split on spaces; you only "capture" what you are interested in, such as words, ! or :) – Daniel Gruszczyk Sep 02 '15 at 15:13
  • But between Hello and world there is a space, and I want Hello and world separated as well, which the program does perfectly, so there must be something in your regex that does that – HMdeveloper Sep 02 '15 at 15:15
  • Nope, my regex has 3 groups: (\w+), which matches whole words; (!), which matches any single "!"; and (:\)), which matches any ":)". I do not split; rather, I say "find me the next one". So the loop iterates 6 times: the first time it finds "Hello", the second time "world", etc. You see the trick now? Any spaces are just skipped :) – Daniel Gruszczyk Sep 02 '15 at 15:19
  • Many likeeeeeeeeeeeeeeeees :) – HMdeveloper Sep 02 '15 at 15:24
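
The "find me the next one" behaviour discussed in the comments can be made visible by recording where each match starts and ends and which of the three groups produced it. The following is a rough sketch; the class FindTrace, the method trace and the labels "word", "bang" and "smiley" are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; all names here are illustrative only.
final class FindTrace {

    private static final Pattern TOKEN = Pattern.compile("(\\w+)|(!)|(:\\))");

    // Describes each match as: kind 'text' [start,end)
    static List<String> trace(String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Only one alternative matches at a time, so the other groups are null.
            String kind = m.group(1) != null ? "word"
                        : m.group(2) != null ? "bang"
                        : "smiley";
            lines.add(kind + " '" + m.group() + "' [" + m.start() + "," + m.end() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : trace("Hello world!!!:)")) {
            System.out.println(line);
        }
    }
}
```

For "Hello world!!!:)" the first two entries cover offsets [0,5) and [6,11): index 5, the space, belongs to no match, which is why no explicit split on whitespace is needed.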