
How can I split a string based on a string and have the resulting array contain the separators as well?

Example:

If my string = "Hello how are you.Are you fine?.How old are you?"

And I want to split on the string "you", then the result I want is an array with the items { "Hello how are", "you", ".Are", "you", "fine?.How old are", "you", "?" }.

How can I get a result like this? I tried String.Split and

string[] substrings = Regex.Split(source, stringSeparators);

But both return an array that does not contain the occurrences of "you".

Also, I want to split only on the whole word you. I don't want to split if you is a part of some other words. For example, in the case Hello you are so young, I want the result as { "Hello", "you", "so young" }. I don't want to split the word young to { "you", "ng" }.

ErikE
Sebastian
    possible duplicate of [C# split string but keep split chars / separators](http://stackoverflow.com/questions/521146/c-sharp-split-string-but-keep-split-chars-separators) – Ondrej Janacek Dec 03 '14 at 06:28
  • @Ondrej It's not a duplicate. I want to split on whole words only. See the edit. – Sebastian Dec 03 '14 at 06:35

4 Answers


You can put the separator into a capturing group; the captured text then becomes part of the result array:

string[] substrings = System.Text.RegularExpressions.Regex.Split(source, "(you)");

The output would be:

"Hello how are ", 
"you" ,
".Are ",
"you",
" fine?.How old are ",
"you",
"?"

Update regarding your additional question: Use word-boundaries around the keyword:

string[] substrings = System.Text.RegularExpressions.Regex.Split(source, "\\b(you)\\b");
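
As a minimal sketch of the two ideas above (capturing group plus word boundaries), the following wraps the call in a small helper. The class name `KeepSeparatorDemo` and the method `SplitOnWord` are hypothetical names chosen for illustration; the verbatim string (`@"..."`) is used so that `\b` reaches the regex engine as a word boundary rather than being read by the C# compiler as a backspace character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeepSeparatorDemo
{
    // Hypothetical helper: splits on a whole word and keeps the word in the result.
    public static string[] SplitOnWord(string source, string word)
    {
        // The parentheses form a capturing group, so Regex.Split includes
        // the matched separator in the returned array.
        // Regex.Escape guards against regex metacharacters in the word.
        string pattern = @"\b(" + Regex.Escape(word) + @")\b";
        return Regex.Split(source, pattern);
    }

    public static void Main()
    {
        foreach (string part in SplitOnWord("Hello you are so young", "you"))
        {
            Console.WriteLine("[" + part + "]");
        }
        // Prints:
        // [Hello ]
        // [you]
        // [ are so young]
    }
}
```

The word "young" is left intact because there is no word boundary between "you" and "ng". Surrounding spaces stay attached to the neighbouring pieces.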
dognose
  • Also, I want to split only on the word "you". I don't want to split if "you" is part of another word. Example: Hello you are so young. In this case I want the result "Hello", "you", "so young". I don't want to split the word young into "you" and "ng". – Sebastian Dec 03 '14 at 06:36
  • i.e. I want to split on "you", " you ", " you" or "you " (without spaces, with spaces on both sides, or with a space on the left or right only). – Sebastian Dec 03 '14 at 06:37
  • `string[] substrings = System.Text.RegularExpressions.Regex.Split("Hello how are you.Are you fine?.How old are you?.Hello how old are you?.Hello you are so young", "(you)"); int s = substrings.Length; substrings = System.Text.RegularExpressions.Regex.Split("Hello how are you.Are you fine?.How old are you?.Hello how old are you?.Hello you are so young", "\b(you)\b"); s = substrings.Length;` I tried the suggestion. The first one gives some result, but the second one does not return anything. – Sebastian Dec 03 '14 at 07:04
  • @JMat My bad, you of course need to escape the `\b` in the pattern; see the edit. – dognose Dec 03 '14 at 07:17
  • Yes, it's perfect now. Just one more case to handle: case sensitivity. How can I achieve that? Right now, if "You" is present in the string it is skipped, since the Y is a capital. – Sebastian Dec 03 '14 at 07:37
  • @JMat Without `RegexOptions.IgnoreCase` as the third parameter, the split is case-sensitive; pass that option if you also want to match `You`. – dognose Dec 03 '14 at 08:32
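
The two follow-up requests in the comments (ignore surrounding spaces, ignore case) can be combined in one pattern. This is only a sketch under some assumptions: the class `TrimmedSplitDemo` and method `SplitTrimmed` are hypothetical names, `\s*` on both sides is one possible way to absorb the whitespace, and empty entries are filtered out because `Regex.Split` returns an empty string when a match sits at the very start or end of the input.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TrimmedSplitDemo
{
    // Hypothetical helper: whole-word, case-insensitive split that keeps the
    // separator and drops the whitespace around it.
    public static string[] SplitTrimmed(string source, string word)
    {
        // \s* sits outside the capturing group, so the whitespace is consumed
        // by the match but not returned; only the word itself is captured.
        string pattern = @"\s*\b(" + Regex.Escape(word) + @")\b\s*";
        return Regex.Split(source, pattern, RegexOptions.IgnoreCase)
                    .Where(part => part.Length > 0) // drop empty leading/trailing entries
                    .ToArray();
    }

    public static void Main()
    {
        foreach (string part in SplitTrimmed("Hello You are so young, you", "you"))
        {
            Console.WriteLine("[" + part + "]");
        }
        // Prints:
        // [Hello]
        // [You]
        // [are so young,]
        // [you]
    }
}
```

The captured separator keeps its original casing ("You" versus "you"), since `IgnoreCase` only affects matching.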
\b(you)\b

Split on this pattern (written as @"\b(you)\b" or "\\b(you)\\b" in C# source) and you get the result you want.

vks

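Use a regex replace:

(you) with |$1|

(.NET uses $1, not \1, to refer to the group in the replacement string.) Now you will have a string like this:

Hello how are |you|.Are |you| fine?.How old are |you|?

Now you can simply split on |, provided | does not already occur in the source.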

Hope that helps
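
A rough sketch of this replace-then-split idea in C# follows. The names `MarkerSplitDemo` and `SplitViaMarker` are hypothetical, the word boundaries are borrowed from the other answers rather than from this one, and the approach assumes the marker character `|` never appears in the input (the sketch throws if it does).

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkerSplitDemo
{
    // Hypothetical helper: wraps each whole-word match in '|' markers,
    // then splits on the marker.
    public static string[] SplitViaMarker(string source, string word)
    {
        // Assumption: '|' is not part of the data; otherwise the split is ambiguous.
        if (source.IndexOf('|') >= 0)
        {
            throw new ArgumentException("source must not contain the marker '|'");
        }

        string pattern = @"\b(" + Regex.Escape(word) + @")\b";
        // $1 refers to the first capturing group in a .NET replacement string.
        string marked = Regex.Replace(source, pattern, "|$1|");
        return marked.Split('|');
    }

    public static void Main()
    {
        string source = "Hello how are you.Are you fine?.How old are you?";
        foreach (string part in SplitViaMarker(source, "you"))
        {
            Console.WriteLine("[" + part + "]");
        }
        // Prints:
        // [Hello how are ]
        // [you]
        // [.Are ]
        // [you]
        // [ fine?.How old are ]
        // [you]
        // [?]
    }
}
```

Compared with a capturing group in `Regex.Split`, this takes two passes over the string and depends on a free marker character, so it is mainly useful where the split function cannot return captured groups.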

aelor

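string[] substrings = System.Text.RegularExpressions.Regex.Split(source, @"\s*you\s*");

This splits on "you" and drops the whitespace around it. Note that the separators themselves are not included in the result. Below is the output.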

"Hello how are"

".Are"

"fine?.How old are"

"?"

praveen.upadhyay