
I have a string in which all words are separated by these characters:

{ ' ', '.', ',', '!', '?', ':', '-','\r','\n' };

Words can be separated by more than one separator. I tried replacing all separators with spaces, adding a space to the start and end of the text, and then searching in a loop with

IndexOf(" "+word+" ",i+word.Length)

but I think there must be faster ways to do it.

Daxak
  • [`String.IndexOfAny`](https://learn.microsoft.com/en-us/dotnet/api/system.string.indexofany) maybe? – Uwe Keim Jul 24 '21 at 18:19
  • I smell an [XY problem](https://xyproblem.info/) here. Q: What are you *REALLY* trying to accomplish? What do you want to do once you've found all the delimiters in your string? Is this just a simple [parsing](https://stackoverflow.com/q/858756/421195) problem? You might want to consider using [String.Split()](https://learn.microsoft.com/en-us/dotnet/csharp/how-to/parse-strings-using-split), or a [regex](https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions). – paulsm4 Jul 24 '21 at 18:21
  • Side note: the only thing you can achieve by reposting the same downvoted question again and again is to get a question ban on the account. At the very least, you should read the comments on the previous versions of the same post and answer them in this one. – Alexei Levenkov Jul 24 '21 at 18:37

1 Answer

You can search for words using regex without having to specify all possible separators.

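using System;
using System.Text.RegularExpressions;

string input = "This, is? a test-word!\r\nanother line.";
// \w+ matches each run of word characters, so no separator list is needed.
var matches = Regex.Matches(input, @"\w+");
foreach (Match m in matches) {
    Console.WriteLine($"\"{m.Value}\" at {m.Index}, length {m.Length}");
}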

prints:

"This" at 0, length 4
"is" at 6, length 2
"a" at 10, length 1
"test" at 12, length 4
"word" at 17, length 4
"another" at 24, length 7
"line" at 32, length 4

The expression \w+ specifies a sequence of one or more word characters. This includes letters, digits and the underscore. See Word Character: \w for a detailed description of \w.
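If the goal is to find every position of one particular word, as the `IndexOf` loop in the question suggests, the same idea can be narrowed to that word. The following is only a sketch: `WordSearch` and `FindWord` are hypothetical names, not part of any library, and it assumes the searched word starts and ends with a word character so that the `\b` word-boundary anchors apply. Note that `\b` treats any non-word character as a boundary, not only the separators listed in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class WordSearch
{
    // Hypothetical helper: returns the start index of every whole-word
    // occurrence of 'word' in 'text'.
    public static List<int> FindWord(string text, string word)
    {
        var positions = new List<int>();

        // Regex.Escape protects against regex metacharacters in the word.
        // \b matches between a \w character and a non-\w character
        // (or the start/end of the string).
        string pattern = @"\b" + Regex.Escape(word) + @"\b";

        foreach (Match m in Regex.Matches(text, pattern))
        {
            positions.Add(m.Index);
        }
        return positions;
    }

    static void Main()
    {
        // "this" and "island" contain "is" but are not whole-word matches.
        var positions = FindWord("is this, is? an island-is", "is");
        Console.WriteLine(string.Join(", ", positions)); // prints: 0, 9, 23
    }
}
```

Because digits and the underscore count as word characters, `is_1` would not be reported as an occurrence of `is` under this approach.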


You can replace all (possibly multiple) separators with single spaces like this:

char[] separators = new char[] { ' ', '.', ',', '!', '?', ':', '-', '\r', '\n' };
var words = input.Split(separators, StringSplitOptions.RemoveEmptyEntries);
string result = String.Join(" ", words);
Console.WriteLine(result);

prints

This is a test word another line

The StringSplitOptions.RemoveEmptyEntries parameter ensures that sequences of multiple separators are treated like one separator.
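To see what that option changes, the sketch below splits the same text with and without it. `SplitDemo`, `SplitKeepEmpty`, and `SplitWords` are hypothetical names used only for illustration; the separator list is the one from the question.

```csharp
using System;

static class SplitDemo
{
    // Separator characters taken from the question.
    static readonly char[] Separators =
        { ' ', '.', ',', '!', '?', ':', '-', '\r', '\n' };

    // Default behaviour: adjacent separators produce empty entries.
    public static string[] SplitKeepEmpty(string text) =>
        text.Split(Separators);

    // RemoveEmptyEntries drops those empty entries, leaving only the words.
    public static string[] SplitWords(string text) =>
        text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);

    static void Main()
    {
        string sample = "a, b!?c";
        Console.WriteLine(SplitKeepEmpty(sample).Length); // prints: 5
        Console.WriteLine(SplitWords(sample).Length);     // prints: 3
    }
}
```

In the sample, the empty entries come from the gap between `,` and the space and from the gap between `!` and `?`.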

Olivier Jacot-Descombes