90

I need a way to have this:

"test, and test but not testing.  But yes to test".Replace("test", "text")

return this:

"text, and text but not testing.  But yes to text"

Basically I want to replace whole words, but not partial matches.

NOTE: I am going to have to use VB for this (SSRS 2008 code), but C# is my normal language, so responses in either are fine.

Vaccano
  • 1
    This is duplicated here I think: http://stackoverflow.com/questions/1209049/regex-match-whole-words – James Michael Hare May 26 '11 at 18:58
  • I guess the easiest way (possibly not the best way) would be to add a space at the beginning and end of the search term, for example, to replace whole words, search for: " drown " so it won't replace things such as " drowning ". – jay_t55 May 20 '13 at 22:05
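
A minimal sketch of the space-padding idea from the comment above, run against the question's own input. The class and helper names are hypothetical and exist only for illustration. It shows the limitation: occurrences next to punctuation or at the start or end of the string are missed.

```csharp
using System;

public static class SpacePaddingDemo
{
    // Hypothetical helper: replaces the word only where it has a space on both sides.
    public static string ReplaceSpacePadded(string input, string word, string replacement)
    {
        return input.Replace(" " + word + " ", " " + replacement + " ");
    }

    public static void Main()
    {
        string input = "test, and test but not testing.  But yes to test";
        Console.WriteLine(ReplaceSpacePadded(input, "test", "text"));
        // prints: test, and text but not testing.  But yes to test
    }
}
```

Only the middle "test" is replaced; "test," at the start and "test" at the end are left alone because neither is surrounded by spaces.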

8 Answers

155

A regex is the easiest approach:

string input = "test, and test but not testing.  But yes to test";
string pattern = @"\btest\b";
string replace = "text";
string result = Regex.Replace(input, pattern, replace);
Console.WriteLine(result);

The important part of the pattern is the \b metacharacter, which matches on word boundaries. If you need it to be case-insensitive use RegexOptions.IgnoreCase:

Regex.Replace(input, pattern, replace, RegexOptions.IgnoreCase);
Ahmad Mageed
25

I've created a function (see blog post here) that wraps the regular expression suggested by Ahmad Mageed:

/// <summary>
/// Uses regex '\b' as suggested in https://stackoverflow.com/questions/6143642/way-to-have-string-replace-only-hit-whole-words
/// </summary>
/// <param name="original"></param>
/// <param name="wordToFind"></param>
/// <param name="replacement"></param>
/// <param name="regexOptions"></param>
/// <returns></returns>
public static string ReplaceWholeWord(this string original, string wordToFind, string replacement, RegexOptions regexOptions = RegexOptions.None)
{
    // Escape the search term so characters such as '.' or '+' are matched literally.
    string pattern = String.Format(@"\b{0}\b", Regex.Escape(wordToFind));
    return Regex.Replace(original, pattern, replacement, regexOptions);
}
Michael Freidgeim
  • 9
    Remember to use `Regex.Escape()` on `wordToFind` so special characters are interpreted as regular characters. – CheeseSucker Mar 07 '16 at 10:01
  • @MichaelFreidgeim, Regex.Escape() makes a huge difference if wordToFind is more than alphanumeric. For example, try searching for a masked cuss word, "!%@#\". It just won't work as expected. – Jroonk Jun 26 '20 at 22:57
  • @Jroonk, you are welcome to edit the post if it improves the answer. – Michael Freidgeim Jun 26 '20 at 23:03
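
To illustrate the point about Regex.Escape() made in the comments above, here is a small sketch. The class name, the helper and its escape flag are hypothetical, added only to compare the two behaviours side by side. Without escaping, the "." in the search term is a regex wildcard and also matches "axb".

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: builds the \b...\b pattern with or without escaping the term.
    public static string ReplaceWord(string input, string wordToFind, string replacement, bool escape)
    {
        string term = escape ? Regex.Escape(wordToFind) : wordToFind;
        return Regex.Replace(input, @"\b" + term + @"\b", replacement);
    }

    public static void Main()
    {
        string input = "axb and a.b";
        Console.WriteLine(ReplaceWord(input, "a.b", "X", escape: false)); // prints: X and X
        Console.WriteLine(ReplaceWord(input, "a.b", "X", escape: true));  // prints: axb and X
    }
}
```

Escaping only makes the term literal. A term that starts or ends with a non-word character can still fail to match, because \b needs a word character on one side of the boundary.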
8

As Sga commented, the regex solution isn't perfect, and I suspect it isn't performance-friendly either.

Here is my contribution:

public static class StringExtendsionsMethods
{
    public static String ReplaceWholeWord ( this String s, String word, String bywhat )
    {
        char firstLetter = word[0];
        StringBuilder sb = new StringBuilder();
        bool previousWasLetterOrDigit = false;
        int i = 0;
        while ( i < s.Length - word.Length + 1 )
        {
            bool wordFound = false;
            char c = s[i];
            if ( c == firstLetter )
                if ( ! previousWasLetterOrDigit )
                    if ( s.Substring ( i, word.Length ).Equals ( word ) )
                    {
                        wordFound = true;
                        bool wholeWordFound = true;
                        if ( s.Length > i + word.Length )
                        {
                            if ( Char.IsLetterOrDigit ( s[i+word.Length] ) )
                                wholeWordFound = false;
                        }

                        if ( wholeWordFound )
                            sb.Append ( bywhat );
                        else
                            sb.Append ( word );

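                        i += word.Length;
                        // Remember the last consumed character so the next match cannot start mid-word (e.g. "9" in "4.99").
                        previousWasLetterOrDigit = Char.IsLetterOrDigit ( s[i - 1] );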
                    }

            if ( ! wordFound )
            {
                previousWasLetterOrDigit = Char.IsLetterOrDigit ( c );
                sb.Append ( c );
                i++;
            }
        }

        if ( s.Length - i > 0 )
            sb.Append ( s.Substring ( i ) );

        return sb.ToString ();
    }
}

... with test cases:

String a = "alpha is alpha";
Console.WriteLine ( a.ReplaceWholeWord ( "alpha", "alphonse" ) ); // alphonse is alphonse
Console.WriteLine ( a.ReplaceWholeWord ( "alpha", "alf" ) );      // alf is alf

a = "alphaisomega";
Console.WriteLine ( a.ReplaceWholeWord ( "alpha", "xxx" ) );      // alphaisomega (unchanged)

a = "aalpha is alphaa";
Console.WriteLine ( a.ReplaceWholeWord ( "alpha", "xxx" ) );      // aalpha is alphaa (unchanged)

a = "alpha1/alpha2/alpha3";
Console.WriteLine ( a.ReplaceWholeWord ( "alpha", "xxx" ) );      // alpha1/alpha2/alpha3 (unchanged)

a = "alpha/alpha/alpha";
Console.WriteLine ( a.ReplaceWholeWord ( "alpha", "alphonse" ) ); // alphonse/alphonse/alphonse
Alexis Pautrot
  • 1
    @Alexis, you should rename the function to ReplaceWhitespaceSeparatedSubstrings. Also, please provide an "expected output" comment for each of the test cases. If you have done any performance comparison with the regex approach, please share it. – Michael Freidgeim May 29 '13 at 17:42
  • Just run the test cases to see output results. – Alexis Pautrot May 30 '13 at 10:10
  • 1
    This is not "whitespace separated" but "separated by any character that is not a letter or digit". No, I didn't make any performance comparisons. – Alexis Pautrot May 30 '13 at 10:16
  • 2
    I've been working with it and found one fail: a = "4.99"; Console.WriteLine(a.ReplaceWholeWord("9", "8.99")); results in 4.98.99. In this context this looks like a silly example, but it illustrates a problem I am having on a real project. – Walter Williams Oct 29 '14 at 15:27
6

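I just want to add a note about this particular regex pattern (used both in the accepted answer and in the ReplaceWholeWord function). It doesn't work if what you are trying to replace isn't a word, that is, if it starts or ends with a non-word character such as an apostrophe, because \b only matches between a word character and a non-word character.

Here is a test case: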

using System;
using System.Text.RegularExpressions;
public class Test
{
    public static void Main()
    {
        string input = "doin' some replacement";
        string pattern = @"\bdoin'\b";
        string replace = "doing";
        string result = Regex.Replace(input, pattern, replace);
        Console.WriteLine(result);
    }
}

(ready to try code: http://ideone.com/2Nt0A)

This has to be taken into consideration especially if you are doing batch translations (like I did for some i18n work).
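
One possible workaround, sketched here as an assumption rather than as part of the original answer, is to replace \b with lookarounds that only require that no word character sits directly before or after the term. The class and helper names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: the term counts as "whole" when it is not directly
    // preceded or followed by a word character (\w).
    public static string ReplaceWholeTerm(string input, string term, string replacement)
    {
        string pattern = @"(?<!\w)" + Regex.Escape(term) + @"(?!\w)";
        return Regex.Replace(input, pattern, replacement);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceWholeTerm("doin' some replacement", "doin'", "doing"));
        // prints: doing some replacement
        Console.WriteLine(ReplaceWholeTerm("test, and test but not testing.", "test", "text"));
        // prints: text, and text but not testing.
    }
}
```

Note that the replacement argument of Regex.Replace still treats "$" sequences as substitutions, so a replacement containing "$" would need separate handling.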

Sga
1

If you want to define which characters make up a word (for example, to include "_" and "@"), you could use my (VB.NET) function:

Function Replace_Whole_Word(Input As String, Find As String, Replace As String) As String
   ' Characters that count as part of a word; extend or trim this list as needed.
   Dim Word_Chars As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_@"
   Dim Word_Index As Integer = 0
   Do
      Word_Index = Input.IndexOf(Find, Word_Index)
      If Word_Index < 0 Then Exit Do
      Dim Before_Ok As Boolean = (Word_Index = 0) OrElse Not Word_Chars.Contains(Input(Word_Index - 1))
      Dim After_Ok As Boolean = (Word_Index + Len(Find) = Input.Length) OrElse Not Word_Chars.Contains(Input(Word_Index + Len(Find)))
      If Before_Ok AndAlso After_Ok Then
         Input = Mid(Input, 1, Word_Index) & Replace & Mid(Input, Word_Index + Len(Find) + 1)
         ' Skip past the inserted text so it is never rescanned.
         Word_Index = Word_Index + Len(Replace)
      Else
         Word_Index = Word_Index + 1
      End If
   Loop
   Return Input
End Function

Test

Replace_Whole_Word("We need to replace words tonight. Not to_day and not too well to", "to", "xxx")

Result

"We need xxx replace words tonight. Not to_day and not too well xxx"
Frank_Vr
1

I don't like Regex because it is slow. My function is faster.

public static string ReplaceWholeWord(this string text, string word, string bywhat)
{
    static bool IsWordChar(char c) => char.IsLetterOrDigit(c) || c == '_';
    StringBuilder sb = null;
    int p = 0, j = 0;
    while (j < text.Length && (j = text.IndexOf(word, j, StringComparison.Ordinal)) >= 0)
        if ((j == 0 || !IsWordChar(text[j - 1])) &&
            (j + word.Length == text.Length || !IsWordChar(text[j + word.Length])))
        {
            sb ??= new StringBuilder();
            sb.Append(text, p, j - p);
            sb.Append(bywhat);
            j += word.Length;
            p = j;
        }
        else j++;
    if (sb == null) return text;
    sb.Append(text, p, text.Length - p);
    return sb.ToString();
}
palota
0

This method can also ignore case, if you are interested:

public static string Replace(this string s, string word, string by, StringComparison stringComparison, bool WholeWord)
{
    StringBuilder sb = new StringBuilder();
    int pos = 0; // start of the text that has not been copied to sb yet
    int wordSt;
    while ((wordSt = s.IndexOf(word, pos, stringComparison)) > -1)
    {
        int wordEnd = wordSt + word.Length;
        // Check the neighbours in the full string, so that a skipped partial match
        // cannot hide the letter that precedes the next candidate.
        bool boundaryBefore = wordSt == 0 || !Char.IsLetterOrDigit(s[wordSt - 1]);
        bool boundaryAfter = wordEnd == s.Length || !Char.IsLetterOrDigit(s[wordEnd]);
        sb.Append(s, pos, wordSt - pos);
        if (!WholeWord || (boundaryBefore && boundaryAfter))
            sb.Append(by);
        else
            sb.Append(s, wordSt, word.Length);
        pos = wordEnd;
    }
    sb.Append(s, pos, s.Length - pos);
    return sb.ToString();
}
Draken
Honey22Sharp
-5

You could use string.Replace:

string input = "test, and test but not testing.  But yes to test";
string result2 = input.Replace("test", "text");
Console.WriteLine(input);
Console.WriteLine(result2);
Console.ReadLine();
Alex
  • 8
    I am not an expert in C#, but how `replace` will not change `testing` to `texting` as is asked in the question? – King Midas Mar 21 '18 at 16:17