I have the code below, which finds bad words (stored in a database) in text entered through the WebBrowser control and replaces each one with asterisks (*). I am struggling with case sensitivity: the user can enter a word in any mix of lower and upper case (for example, HeLlo), and I want those variants to be replaced too.

    string query;
    query = @"select Word from ListWords";

    List<string> words = new List<string>();

    DataSet ds;
    DataRow drow;

    ds = DatabaseConnection.Connection1(query);
    int index, total;

    total = ds.Tables[0].Rows.Count;

    string current_word;

    for (index = 0; index < total; index++ )
    {
        drow = ds.Tables[0].Rows[index];
        current_word = drow.ItemArray.GetValue(0).ToString();

        words.Add(current_word);
    }

    Console.WriteLine(query);


    Console.WriteLine("array:" + words);
    foreach (String key in words)
    {
        String substitution = "<span style='background-color: rgb(255, 0, 0);'>" + key + "</span>";

        int len = key.Length;
        string replace = "";

        for ( index = 0; index < len; index++)
        {
            replace += "*";
        }

        html.Replace(key, replace);
        //count++;
    }


    doc2.body.innerHTML = html.ToString();
}
Sach
Kyte
    It's not exactly clear what you're asking in this question. Thank you for providing a code example, but to better answer, we will need to know where exactly it is failing. – Victor Procure Sep 07 '18 at 16:17

3 Answers

0

If I understand you correctly, you want to search the html string for words from your filter list, and replace them with some HTML coded string plus * in place of the 'bad words'.

Regular expressions are a good fit for this.

So let's say you have a word list like this:

List<string> badWords = new List<string>
{
    "Damn",
    "Hell",
    "Idiot"
};

And this is your HTML.

var html = "You're a damn idIOT!!";

OK not a lot of HTML in that but bear with me.

Now iterate through the word list and create a Regex for each word with RegexOptions.IgnoreCase. Build a replacement string of asterisks that matches the word's length, then call Replace() on the regex.

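// Requires: using System.Text.RegularExpressions;
foreach (var word in badWords)
{
    // Escape the word so characters like '.' or '+' are matched literally.
    Regex rgx = new Regex(Regex.Escape(word), RegexOptions.IgnoreCase);
    var blocked = new string('*', word.Length);
    var replacement = "<span style='background-color: rgb(255, 0, 0);'>" + blocked + "</span>";
    html = rgx.Replace(html, replacement);
}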
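One thing to watch with the loop above: each pass also scans the span markup inserted by earlier passes, so a listed word such as "back" would match inside "background-color". A possible way around that is a single pass over one combined pattern. This is only a sketch: the class and method names (BadWordFilter, Censor) are made up for illustration, and it assumes the word list is small enough to join into one alternation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class BadWordFilter   // hypothetical helper, not from the question
{
    // Replaces every listed word, ignoring case, in one pass over the input.
    public static string Censor(string html, IEnumerable<string> badWords)
    {
        // Escape each word and try longer words first so "Hello" wins over "Hell".
        var alternatives = badWords
            .Where(w => !string.IsNullOrEmpty(w))
            .OrderByDescending(w => w.Length)
            .Select(w => Regex.Escape(w))
            .ToList();

        // An empty pattern would match at every position, so bail out early.
        if (alternatives.Count == 0)
        {
            return html;
        }

        string pattern = string.Join("|", alternatives);

        // The evaluator sizes the mask from the text that actually matched.
        return Regex.Replace(
            html,
            pattern,
            m => "<span style='background-color: rgb(255, 0, 0);'>"
                 + new string('*', m.Length)
                 + "</span>",
            RegexOptions.IgnoreCase);
    }
}
```

Usage would look like `html = BadWordFilter.Censor(html, badWords);` with the same `badWords` list as above.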

Edit

Also, you don't really need to reinvent the wheel. Here is a great SO post about profanity filters.

Sach
0

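Try normalizing each word with current_word.ToLower() before adding it to the list. The text being searched has to be lowercased the same way for the comparison to match, which also lowercases everything else in the output.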

MSDN has more info on this. https://learn.microsoft.com/en-us/dotnet/api/system.string.tolower?view=netframework-4.7.2
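If lowercasing the whole text is not acceptable, one alternative is to leave the text alone and search it with a case-insensitive comparison instead. The sketch below uses String.IndexOf with StringComparison.OrdinalIgnoreCase, which is available on .NET Framework. The names CaseInsensitiveMasker and Mask are hypothetical, and it works on a plain string rather than the StringBuilder the question appears to use.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveMasker   // hypothetical helper for illustration
{
    // Masks each listed word with asterisks without changing the case
    // of the surrounding text.
    public static string Mask(string text, IEnumerable<string> words)
    {
        foreach (var word in words)
        {
            if (string.IsNullOrEmpty(word))
            {
                continue; // an empty word would match at every position
            }

            string mask = new string('*', word.Length);
            int pos = text.IndexOf(word, StringComparison.OrdinalIgnoreCase);

            while (pos >= 0)
            {
                // Splice the mask over the matched characters.
                text = text.Substring(0, pos) + mask + text.Substring(pos + word.Length);

                // Continue searching after the mask just written.
                pos = text.IndexOf(word, pos + mask.Length, StringComparison.OrdinalIgnoreCase);
            }
        }

        return text;
    }
}
```

Because the comparison is ordinal, a match always has the same length as the listed word, so the mask lines up with the original characters.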

0

A simpler approach is to use the Regex.Replace method, which accepts RegexOptions.IgnoreCase to ignore case.

Here's an example using a List<string> of "bad words", and how it could be used. The downside is that if a word contains a bad word, that part of the word will also be redacted.

var badWords = new List<string>
{
    "Bleeping",
    "Bad"
};

var html = "This is my bleeping html file with bad words in it!\n" + 
        "But realize it will replace partial occurrences, too,\n" +
        "for example, now I can't write BADGER!";

Console.WriteLine("Old html:\n" + html + Environment.NewLine);

foreach (var badWord in badWords)
{
    // Regex.Escape keeps any regex metacharacters in a word literal.
    html = Regex.Replace(html, Regex.Escape(badWord), new string('*', badWord.Length), RegexOptions.IgnoreCase);
}

Console.WriteLine("New html:\n" + html);

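The partial-match downside mentioned above can usually be avoided by anchoring each word with \b word boundaries. This is a rough sketch rather than a complete filter: WholeWordFilter and Censor are illustrative names, and \b only behaves as expected when the listed word starts and ends with a letter, digit, or underscore.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WholeWordFilter   // hypothetical helper for illustration
{
    // Masks listed words only when they appear as whole words.
    public static string Censor(string text, IEnumerable<string> badWords)
    {
        foreach (var badWord in badWords)
        {
            if (string.IsNullOrEmpty(badWord))
            {
                continue;
            }

            // \b requires a word boundary on both sides, so "bad" matches
            // but the "BAD" inside "BADGER" does not.
            string pattern = @"\b" + Regex.Escape(badWord) + @"\b";

            text = Regex.Replace(
                text,
                pattern,
                new string('*', badWord.Length),
                RegexOptions.IgnoreCase);
        }

        return text;
    }
}
```

With the word list from this answer, "bad words" would be masked while "BADGER" is left alone.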

Rufus L