1

I have a problem removing accents from a text file: my program replaces characters that have diacritics with ?. Here is my code:

        private void button3_Click(object sender, EventArgs e)
        {

           if (radioButton3.Checked)
            {
                byte[] tmp;
                tmp = System.Text.Encoding.GetEncoding("ISO-8859-1").GetBytes(richTextBox1.Text);
                richTextBox2.Text = System.Text.Encoding.UTF8.GetString(tmp);


            }

        }
printline
  • small example please "of removing accents" – Uthistran Selvaraj May 06 '16 at 10:12
  • input včľťšľžšžščýščýťčáčáčťáčáťýčťž -> vcltslzszsc�sc�tc�c�ct�c�t�ctz – printline May 06 '16 at 10:15
  • Possible duplicate of [How do I remove diacritics (accents) from a string in .NET?](https://stackoverflow.com/questions/249087/how-do-i-remove-diacritics-accents-from-a-string-in-net) – NH. Oct 09 '18 at 22:57

2 Answers

3

Taken from here: https://stackoverflow.com/a/249126/3047078

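// Requires: using System.Globalization; using System.Text;
static string RemoveDiacritics(string text)
{
  // FormD decomposes each accented character into a base letter plus combining marks.
  var normalizedString = text.Normalize(NormalizationForm.FormD);
  var stringBuilder = new StringBuilder();

  foreach (var c in normalizedString)
  {
    // Keep every character except the non-spacing (combining) marks.
    var unicodeCategory = CharUnicodeInfo.GetUnicodeCategory(c);
    if (unicodeCategory != UnicodeCategory.NonSpacingMark)
    {
      stringBuilder.Append(c);
    }
  }

  // Recompose whatever is left into canonical composed form.
  return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
}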

usage:

string result = RemoveDiacritics("včľťšľžšžščýščýťčáčáčťáčáťýčťž");

results in vcltslzszscyscytcacactacatyctz
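
For context on where the � characters in the question come from, here is a minimal sketch of the same encode/decode round trip. The class and method names are hypothetical and used only for illustration. ISO-8859-1 encodes ý as the single byte 0xFD, which is never valid in UTF-8, so the UTF-8 decoder substitutes U+FFFD (�) for it. Characters such as č do not exist in ISO-8859-1 at all, so the encoder substitutes something else for them (in the question's output, apparently the plain base letter). In other words, re-encoding is not a reliable way to strip accents.

```csharp
using System;
using System.Text;

static class RoundTripDemo
{
    // Hypothetical helper: reinterprets the ISO-8859-1 bytes of a string
    // as UTF-8, the same round trip the question's code performs.
    public static string Latin1BytesAsUtf8(string text)
    {
        byte[] bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(text);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Calling RoundTripDemo.Latin1BytesAsUtf8("\u00FD") (that is, "ý") returns a single U+FFFD replacement character, while plain ASCII text passes through unchanged.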

Flat Eric
2
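// Requires: using System.Text; using System.Text.RegularExpressions;
richTextBox1.Text = "včľťšľžšžščýščýťčáčáčťáčáťýčťž";

// Decompose each accented character into a base letter plus combining marks.
string text1 = richTextBox1.Text.Normalize(NormalizationForm.FormD);

// Remove every combining mark by replacing it with an empty string.
string pattern = @"\p{M}";
string text2 = Regex.Replace(text1, pattern, "");

richTextBox2.Text = text2;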

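First, normalize the string to form D, which splits each accented character into a base letter followed by its combining marks.
Then use a regular expression to remove all of those marks. The pattern \p{M} matches the Unicode category M (Mark), which covers all combining diacritic marks.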
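
The two steps above can be wrapped in a reusable method. This is only a sketch: the class and method names are hypothetical, and the final normalization back to form C is an addition that the snippet above does not perform. Note that \p{M} matches all three mark categories (Mn, Mc and Me), so it is slightly broader than a check against UnicodeCategory.NonSpacingMark alone.

```csharp
using System.Text;
using System.Text.RegularExpressions;

static class DiacriticsRegex
{
    // Hypothetical helper name, used here only for illustration.
    public static string StripMarks(string text)
    {
        // FormD splits e.g. "č" into "c" followed by U+030C (combining caron).
        string decomposed = text.Normalize(NormalizationForm.FormD);

        // \p{M} matches every combining mark; replace each with nothing.
        string stripped = Regex.Replace(decomposed, @"\p{M}", "");

        // Recompose anything that is still in decomposed form.
        return stripped.Normalize(NormalizationForm.FormC);
    }
}
```

With the input from the question, DiacriticsRegex.StripMarks("včľťšľžšžščýščýťčáčáčťáčáťýčťž") should return vcltslzszscyscytcacactacatyctz.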

Alexander Petrov