-2

I need to check whether a string contains a diacritic: for the name "Kateřina" I need to return true, and for "Jana" I need false. Right now I get false for both values. I don't want to remove the diacritics, I want to keep them, so string normalization won't do. Basically, I need to check whether the string contains any of these characters: ěščřžýáíé

    static void Main(string[] args)
    {
        Console.WriteLine("Hello World!");

        string name = "Jana";
        string name2 = "Kateřina";

        if (ExTest.DiacriticCheck(name))
        {
            Console.WriteLine(name);
        }
        if (ExTest.DiacriticCheck(name2))
        {
            Console.WriteLine(name2);
        }
    }

    public static bool DiacriticCheck(string text)
    {
        if (Regex.IsMatch(text, @"^[\p{L}\p{N}\p{Zs}_-]+$ˇ") == false)
        {
            return false;
        }

        return true;
    }
  • see https://stackoverflow.com/questions/8923729/checking-for-diacritics-with-a-regular-expression – jps Feb 18 '21 at 08:33
  • Does this answer your question? [How to check if Unicode character has diacritics in .Net?](https://stackoverflow.com/questions/9349608/how-to-check-if-unicode-character-has-diacritics-in-net) – prospector Feb 18 '21 at 08:44

3 Answers

1

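You can check whether the string is normalized in form D with `IsNormalized(NormalizationForm.FormD)`. A string that contains precomposed accented letters such as ř is not in form D, so the check returns false for it.

Another simple way is to convert the text to 7-bit ASCII, which contains no characters with diacritics, and then compare the result with the original value. This could be helpful if you actually need only ASCII characters later in your program. Note that this flags every non-ASCII character, not only letters with diacritics.

In the following code, the commented-out part is the ASCII approach: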

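    using System.Text; // Encoding, NormalizationForm

    static bool DiacriticCheck(string text)
    {
        // ASCII approach: bytes above 127 decode to '?', so any non-ASCII character changes the string.
        //byte[] bytes = Encoding.UTF8.GetBytes(text);
        //string textAscii = Encoding.ASCII.GetString(bytes);
        //return text != textAscii;

        // Precomposed letters such as 'ř' are not in form D, so IsNormalized returns false for them.
        return !text.IsNormalized(NormalizationForm.FormD);
    }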
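
One case the `IsNormalized` check above would miss is input that is already decomposed (for example `r` followed by the combining caron U+030C), because such a string is already in form D. A minimal sketch of a variant that covers both cases is shown below. It decomposes a copy of the string and looks for non-spacing marks; the original string is not modified. The class and method names (`DiacriticSketch`, `HasCombiningMark`) are made up for this illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class DiacriticSketch
{
    // Hypothetical helper, not part of the answer above.
    public static bool HasCombiningMark(string text)
    {
        // Normalize returns a new string; 'text' itself is left unchanged.
        string decomposed = text.Normalize(NormalizationForm.FormD);

        // After decomposition, 'ř' becomes 'r' + U+030C, which is a non-spacing mark.
        return decomposed.Any(c =>
            CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark);
    }

    static void Main()
    {
        Console.WriteLine(HasCombiningMark("Jana"));      // False
        Console.WriteLine(HasCombiningMark("Kateřina"));  // True
    }
}
```

This reports any letter that decomposes into a base letter plus a mark, so it is broader than the nine characters listed in the question.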
0
    public static bool DiacriticCheck(string text) => Regex.IsMatch(text, @"[ěščřžýáíé]");
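
The character class above lists only lower-case letters, so a name written in capitals would not match. A possible variant, assuming upper-case forms should count as well, adds `RegexOptions.IgnoreCase`. The class name `RegexSketch` and the field name `Listed` are made up for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexSketch
{
    // Assumption: only the letters listed in the question matter, in either case.
    private static readonly Regex Listed = new Regex(
        "[ěščřžýáíé]",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static bool DiacriticCheck(string text) => Listed.IsMatch(text);

    static void Main()
    {
        Console.WriteLine(DiacriticCheck("Jana"));      // False
        Console.WriteLine(DiacriticCheck("Kateřina"));  // True
        Console.WriteLine(DiacriticCheck("ŠÁRKA"));     // True
    }
}
```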
YHF
0

Unfortunately, diacritics can only be found and replaced manually. There is no golden flag that tells you a character is a diacritic.

All you can do is play around with converting between different encodings and comparing the resulting strings, or keep a big table of diacritic characters somewhere (or directly a dictionary with the desired replacements).

For example, the project Diacritics.Net just holds such a dictionary (or several, one per language) and checks whether a character occurs in it.

Or you could create such a dictionary by yourself. A good starting point could also be this code.
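
As a rough illustration of the lookup-table idea (this is not Diacritics.Net's code), the sketch below keeps a small set of characters and checks membership without replacing anything. The table contents are an assumption: only the letters listed in the question plus their upper-case forms. The names `DiacriticTableSketch`, `Marked` and `ContainsListedDiacritic` are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DiacriticTableSketch
{
    // Assumed table: the letters from the question and their upper-case forms.
    private static readonly HashSet<char> Marked =
        new HashSet<char>("ěščřžýáíéĚŠČŘŽÝÁÍÉ");

    // True if at least one character of 'text' is in the table; 'text' is not modified.
    public static bool ContainsListedDiacritic(string text) => text.Any(Marked.Contains);

    static void Main()
    {
        Console.WriteLine(ContainsListedDiacritic("Jana"));      // False
        Console.WriteLine(ContainsListedDiacritic("Kateřina"));  // True
    }
}
```

A table like this only recognizes the precomposed characters it lists, so it would need extending for other languages.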

Oliver
  • As I said, I don't want them to be replaced. I have seen Diacritics.Net, but it seems it doesn't work with Xamarin.forms – Tetsa Popovicova Feb 18 '21 at 08:46
  • The mapper in this project also has a method `bool HasDiacritics(string source)` (which internally just runs `Remove()` and compares the resulting string with the input). – Oliver Feb 18 '21 at 08:48
  • If this project doesn't work with Xamarin.forms, then that is a different question. But I looked through the code of the project and it uses just basic stuff that has been available since .Net 3.5, so it should work in Xamarin (but I don't know). – Oliver Feb 18 '21 at 08:52