
Users are entering data in different fonts (most probably by copy-pasting styled text), but I want to store it as regular text. My column type is nvarchar.

The data appears like this:

 
ᴩᴀᴅᴍᴀᴍᴍᴀ ᴛ

And if I query on ColumnName == "", these rows are returned as if they contained empty strings.

How can I save them and read them as normal SQL text?

Steps to reproduce: insert some text that has a font applied into an nvarchar column.

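// Build the user from the submitted form; the name is stored exactly as entered.
C_User user = new C_User
{
    // Append the last name only when one was provided.
    Name = input.FirstName + (string.IsNullOrEmpty(input.LastName) ? "" : " " + input.LastName),
    EntityType = typeof(C_Patient).Name
};
context.C_User.Add(user);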
Charu
  • 40
  • 4
  • 5
    Please share a [mcve]. – mjwills May 26 '21 at 06:48
  • 1
    strings don't come with a font "attached" to them. How and where are users "entering" that data? The font should be completely a "UI-Thing". If you needed to remember it, you'd have to explicitly store it along with the text data. – Fildor May 26 '21 at 07:07
  • The first part of the input string is created using some interesting technique. The problem isn't the "font", it's the user who is entering such symbols. They could enter, for example, "ʇxǝʇ uʍop ǝpᴉsd∩"; good luck converting that font. Perhaps you need to validate input, restricting it to a certain ASCII range? – Sinatr May 26 '21 at 07:11
  • 1
    No, it's not "copy-pasting with font"; there is no font info unless you store strings in a certain format (like RTF). It's entering certain characters on purpose using the Unicode table (e.g. with the help of online [tools](https://www.upsidedowntext.com/)). – Sinatr May 26 '21 at 07:14
  • @Fildor I am able to reproduce this by copy-pasting data that has some font into the input box and then saving it. – Charu May 26 '21 at 07:16
  • As Sinatr says: It's not the Font. The Font only supports the unicode range. If you used another Font, you'd probably get some weird squares ... but alas. We cannot help without seeing any of your code. – Fildor May 26 '21 at 07:20
  • Added code @Fildor – Charu May 26 '21 at 07:33
  • Added Code @Sinatr – Charu May 26 '21 at 07:33
  • As has already been expressed in the comments; this isn't "font" data - it is unicode code-points / grapheme-clusters outside of the ASCII range. Honestly, your best approach here would be to restrict this *on the way in* rather than trying to handle it later; related reading: https://stackoverflow.com/questions/3403877/find-similar-ascii-character-in-unicode – Marc Gravell May 26 '21 at 07:35
  • Is there a way I could handle it on the frontend? (I am using vue) – Charu May 26 '21 at 07:49
  • ` ` is Unicode `\U0001d445\U0001d4b6\U0001d4c2…`; you can [normalize](https://learn.microsoft.com/en-us/dotnet/api/system.string.normalize?view=net-5.0) it (apply `FormKC` or `FormKD`). However, you can't normalize `ᴩᴀᴅᴍᴀᴍᴍᴀ ᴛ` which is a mix of Greek and latin letters (Unicode `\u1d29\u1d00\u1d05\u1d0d\u1d00\u1d0d\u1d0d\u1d00 \u1d1b`) e.g. `ᴩ` (U+1D29) _Greek Letter Small Capital Rho_, `ᴀ` (U+1D00) _Latin Letter Small Capital A_,… – JosefZ May 26 '21 at 12:35
  • @JosefZ Thank you so much, 90% of the data was corrected by normalizing. – Charu May 26 '21 at 22:38
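
For reference, the normalization suggested in the comments could look roughly like the sketch below. It is only an illustration: `NameCleaner`, `ToPlainText` and `IsAscii` are hypothetical names that do not appear in the original code, and the ASCII check assumes that plain ASCII is actually the desired target, which would also reject legitimate non-ASCII names.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of the original question's code.
public static class NameCleaner
{
    // NFKC folds compatibility characters (e.g. mathematical alphanumerics
    // such as U+1D445) into their plain equivalents. Characters without a
    // compatibility mapping, such as the small capitals U+1D00 or U+1D29,
    // pass through unchanged.
    public static string ToPlainText(string input)
    {
        return string.IsNullOrEmpty(input)
            ? input
            : input.Normalize(NormalizationForm.FormKC);
    }

    // True when every UTF-16 code unit is in the ASCII range, so the caller
    // can reject or flag whatever normalization could not fix.
    // Assumption: ASCII-only is the wanted policy.
    public static bool IsAscii(string text)
    {
        return text.All(c => c <= 0x7F);
    }
}
```

Under these assumptions, the name would be passed through `NameCleaner.ToPlainText(...)` before being assigned to `user.Name`, and `NameCleaner.IsAscii(...)` could flag the remaining inputs (like the small-capital sample above) for rejection or manual review.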

0 Answers