
A text saved in our database was causing problems. After some investigation, I found out that the text was the following:

var text = '"\ud835\ude45\ud835\ude6a\ud835\ude68\ud835\ude69\ud835\ude5e\ud835\ude5b\ud835\ude5e\ud835\ude66\ud835\ude6a\ud835\ude5a \ud835\ude68\ud835\ude6a\ud835\ude56\ud835\ude68 \ud835\ude67\ud835\ude5a\ud835\ude68\ud835\ude65\ud835\ude64\ud835\ude68\ud835\ude69\ud835\ude56\ud835\ude68 \ud835\ude5e\ud835\ude63\ud835\ude59\ud835\ude5e\ud835\ude58\ud835\ude56\ud835\ude63\ud835\ude59\ud835\ude64 \ud835\ude64\ud835\ude68 \ud835\ude58\ud835\ude56\u0301\ud835\ude61\ud835\ude58\ud835\ude6a\ud835\ude61\ud835\ude64\ud835\ude68 \ud835\ude67\ud835\ude5a\ud835\ude56\ud835\ude61\ud835\ude5e\ud835\ude6f\ud835\ude56\ud835\ude59\ud835\ude64\ud835\ude68 \ud835\ude5a \ud835\ude5b\ud835\ude64\ud835\ude69\ud835\ude64\ud835\ude5c\ud835\ude67\ud835\ude56\ud835\ude5b\ud835\ude5a \ud835\ude69\ud835\ude64\ud835\ude59\ud835\ude64 \ud835\ude64 \ud835\ude65\ud835\ude67\ud835\ude64\ud835\ude58\ud835\ude5a\ud835\ude68\ud835\ude68\ud835\ude64 \ud835\ude65\ud835\ude56\ud835\ude67\ud835\ude56 \ud835\ude65\ud835\ude64\ud835\ude63\ud835\ude69\ud835\ude6a\ud835\ude56\ud835\ude67 \ud835\ude63\ud835\ude56 \ud835\ude66\ud835\ude6a\ud835\ude5a\ud835\ude68\ud835\ude69\ud835\ude56\u0303\ud835\ude64. \ud835\ude47\ud835\ude5a\ud835\ude62\ud835\ude57\ud835\ude67\ud835\ude5a-\ud835\ude68\ud835\ude5a \ud835\ude59\ud835\ude5a \ud835\ude56\ud835\ude63\ud835\ude5a\ud835\ude6d\ud835\ude56\ud835\ude67 \ud835\ude68\ud835\ude6a\ud835\ude56 \ud835\ude5b\ud835\ude64\ud835\ude69\ud835\ude64"';

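I know that it can be read with decodeURIComponent(JSON.parse(text)), which outputs:

"𝙅𝙪𝙨𝙩𝙞𝙛𝙞𝙦𝙪𝙚 𝙨𝙪𝙖𝙨 𝙧𝙚𝙨𝙥𝙤𝙨𝙩𝙖𝙨 𝙞𝙣𝙙𝙞𝙘𝙖𝙣𝙙𝙤 𝙤𝙨 𝙘𝙖́𝙡𝙘𝙪𝙡𝙤𝙨 𝙧𝙚𝙖𝙡𝙞𝙯𝙖𝙙𝙤𝙨 𝙚 𝙛𝙤𝙩𝙤𝙜𝙧𝙖𝙛𝙚 𝙩𝙤𝙙𝙤 𝙤 𝙥𝙧𝙤𝙘𝙚𝙨𝙨𝙤 𝙥𝙖𝙧𝙖 𝙥𝙤𝙣𝙩𝙪𝙖𝙧 𝙣𝙖 𝙦𝙪𝙚𝙨𝙩𝙖̃𝙤. 𝙇𝙚𝙢𝙗𝙧𝙚-𝙨𝙚 𝙙𝙚 𝙖𝙣𝙚𝙭𝙖𝙧 𝙨𝙪𝙖 𝙛𝙤𝙩𝙤"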

I have two questions: how can I convert that text to

"Justifique suas respostas indicando os cálculos realizados e fotografe todo o processo para pontuar na questão. Lembre-se de anexar sua foto"

And what is the most probable way that it was written in a textarea and saved in my DB?

Please notice that:

'𝙅𝙪𝙨𝙩𝙞𝙛𝙞𝙦𝙪𝙚' === 'Justifique'  // returns false
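
The comparison fails because the stored letters are not ASCII letters at all. A minimal sketch for inspecting this, using the escape sequences of the first word from the text above; the helper name `codePoints` is hypothetical and only for illustration:

```javascript
// First word of the stored text, written with escapes so it survives copy/paste.
const styled = '\ud835\ude45\ud835\ude6a\ud835\ude68\ud835\ude69\ud835\ude5e\ud835\ude5b\ud835\ude5e\ud835\ude66\ud835\ude6a\ud835\ude5a';
const plain = 'Justifique';

// Hypothetical helper: list every code point of a string as U+XXXX.
// Array.from iterates by code point, so surrogate pairs stay together.
function codePoints(s) {
  return Array.from(s, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

console.log(styled === plain);             // false
console.log(styled.length, plain.length);  // 20 10 (each styled letter is a surrogate pair)
console.log(codePoints(styled)[0]);        // U+1D645
console.log(codePoints(plain)[0]);         // U+004A
```

U+1D645 lies in the Unicode Mathematical Alphanumeric Symbols block, so the two strings look alike but share no code points.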
Edhowler
  • `decodeURIComponent()` has nothing to do with dealing with character set issues. – Pointy Mar 22 '21 at 17:51
  • What DB are you using? What format is it currently in? – MertDalbudak Mar 22 '21 at 17:51
  • We are using Postgres, but I don't think it is related to the problem. The DB just saved what a user wrote in the frontend. – Edhowler Mar 22 '21 at 18:32
  • Does this answer your question? [What's the point of String.normalize()?](https://stackoverflow.com/questions/63013552/whats-the-point-of-string-normalize) Try `.normalize('NFKC')` or `.normalize('NFKD')`. – JosefZ Mar 22 '21 at 19:10
  • Yes, it answers in part: `'𝙅𝙪𝙨𝙩𝙞𝙛𝙞𝙦𝙪𝙚'.normalize('NFKC') === 'Justifique'` returns true. Only the second question is left to answer. – Edhowler Mar 22 '21 at 19:19
  • NFKD is enough in this case. "NFC": Canonical Decomposition, followed by Canonical Composition. "NFD": Canonical Decomposition. "NFKC": Compatibility Decomposition, followed by Canonical Composition. "NFKD": Compatibility Decomposition. – Edhowler Mar 22 '21 at 19:30
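
To illustrate the difference between the four forms listed above, here is a small sketch using the accented word from the stored text (the styled "cálculos"). It assumes the target string stores á as the precomposed U+00E1; the variable names are illustrative only.

```javascript
// Styled "cálculos" from the stored text: the accent is a separate combining
// character (U+0301) placed after the styled letter "a".
const styled = '\ud835\ude58\ud835\ude56\u0301\ud835\ude61\ud835\ude58\ud835\ude6a\ud835\ude61\ud835\ude64\ud835\ude68';

const precomposed = 'c\u00e1lculos';  // "á" as a single code point
const decomposed = 'ca\u0301lculos';  // "a" followed by the combining acute accent

// Canonical forms leave the styled letters untouched.
console.log(styled.normalize('NFC') === styled);        // true
console.log(styled.normalize('NFD') === styled);        // true

// Compatibility forms map the styled letters to plain ASCII letters.
console.log(styled.normalize('NFKD') === decomposed);   // true
console.log(styled.normalize('NFKC') === precomposed);  // true

// NFKD alone does not recompose the accent.
console.log(styled.normalize('NFKD') === precomposed);  // false
```

For a word without accents, such as the first one, NFKD and NFKC give the same result, which is consistent with the comment above. For accented words the choice matters if the result is compared against precomposed text.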
