I am working on a web application that receives messages from a PvPGN server over a WebSocket. The messages contain both plain text and octal escape sequences (for example, \320) that stand for non-ASCII characters. The goal is to decode and display messages containing Cyrillic characters correctly. At the moment, the message looks like this:
Hello nickname, welcome to pvpgn-server
*******************************
Ðнлайн 1
*******************************
ERROR: ÐÑ Ð¿Ñоводим ÑÑÑниÑÑ, ÑозÑгÑÑÑи и квеÑÑÑ!
ERROR: ÐеÑеÑоди на ÑоÑÑм по баннеÑÑ ÑвеÑÑÑ.
ERROR: ÐÑÑппа VK vk.com/pvpgn-server
Chat topic: This is the public chat channel. Feel free to chat...
I've tried multiple approaches, including using TextDecoder to handle UTF-8 and writing a custom function that decodes octal escape sequences into Unicode characters. Despite these attempts, the Cyrillic characters still aren't displayed correctly on the webpage.
Here is the relevant part of my code:
function decodeOctalEscapes(input) {
  return input.replace(/\\(\d{3})/g, (match, octal) => {
    const charCode = parseInt(octal, 8);
    return String.fromCharCode(charCode);
  });
}
this.ws.onmessage = (event) => {
  const reader = new FileReader();
  reader.onload = () => {
    const arrayBuffer = reader.result;
    const textDecoder = new TextDecoder("utf-8");
    const decodedMessage = textDecoder.decode(arrayBuffer);
    const decodedText = decodeOctalEscapes(decodedMessage);
    if (this.onMessageReceived) {
      this.onMessageReceived(decodedText);
    }
  };
  reader.readAsArrayBuffer(event.data);
};
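
One direction that may be worth checking is a byte-level variant of the decoding step. This is a minimal sketch, not a confirmed fix. It assumes that each \ooo escape stands for a single raw byte of a UTF-8 sequence rather than a whole character; for example, \320\236 would be the bytes 0xD0 0x9E, which is "О" in UTF-8. The names unescapeOctalBytes and decodeMessage are hypothetical and are not part of the code above or of PvPGN.

```javascript
// Hypothetical helper: collapse ASCII "\ooo" escapes into the single byte they name.
// Assumption: every escape is exactly three octal digits with a value of 255 or less.
function unescapeOctalBytes(bytes) {
  const out = [];
  const isOctal = (b) => b >= 0x30 && b <= 0x37; // ASCII '0'..'7'
  for (let i = 0; i < bytes.length; i++) {
    const isEscape =
      bytes[i] === 0x5c && // ASCII backslash
      i + 3 < bytes.length &&
      bytes[i + 1] >= 0x30 &&
      bytes[i + 1] <= 0x33 && // first digit 0..3 keeps the value within one byte
      isOctal(bytes[i + 2]) &&
      isOctal(bytes[i + 3]);
    if (isEscape) {
      const value =
        ((bytes[i + 1] - 0x30) << 6) |
        ((bytes[i + 2] - 0x30) << 3) |
        (bytes[i + 3] - 0x30);
      out.push(value);
      i += 3; // skip the three digits that were just consumed
    } else {
      out.push(bytes[i]); // copy every other byte through unchanged
    }
  }
  return Uint8Array.from(out);
}

// Hypothetical wrapper: accepts an ArrayBuffer or a Uint8Array, rebuilds the
// byte stream first, and only then decodes it as UTF-8 in a single pass.
function decodeMessage(data) {
  const bytes = unescapeOctalBytes(new Uint8Array(data));
  return new TextDecoder("utf-8").decode(bytes);
}
```

In the handler above, reader.result could be passed to decodeMessage in place of the TextDecoder call followed by decodeOctalEscapes. The difference is that the escapes are turned back into bytes before UTF-8 decoding, so an escaped byte and a neighbouring byte can combine into one character instead of each escape becoming its own character via String.fromCharCode.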