3

How to decode cp-1251 to UTF-8 in javascript?

The cp-1251 data comes from a data feed and has to be decoded on the client side in JavaScript.

I cannot change the server-side output, because it belongs to a third party, and for various reasons I do not want to use any server-side code to convert the data feed into another data feed.

JSW189
user192344
  • Possible duplicate: http://stackoverflow.com/questions/2674411/convert-iso-windows-charsets-to-utf-8-in-javascript – Simon Boudrias Dec 17 '12 at 01:53
  • Sorry, but it is not a duplicate: my question is about pure JS only, with no code page and no server-side programming, and the data feed comes from a web socket, not XMLHttpRequest – user192344 Dec 17 '12 at 01:59
  • You can easily reverse the function from this answer: http://stackoverflow.com/a/2711936/251311 – zerkms Dec 17 '12 at 02:18

1 Answer

2

(Assuming that by "UTF-8" you meant the JS strings in their native encoding...)

Depending on the format your 'cp-1251' data is in and depending on the browsers you need to support, you can choose from:

  • TextDecoder.decode() API (decodes a sequence of octets from a typed array, like Uint8Array) - if you're using web sockets, you can receive each message as an ArrayBuffer and decode that.
  • https://github.com/mathiasbynens/windows-1251 operates on something it calls 'byte strings' (JS strings consisting of characters like \u00XY, where 0xXY is the encoded byte).
  • build the decoding table yourself (example)
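As a rough illustration of the first and third options, here is a minimal sketch that tries TextDecoder and falls back to a hand-built table when the label is not supported. The names decodeCp1251, decodeCp1251WithTable, CP1251_HIGH and listenCp1251 are made up for this example, and mapping the unassigned byte 0x98 to U+0098 is an assumption (it follows the WHATWG Encoding index).

```javascript
// Code points for bytes 0x80..0xBF in windows-1251 (cp-1251).
// Assumption: the unassigned byte 0x98 is mapped to U+0098.
const CP1251_HIGH = [
  0x0402, 0x0403, 0x201A, 0x0453, 0x201E, 0x2026, 0x2020, 0x2021,
  0x20AC, 0x2030, 0x0409, 0x2039, 0x040A, 0x040C, 0x040B, 0x040F,
  0x0452, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x0098, 0x2122, 0x0459, 0x203A, 0x045A, 0x045C, 0x045B, 0x045F,
  0x00A0, 0x040E, 0x045E, 0x0408, 0x00A4, 0x0490, 0x00A6, 0x00A7,
  0x0401, 0x00A9, 0x0404, 0x00AB, 0x00AC, 0x00AD, 0x00AE, 0x0407,
  0x00B0, 0x00B1, 0x0406, 0x0456, 0x0491, 0x00B5, 0x00B6, 0x00B7,
  0x0451, 0x2116, 0x0454, 0x00BB, 0x0458, 0x0405, 0x0455, 0x0457,
];

// Hand-built table decoder (hypothetical helper).
// `bytes` is any iterable of numbers in the range 0..255.
function decodeCp1251WithTable(bytes) {
  let out = "";
  for (const b of bytes) {
    if (b < 0x80) {
      out += String.fromCharCode(b);                     // ASCII range
    } else if (b < 0xC0) {
      out += String.fromCharCode(CP1251_HIGH[b - 0x80]); // irregular range
    } else {
      out += String.fromCharCode(0x0410 + (b - 0xC0));   // U+0410..U+044F
    }
  }
  return out;
}

// Prefer the built-in decoder; fall back to the table when TextDecoder
// is missing or does not know the "windows-1251" label.
function decodeCp1251(buffer) {
  const bytes = new Uint8Array(buffer);
  try {
    return new TextDecoder("windows-1251").decode(bytes);
  } catch (e) {
    return decodeCp1251WithTable(bytes);
  }
}

// Hypothetical wiring for a web socket: request binary frames as
// ArrayBuffer and decode each message before passing it on.
function listenCp1251(socket, onText) {
  socket.binaryType = "arraybuffer";
  socket.onmessage = (event) => onText(decodeCp1251(event.data));
}
```

This only helps if the server sends binary frames; text frames are decoded by the browser before your code sees them.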

Note that in most cases (not something as low-level as websockets though) it might be easier to read the data in the correct encoding before it ends up as a JS string (for example, you can force XMLHttpRequest to use a certain encoding even if the server misreports the encoding).
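For the XMLHttpRequest case, one way to force the encoding is overrideMimeType(), called before send(). A minimal sketch, assuming a text resource; fetchCp1251 and its callback shape are made up for this example:

```javascript
// Hypothetical helper: fetch `url` and make the browser decode the
// response as windows-1251, regardless of what the server declares.
function fetchCp1251(url, callback) {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", url);
  // Must be called before send(); overrides the server-supplied charset.
  xhr.overrideMimeType("text/plain; charset=windows-1251");
  xhr.onload = () => callback(null, xhr.responseText);
  xhr.onerror = () => callback(new Error("request failed"), null);
  xhr.send();
}
```

With this, responseText is already an ordinary JS string, so no manual decoding step is needed.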

Community
Nickolay