Same question as this, but with UTF-8 instead of ASCII

In JavaScript, how can you decode a hex string of UTF-8 bytes into the string it represents?

For example, how to turn "c385" into "Å"?

Or how to turn "E28093" into "–" (en dash)?

Or how to turn "E282AC" into "€" (euro sign)?

My question is NOT a duplicate of Hex2Asc. You can see for yourself: hex2a("E282AC") produces "â¬" instead of "€" (euro sign).
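
To make the failure concrete, here is a minimal sketch of what a byte-per-character converter does. The body of hex2a below is an assumption about how that function works (one character per hex pair), not a copy of the linked answer:

```javascript
// Assumed implementation: maps every hex pair (one byte) to one character.
function hex2a(hex) {
    var str = '';
    for (var i = 0; i < hex.length; i += 2) {
        str += String.fromCharCode(parseInt(hex.substring(i, i + 2), 16));
    }
    return str;
}

// The three UTF-8 bytes of the euro sign become three separate characters
// (U+00E2, U+0082, U+00AC) instead of the single character U+20AC.
console.log(hex2a('E282AC').length); // 3
console.log(hex2a('E282AC') === '\u20ac'); // false
```

The middle character (U+0082) is a control character, which is why only two of the three are visible in the output.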

BearCode
  • Take a look at this: http://stackoverflow.com/questions/834316/how-to-convert-large-utf-8-strings-into-ascii – Cᴏʀʏ Jun 12 '13 at 04:04
  • Actually the same answer. However, I wonder what char code *"c3 85"* would represent? And `\u00c3` is `Ã`, not `Å`. – Bergi Jun 12 '13 at 04:08
  • It's not the same question at all. In fact, the question I quoted is the same as the one you pointed to. According to Wikipedia, the UTF-8 hexadecimal representation of Å is "c3 85". The answer there transforms the string into a different character: Ã – BearCode Jun 12 '13 at 04:21
  • Or it will transform "E28093" into "â", instead of transforming it into "–" (en dash) – BearCode Jun 12 '13 at 04:28
  • Here's a simple algorithm to do what you wish: http://jsfiddle.net/consultcory/9K6th/2/. If this question gets reopened I'll post it as an answer. – Cᴏʀʏ Jun 12 '13 at 13:25

2 Answers

I think this will do what you want:

function convertHexToString(input) {

    // split input into groups of two hex digits (one byte each)
    var hex = input.match(/[\s\S]{2}/g) || [];
    var output = '';

    // build a percent-encoded representation of the bytes, e.g. '%c3%85'
    for (var i = 0, j = hex.length; i < j; i++) {
        output += '%' + hex[i];
    }

    // decodeURIComponent interprets the percent-encoded bytes as UTF-8
    // (it throws a URIError if the bytes are not valid UTF-8)
    output = decodeURIComponent(output);

    return output;
}

console.log("'" + convertHexToString('c385') + "'");   // => 'Å'
console.log("'" + convertHexToString('E28093') + "'"); // => '–'
console.log("'" + convertHexToString('E282AC') + "'"); // => '€'
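
If your environment has the TextDecoder API (current browsers and recent Node.js versions), you could also decode the bytes directly instead of going through percent-encoding. This is only a sketch of that alternative; the function name hexToUtf8String and the strict input check are my own choices, not something from the question:

```javascript
// Hypothetical helper: decodes a string of hex digits as UTF-8 bytes.
function hexToUtf8String(hex) {
    // Reject odd-length input and non-hex characters up front.
    if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
        throw new RangeError('expected an even number of hex digits');
    }

    // Convert each pair of hex digits into one byte.
    var bytes = new Uint8Array(hex.length / 2);
    for (var i = 0; i < bytes.length; i++) {
        bytes[i] = parseInt(hex.substring(2 * i, 2 * i + 2), 16);
    }

    // fatal: true makes decode() throw a TypeError on malformed UTF-8
    // instead of silently inserting replacement characters.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
}

console.log(hexToUtf8String('c385'));   // => Å
console.log(hexToUtf8String('E28093')); // => –
console.log(hexToUtf8String('E282AC')); // => €
```

Unlike the decodeURIComponent version, this one does not silently drop a trailing odd character.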

Cᴏʀʏ
var hex = "c5";
String.fromCharCode(parseInt(hex, 16)); // => 'Å'

You have to use c5 (the Unicode code point of Å, U+00C5), not c3 85 (its UTF-8 byte sequence). Ref: http://rishida.net/tools/conversion/

Learn more about code points and code units:

  1. http://en.wikipedia.org/wiki/Code_point
  2. http://www.coderanch.com/t/416952/java/java/Unicode-code-unit-Unicode-code
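
To see how the code point c5 and the byte sequence c3 85 relate, here is a small sketch, assuming the TextEncoder API is available (current browsers and recent Node.js versions). The helper name utf8Hex is made up for this illustration:

```javascript
// Hypothetical helper: returns the UTF-8 bytes of a string as lowercase hex.
function utf8Hex(str) {
    var bytes = new TextEncoder().encode(str); // TextEncoder always emits UTF-8
    var hex = '';
    for (var i = 0; i < bytes.length; i++) {
        hex += ('0' + bytes[i].toString(16)).slice(-2); // zero-pad to two digits
    }
    return hex;
}

var ch = String.fromCharCode(parseInt('c5', 16));
console.log(ch.codePointAt(0).toString(16)); // => c5   (the code point)
console.log(utf8Hex(ch));                    // => c385 (the UTF-8 bytes)
```

So c5 and c3 85 describe the same character at two different levels: the first is its number in Unicode, the second is how UTF-8 stores that number.
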
Diode
  • Thanks, but I need to convert from UTF-8, not from ASCII. C5 is the ASCII code and C3 85 is the UTF-8 code. Most of the characters are not encoded in ASCII but all of them are encoded in Unicode (and in UTF-8) – BearCode Jun 12 '13 at 04:40