Questions tagged [unicode-escapes]

Use this tag for questions about Unicode escapes. A Unicode escape sequence is a sequence of characters that represents a single Unicode character.

Quoting the MSDN page:

A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode code points in characters and string values, a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal. Unicode characters with code points above 0x10FFFF are not supported.

Note that this tag is used in its general meaning, so you are encouraged to also tag your question with the corresponding programming environment.
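The surrogate-pair rule in the quote above can be sketched in a few lines. This is only an illustration in Python rather than C#, and the helper names `to_surrogate_pair` and `escape_utf16` are made up for this example; they are not part of any library. It assumes the standard UTF-16 arithmetic: subtract 0x10000, then split the remaining 20 bits into a high surrogate (0xD800 base) and a low surrogate (0xDC00 base).

```python
def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into UTF-16 (high, low) surrogates.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must be in the range U+10000..U+10FFFF")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def escape_utf16(text):
    """Render non-ASCII characters as \\uXXXX escapes.

    Characters above U+FFFF are written as two escapes (a surrogate pair),
    which is how a 16-bit string literal has to spell them.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)                       # plain ASCII stays as-is
        elif cp <= 0xFFFF:
            parts.append("\\u{:04X}".format(cp))   # single 16-bit escape
        else:
            high, low = to_surrogate_pair(cp)
            parts.append("\\u{:04X}\\u{:04X}".format(high, low))
    return "".join(parts)


if __name__ == "__main__":
    print(escape_utf16("\u041e\u041a"))    # Cyrillic letters O and K
    print(escape_utf16("\U0001F600"))      # a character above U+FFFF
```

Languages differ on the details (escape length, whether `\U` with eight digits exists, how unpaired surrogates are treated), so check the rules of your own environment before relying on this.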

318 questions
390
votes
1 answer

Placing Unicode character in CSS content value

I have a problem. I have found the HTML code for the downwards arrow, &darr; (↓) Cool. Now I need to use it in CSS like so: nav a:hover {content:"&darr";} That obviously won't work since &darr; is an HTML symbol. There seems to be less info about…
davecave
  • 4,698
  • 6
  • 26
  • 32
53
votes
12 answers

Convert International String to \u Codes in java

How can I convert an international (e.g. Russian) String to \u numbers (unicode numbers) e.g. \u041e\u041a for OK ?
ehsun7b
  • 4,796
  • 14
  • 59
  • 98
42
votes
1 answer

Removing unicode \u2026 like characters in a string in python2.7

I have a string in python2.7 like this, This is some \u03c0 text that has to be cleaned\u2026! it\u0027s annoying! How do i convert it to this, This is some text that has to be cleaned! its annoying!
26
votes
2 answers

What does "\1" represent in this Java string?

System.out.println("\1"); I thought it did not compile because of the non-recognized escape sequence. What does "\1" exactly represent?
Rollerball
  • 12,618
  • 23
  • 92
  • 161
25
votes
3 answers

Escaping null byte next to two zeroes

I need to escape the following sequence defined as a static final: final String POSIX_SIGNATURE = "ustar".concat("\0").concat("00"); How would I escape this without using the .concat() method nor the + string operator? final String…
Gala
  • 2,592
  • 3
  • 25
  • 33
19
votes
2 answers

Is it possible to decode bytes to UTF-8, converting errors to escape sequences in Rust?

In Rust it's possible to get UTF-8 from bytes by doing this: if let Ok(s) = str::from_utf8(some_u8_slice) { println!("example {}", s); } This either works or it doesn't, but Python has the ability to handle errors, e.g.: s =…
ideasman42
  • 42,413
  • 44
  • 197
  • 320
18
votes
2 answers

Convert Unicode surrogate pair to literal string

I am trying to read a high Unicode character from one string into another. For brevity, I will simplify my code as shown below: public static void UnicodeTest() { var highUnicodeChar = ""; //Not the standard A var result1 = highUnicodeChar;…
hargle
  • 183
  • 1
  • 5
14
votes
1 answer

Best way to remove '\xad' in Python?

I'm trying to build a corpus from the .txt file found at this link. I believe the instances of \xad are supposedly 'soft-hyphens', but do not appear to be read correctly under UTF-8 encoding. I've tried encoding the .txt file as iso8859-15, using…
12
votes
2 answers

Why do I need to escape unicode in java source files?

Please note that I'm not asking how but why. And I don't know if it's a RCP specific problem or if it's something inherent to java. My java source files are encoded in UTF-8. If I define my literal strings like this : new Language("fr",…
Denys Séguret
  • 372,613
  • 87
  • 782
  • 758
10
votes
0 answers

`\u0027\n\u0027` equals `'\''` in Java?

I was playing around with Java Unicode Escapes and accidentally found the following interesting oddities. Here is the code that I wrote: static void main(String... args) { /* * \u0027 - single quote */ char e =…
Microtribute
  • 962
  • 10
  • 24
10
votes
1 answer

Is there an alternate spelling of \n in javascript?

Due to a bug in the Tumblr theme editor, any time the sequence \n appears in the source code, it is converted to an actual line break in the source code itself. Therefore, putting the \n sequence into a Javascript string causes the program to crash,…
14jbella
  • 505
  • 2
  • 14
10
votes
1 answer

How to encode Python 3 string using \u escape code?

In Python 3, suppose I have >>> thai_string = 'สีเ' Using encode gives >>> thai_string.encode('utf-8') b'\xe0\xb8\xaa\xe0\xb8\xb5' My question: how can I get encode() to return a bytes sequence using \u instead of \x? And how can I decode them…
Michael Currie
  • 13,721
  • 9
  • 42
  • 58
9
votes
1 answer

Python 3 - String with \xHH Hex Values to Unicode

I am trying to convert a string with characters that require multiple hex values like this: 'Mahou Shoujo Madoka\xe2\x98\x85Magica' to its unicode representation: 'Mahou Shoujo Madoka★Magica' When I print the string, it tries to evaluate each hex…
user14678939
  • 317
  • 1
  • 3
  • 7
9
votes
4 answers

How do I convert Unicode escape sequences to text in PHP?

I have this Unicode sequence: \u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059. How do I convert it into text? $unicode = '\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059'; I tried: echo $utf8-decode(unicode); and I tried: echo…
learntosucceed
  • 1,109
  • 4
  • 11
  • 16
8
votes
4 answers

How to display the fraction 15/16 nicely in Unicode?

I learned today that while common fractions have dedicated Unicode values, in order to form less common fractions like ³/₁₆ you have to use superscript/subscript characters followed by a slash. This is confirmed here and here. This works for ¹¹/₁₆…
ktm5124
  • 11,861
  • 21
  • 74
  • 119