If you run fmt.Println("\u554a"), it prints '啊'.
But how do you get the Unicode escape string \u554a from the rune '啊'?
package main

import (
	"fmt"
	"strconv"
)

func main() {
	quoted := strconv.QuoteRuneToASCII('啊') // quoted = "'\u554a'"
	unquoted := quoted[1 : len(quoted)-1]   // unquoted = "\u554a"
	fmt.Println(unquoted)
}
This outputs:
\u554a
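If you need this for a whole string rather than a single rune, the same trick seems to work with strconv.QuoteToASCII. Here is a minimal sketch; EscapeString is a hypothetical helper name, not something from the standard library. Note that QuoteToASCII also escapes backslashes, double quotes and control characters, so the result differs from the input for those too.

```go
package main

import (
	"fmt"
	"strconv"
)

// EscapeString is a hypothetical helper (not part of the standard library).
// It returns s with every non-ASCII rune replaced by its \u or \U escape.
func EscapeString(s string) string {
	quoted := strconv.QuoteToASCII(s) // wraps the result in double quotes
	return quoted[1 : len(quoted)-1]  // strip the surrounding quotes
}

func main() {
	fmt.Println(EscapeString("啊 ü")) // prints \u554a \u00fc
}
```

Pure ASCII text without quotes or backslashes comes back unchanged.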
IMHO, this is a better approach:
func RuneToAscii(r rune) string {
	if r < 128 {
		return string(r)
	} else {
		return "\\u" + strconv.FormatInt(int64(r), 16)
	}
}
You can use fmt.Sprintf with %U to get the hexadecimal value:
test := fmt.Sprintf("%U", '啊')
fmt.Println("\\u" + test[2:]) // prints \u554A
For example,
package main

import "fmt"

func main() {
	r := rune('啊')
	u := fmt.Sprintf("%U", r)
	fmt.Println(string(r), u)
}
Output:
啊 U+554A
fmt.Printf("\\u%X", '啊')
http://play.golang.org/p/Jh9ns8Qh15
(An upper- or lowercase 'x' in the verb controls the case of the hex digits.)
As hinted at by package fmt's documentation:
%U Unicode format: U+1234; same as "U+%04X"
I'd like to add to hardPass's answer.
When the hex representation of the code point is fewer than 4 digits (ü, for example), strconv.FormatInt produces \ufc, which is a Unicode escape syntax error in Go, as opposed to the full \u00fc that Go understands.
Padding the hex with zeros using fmt.Sprintf with hex formatting fixes this:
func RuneToAscii(r rune) string {
	if r < 128 {
		return string(r)
	} else {
		return fmt.Sprintf("\\u%04x", r)
	}
}
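One more case worth considering: runes above U+FFFF (emoji, for example) need more than 4 hex digits, and Go's \u escape takes exactly 4, while \U takes exactly 8. A rough sketch that picks the escape by range is below; RuneToEscape is a hypothetical name, and the round trip through strconv.Unquote is only there as a sanity check. It does not treat surrogate code points or invalid runes specially.

```go
package main

import (
	"fmt"
	"strconv"
)

// RuneToEscape is a hypothetical variant of RuneToAscii that also
// covers runes outside the Basic Multilingual Plane.
func RuneToEscape(r rune) string {
	switch {
	case r < 128:
		return string(r) // ASCII is returned as-is
	case r <= 0xFFFF:
		return fmt.Sprintf("\\u%04x", r) // \u takes exactly 4 hex digits
	default:
		return fmt.Sprintf("\\U%08x", r) // \U takes exactly 8 hex digits
	}
}

func main() {
	for _, r := range []rune{'a', 'ü', '啊', '😀'} {
		esc := RuneToEscape(r)
		// Parse the escape back as a Go string literal to check it is valid.
		back, err := strconv.Unquote(`"` + esc + `"`)
		fmt.Println(esc, back, err)
	}
}
```

With this, 'ü' becomes \u00fc and '😀' becomes \U0001f600, and each escape parses back to the original character.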
This would do the job:
package main

import (
	"fmt"
)

func main() {
	str := fmt.Sprintf("%s", []byte{0x80})
	fmt.Println(str)
}