
I'm trying to figure out the term for these types of characters:

\M-C\M-6 (corresponds to german "ö")

\M-C\M-$ (corresponds to german "ä")

\M-C\M^_ (corresponds to german "ß")

I want to know the term for this output so that I can convert these sequences into the UTF-8 characters they represent in Go, instead of building a mapping for each one I come across.

What is the term for these? Is it some form of Unicode escape? What would be the best way to convert these "characters" to their actual human-readable form in Go?

rmaddy
Steve

1 Answer


It is the vis encoding of UTF-8 encoded text.

Here's an example:

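The UTF-8 encoding of the rune ö is the two bytes [0303, 0266] (octal; 0xC3 and 0xB6 in hex).

vis encodes the byte 0303 as the bytes \M-C and the byte 0266 as the bytes \M-6. The \M- prefix marks a byte with the high bit set, and the character after it gives the low seven bits.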

Putting the two levels of encoding together, the rune ö is encoded as the bytes \M-C\M-6.
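To make the two levels concrete, here is a minimal sketch in Go that reproduces the three sequences from the question. The function name `visMeta` is made up for this example, and it only handles bytes with the high bit set, following my reading of the BSD vis(3) rules; ASCII bytes are passed through untouched, which real vis does not always do.

```go
package main

import (
	"fmt"
	"strings"
)

// visMeta is a hypothetical helper: it vis-encodes only bytes >= 0x80
// and copies ASCII bytes through unchanged (a simplification of real vis).
func visMeta(s string) string {
	var sb strings.Builder
	for i := 0; i < len(s); i++ {
		b := s[i]
		if b < 0x80 {
			sb.WriteByte(b)
			continue
		}
		low := b & 0x7f // the byte with the high ("meta") bit cleared
		switch {
		case low == ' ':
			// Assumption: BSD vis falls back to an octal escape here.
			fmt.Fprintf(&sb, "\\%03o", b)
		case low < 0x20:
			// Meta + control character, e.g. 0x9f becomes \M^_
			sb.WriteString(`\M^`)
			sb.WriteByte(low + '@')
		case low == 0x7f:
			sb.WriteString(`\M^?`)
		default:
			// Meta + printable character, e.g. 0xc3 becomes \M-C
			sb.WriteString(`\M-`)
			sb.WriteByte(low)
		}
	}
	return sb.String()
}

func main() {
	for _, s := range []string{"ö", "ä", "ß"} {
		fmt.Printf("%s:", s)
		for i := 0; i < len(s); i++ {
			fmt.Printf(" %#o", s[i]) // UTF-8 bytes in octal
		}
		fmt.Printf(" => %s\n", visMeta(s))
	}
}
```

Note that ß ends in the byte 0237, whose low seven bits are a control character, which is why it shows up as \M^_ rather than in the \M- form.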

You can either write a decoder using the documentation on the vis man page or search for a decoding package. The Go standard library does not include such a decoder.
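If you only need the forms shown in the question, a hand-written decoder can be quite small. The following is a rough sketch, not a complete unvis implementation: `unvisMeta` is a name invented for this example, and it understands only `\M-c`, `\M^c` and `\\`. Octal escapes, C-style escapes and the other vis modes are left as they are.

```go
package main

import (
	"fmt"
	"strings"
)

// unvisMeta is a hypothetical helper that decodes the \M-c and \M^c
// forms (plus \\) and copies every other byte through unchanged.
func unvisMeta(s string) string {
	var sb strings.Builder
	for i := 0; i < len(s); {
		// A meta sequence is exactly four bytes: '\', 'M', '-' or '^', c.
		if i+3 < len(s) && s[i] == '\\' && s[i+1] == 'M' {
			c := s[i+3]
			switch s[i+2] {
			case '-':
				sb.WriteByte(c | 0x80) // printable char with high bit set
				i += 4
				continue
			case '^':
				if c == '?' {
					sb.WriteByte(0xff) // DEL with high bit set
				} else {
					sb.WriteByte(c&0x1f | 0x80) // control char with high bit set
				}
				i += 4
				continue
			}
		}
		// An escaped backslash stands for a single backslash.
		if i+1 < len(s) && s[i] == '\\' && s[i+1] == '\\' {
			sb.WriteByte('\\')
			i += 2
			continue
		}
		sb.WriteByte(s[i])
		i++
	}
	return sb.String()
}

func main() {
	for _, in := range []string{`\M-C\M-6`, `\M-C\M-$`, `\M-C\M^_`} {
		fmt.Printf("%s => %s\n", in, unvisMeta(in))
	}
}
```

The decoder works on bytes rather than runes on purpose: each vis sequence stands for one byte, and the result only becomes valid UTF-8 once all bytes of a character have been decoded.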

Charlie Tumahai
  • The [govis](https://godoc.org/github.com/cyphar/govis) package may be helpful in decoding vis. I found the package by searching for "vis golang". I make no claims about the suitability of the package. – Charlie Tumahai Jan 24 '19 at 01:56