I need to convert strings to byte slices. I use the conversion []byte(string), but when the string contains ñ or an accented letter, I get different values than expected.

fmt.Println([]byte("áéíóúñÁÉÍÓÚÑ"))

Expected result: [ 160 130 161 162 163 181 144 214 224 233 ]

Obtained result: [195 161 195 169 195 173 195 179 195 186 195 177 195 129 195 137 195 141 195 147 195 154 195 145]

So when I convert to string the obtained value I get ├í├®├¡├│├║├▒├ü├ë├ì├ô├Ü├æ instead of áéíóúñÁÉÍÓÚÑ

How can I get the right values?

  • Why do you expect to get these numbers you have shown? Why do you expect only 10 numbers when there are 12 letters in the string? – mkrieger1 Apr 04 '21 at 21:26
  • `fmt.Println(string([]byte("áéíóúñÁÉÍÓÚÑ")))` prints the original string perfectly fine for me on the [playground](https://play.golang.org/p/xnBcyWDy0IG). – FObersteiner Apr 04 '21 at 21:37
  • I expect to get the numbers that I show because they are the ASCII codes of the letters that I need: https://theasciicode.com.ar/. Sorry, I forgot to put the values of ñ and Ñ, which would be 164 and 165. And yes, Go prints the original string perfectly, but the place I send it to prints character by character, so it prints `195 161 = ├í` instead of `á`. – Eduardo Pacheco Apr 04 '21 at 21:58
  • Extended ASCII is not the same as ASCII. Also, extended ASCII is not even a standardized thing; there are many different versions. Go source code uses UTF-8 encoding, which is a superset of plain ASCII only, not of whatever extended version you found somewhere out on the WWW. – Hymns For Disco Apr 04 '21 at 22:15
  • Some useful reading: https://www.joelonsoftware.com/2003/10/08/ and https://stackoverflow.com/questions/19212306 – Hymns For Disco Apr 04 '21 at 22:17
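
To make the point in the comments concrete, here is a minimal sketch (standard library only) of what `[]byte(s)` contains for a few of these letters. The helper name `utf8Breakdown` is made up for this illustration; it lists each character, its Unicode code point, and the UTF-8 bytes Go stores for it.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// utf8Breakdown (hypothetical helper) returns one line per character in s:
// the character, its Unicode code point, and its UTF-8 bytes.
func utf8Breakdown(s string) []string {
	var out []string
	for _, r := range s { // ranging over a string yields runes, not bytes
		buf := make([]byte, utf8.RuneLen(r))
		utf8.EncodeRune(buf, r)
		out = append(out, fmt.Sprintf("%c U+%04X %v", r, r, buf))
	}
	return out
}

func main() {
	s := "áñÑ"
	// len counts bytes; RuneCountInString counts characters.
	fmt.Println(len(s), utf8.RuneCountInString(s))
	for _, line := range utf8Breakdown(s) {
		fmt.Println(line)
	}
}
```

Each accented letter occupies two bytes, both starting with 195, which is why the obtained slice in the question has 24 entries for 12 letters.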

2 Answers

Several issues here. First, you give this expected result:

[ 160 130 161 162 163 181 144 214 224 233 ]

but you left out the ñ and Ñ, so the expected result should be:

[160 130 161 162 163 164 181 144 214 224 233 165]

Second, this page you link to [1] says it is code page 437, but it's actually code page 850. You can see 850 listed under "other related encodings" [2]. Here is a working example [3]:

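package main

import (
	"fmt"

	"golang.org/x/text/encoding/charmap"
)

func main() {
	// The string literal is UTF-8, so b holds two bytes per accented letter.
	b := []byte("áéíóúñÁÉÍÓÚÑ")
	// Re-encode the UTF-8 bytes as code page 850: one byte per letter.
	c, err := charmap.CodePage850.NewEncoder().Bytes(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(c)
}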
  1. https://theasciicode.com.ar
  2. https://wikipedia.org/wiki/Code_page_437
  3. https://pkg.go.dev/golang.org/x/text/encoding/charmap
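
For readers who want to see what that re-encoding amounts to without pulling in the extra module, here is a rough, dependency-free sketch. It is not how `charmap` is implemented: `cp850` and `toCP850` are hypothetical names, and the table covers only the twelve letters from the question, with byte values copied from the expected result above.

```go
package main

import "fmt"

// cp850 is a hand-written, deliberately incomplete lookup table (an
// assumption for this sketch) from Unicode characters to code page 850 bytes.
var cp850 = map[rune]byte{
	'á': 160, 'é': 130, 'í': 161, 'ó': 162, 'ú': 163, 'ñ': 164,
	'Á': 181, 'É': 144, 'Í': 214, 'Ó': 224, 'Ú': 233, 'Ñ': 165,
}

// toCP850 converts s one character at a time. Plain ASCII passes through
// unchanged; any other character missing from the table is an error.
func toCP850(s string) ([]byte, error) {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		if r < 0x80 {
			out = append(out, byte(r))
			continue
		}
		b, ok := cp850[r]
		if !ok {
			return nil, fmt.Errorf("no code page 850 mapping for %q", r)
		}
		out = append(out, b)
	}
	return out, nil
}

func main() {
	b, err := toCP850("áéíóúñÁÉÍÓÚÑ")
	if err != nil {
		panic(err)
	}
	fmt.Println(b)
}
```

For anything beyond a handful of characters, the full table in `charmap.CodePage850` is the better choice.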
Zombo
For these characters you could use the ascii85 encoder/decoder.

The encoded byte slice will not match your expectation; however, the decoded output will match your input. (I'm assuming that's the critical thing here.)

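package main

import (
	"encoding/ascii85"
	"fmt"
)

func main() {
	src := []byte("áéíóúñÁÉÍÓÚÑ")

	// Size the buffer from the input instead of hard-coding 30.
	enc := make([]byte, ascii85.MaxEncodedLen(len(src)))
	n := ascii85.Encode(enc, src)
	enc = enc[:n]

	// The decoded data is never longer than the encoded data.
	dec := make([]byte, len(enc))
	// flush=true: the input is complete, so a trailing partial group is decoded too.
	nd, _, err := ascii85.Decode(dec, enc, true)
	if err != nil {
		panic(err)
	}

	fmt.Println(enc)
	// Print only the bytes actually written, not the unused tail of dec.
	fmt.Println(string(dec[:nd]))
}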

https://golang.org/pkg/encoding/ascii85/

https://play.golang.org/p/ErBSKYVBXNg

CharlieGo_