
I need to slice a string in Go. The values can contain Latin characters and/or Arabic/Chinese characters. In the following example, the slice expression [:1] on the Arabic string returns an unexpected character.

    package main
    
    import "fmt"
    
    func main() {
        a := "a"
        fmt.Println(a[:1]) // works
        
        b := "ذ"
        fmt.Println(b[:1]) // does not work
        fmt.Println(b[:2]) // works
    
        fmt.Println(len(a) == len(b)) // false
    }

http://play.golang.org/p/R-JxaxbfNL

Jonathan Simon Prates

2 Answers


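First of all, you should really read about strings, bytes and runes in Go. A Go string is a sequence of bytes, and s[i:j] slices by byte offset, not by character, so a multi-byte UTF-8 character can be cut in half.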

And here is how you can achieve what you want (Go playground). I was not able to paste Arabic characters properly, but if Chinese works, Arabic should work too:

    s := "abcdefghijklmnop" 
    fmt.Println(s[2:9]) 

    s = "维基百科:关于中文维基百科" 
    fmt.Println(string([]rune(s)[2:9]))

The output is:

cdefghi
百科:关于中文
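The same conversion can be wrapped in a small function and applied to the Arabic string from the question. This is only a sketch: the name sliceRunes and the choice to clamp out-of-range indices are assumptions, not part of any standard API.

```go
package main

import "fmt"

// sliceRunes returns the characters of s in the half-open range [start, end),
// counted in runes rather than bytes. Hypothetical helper: out-of-range
// indices are clamped instead of panicking.
func sliceRunes(s string, start, end int) string {
	r := []rune(s) // decode the UTF-8 bytes into code points
	if start < 0 {
		start = 0
	}
	if end > len(r) {
		end = len(r)
	}
	if start >= end {
		return ""
	}
	return string(r[start:end])
}

func main() {
	fmt.Println(sliceRunes("a", 0, 1))
	fmt.Println(sliceRunes("ذ", 0, 1)) // the whole character, not half of it
	fmt.Println(sliceRunes("维基百科", 1, 3))
}
```

Note that []rune(s) copies the whole string, so for very long inputs it may be cheaper to walk the string with a range loop and slice at the byte offsets it reports.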
Salvador Dali
    It worked, thanks. Note: instead of len(s), I used utf8.RuneCountInString(s) to get the string's length in characters. len(s) counts bytes, not characters. http://golang.org/pkg/builtin/#len – Jonathan Simon Prates Jul 15 '15 at 15:04

You can use the utf8string package:

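    package main

    import "golang.org/x/exp/utf8string"

    func main() {
        // Placeholder text with 5 characters; the original literal was lost.
        a := utf8string.NewString("a维基百科")
        // example 1: character at index 1
        r := a.At(1)
        // example 2: characters at indices 1 and 2
        s := a.Slice(1, 3)
        // example 3: number of characters
        n := a.RuneCount()
        // print
        println(r == '维', s == "维基", n == 5)
    }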

https://pkg.go.dev/golang.org/x/exp/utf8string

Zombo