14

I found this, https://groups.google.com/forum/#!topic/golang-nuts/YyKlLwuWt3w but as far as I can tell, the solutions didn't work for me.

If you treat the string as a byte slice (str[:20]), the cut can land in the middle of a multi-byte character, and we get "ال�".

Edit: I believe I could write a function and do it as a multiple of 3's as runes are int32 (32bits/(8bits/byte)). I would first have to check if there are runes.
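
A cut at a fixed multiple of bytes cannot work in general, because UTF-8 encodes each rune in 1 to 4 bytes depending on the code point. The sketch below prints the encoded width of a few sample runes and checks whether a byte offset falls inside a character. The helper name cutsMidCharacter is made up for illustration, and it assumes 0 <= n <= len(s).

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cutsMidCharacter (hypothetical helper) reports whether slicing s at byte
// offset n would split a multi-byte character. It assumes 0 <= n <= len(s).
func cutsMidCharacter(s string, n int) bool {
	return n < len(s) && !utf8.RuneStart(s[n])
}

func main() {
	// The encoded width varies per rune, from 1 to 4 bytes.
	for _, r := range "aل世😀" {
		fmt.Printf("%c: %d byte(s)\n", r, utf8.RuneLen(r))
	}

	s := "العربية" // each letter in this string takes 2 bytes
	fmt.Println(cutsMidCharacter(s, 4)) // false: offset 4 is a character boundary
	fmt.Println(cutsMidCharacter(s, 5)) // true: offset 5 is inside a character
}
```

This is why str[:20] can end in "�": the last byte kept may be the first half of a two-byte letter.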

John

4 Answers

32

Just convert it to a slice of runes first, slice, then convert the result back:

string([]rune(str)[:20])
  • 3
    Beware that this may panic with "panic: runtime error: slice bounds out of range" if the string is already shorter than the length you are trying to cut – dr.scre Nov 30 '18 at 15:25
  • Doing this takes O(len(str)) time, so it's best to leave the string in []rune representation until you need it in the variable width string form again. – Michael Fulton Sep 30 '21 at 19:03
  • Warning: this can modify the characters at the end of the substring because one character can consist of multiple runes, e.g. https://go.dev/play/p/6u7gx1CQxTW – Jason Stangroome Jan 03 '23 at 21:13
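
The first and third comments above can be illustrated with a small sketch. The helper name truncateRunes is hypothetical: it clamps the requested length instead of panicking, and the last call shows that a character built from a base letter plus a combining mark still gets split, because the cut is made between runes, not between user-perceived characters.

```go
package main

import "fmt"

// truncateRunes is a hypothetical helper: it returns at most n runes of s,
// clamping n instead of panicking when s is shorter than n runes.
func truncateRunes(s string, n int) string {
	runes := []rune(s)
	if n < 0 {
		n = 0
	}
	if n > len(runes) {
		n = len(runes)
	}
	return string(runes[:n])
}

func main() {
	fmt.Println(truncateRunes("世界 Hello", 2)) // 世界
	fmt.Println(truncateRunes("abc", 20))     // abc (no panic)

	// "e" followed by U+0301 (combining acute accent) displays as a single
	// character but is two runes, so cutting after one rune drops the accent.
	fmt.Println(truncateRunes("e\u0301", 1)) // e
}
```

Handling combining sequences correctly needs grapheme-cluster segmentation, which the standard library does not provide.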
13

You can get a substring of a UTF-8 string without allocating additional memory (you don't have to convert it to a rune slice):

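package main

import "fmt"

// substring returns the runes of s in the half-open range [start, end),
// counting positions in runes rather than bytes.
func substring(s string, start int, end int) string {
    if start < 0 {
        start = 0
    }
    if end <= start {
        return ""
    }
    // Defaults to len(s) so that a start past the last rune yields "".
    startByte := len(s)
    i := 0
    for j := range s { // j is the byte offset of the i-th rune
        if i == start {
            startByte = j
        }
        if i == end {
            return s[startByte:j]
        }
        i++
    }
    // end is at or past the rune count: return everything from start.
    return s[startByte:]
}

func main() {
    s := "世界 Hello"
    fmt.Println(substring(s, 0, 1)) // 世
    fmt.Println(substring(s, 1, 5)) // 界 He
    fmt.Println(substring(s, 3, 8)) // Hello
}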
KAdot
3

Here's a length-based implementation based on the rune trick:

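// substr returns up to length runes of input, starting at rune index start.
func substr(input string, start int, length int) string {
    asRunes := []rune(input)

    // Nothing to return for an out-of-range start or a non-positive length.
    if start < 0 || start >= len(asRunes) || length <= 0 {
        return ""
    }

    // Clamp the length so the slice never runs past the end.
    if length > len(asRunes)-start {
        length = len(asRunes) - start
    }

    return string(asRunes[start : start+length])
}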
joonas.fi
2

If you don't mind using an experimental package, you can use utf8string:

package main

import "golang.org/x/exp/utf8string"

func main() {
    a := utf8string.NewString("ÄÅàâäåçèéêëìîïü")
    s := a.Slice(1, 3)
    println(s == "Åà")
}

https://pkg.go.dev/golang.org/x/exp/utf8string

Nimantha
Zombo