
In string k below there are 5 characters, and I can count them as runes because each one is a single Unicode code point. But if I count the runes in string s, which contains emoji, the variation selector runes are included, so I get 7 instead of 3 characters. If I'm scanning arbitrary Unicode strings, how do I avoid counting the variation selectors? This is not a duplicate: there are 7 runes but only 3 emoji. The duplicate's answer works for ordinary Unicode text but not for emoji. https://goplay.space/#s7V7IrwYznd

package main

import (
    "fmt"
)

func main() {
    s := "️❤️☠️㊗️"
    fmt.Println([]byte(s)) // [239 184 143 226 157 164 239 184 143 226 152 160 239 184 143 227 138 151 239 184 143]
    fmt.Println([]rune(s)) // [65039 10084 65039 9760 65039 12951 65039]

    k := "チリヌルヲ"
    fmt.Println([]byte(k)) // [227 131 129 227 131 170 227 131 140 227 131 171 227 131 178]
    fmt.Println([]rune(k)) // [12481 12522 12492 12523 12530]
}
Andrew Bucknell
  • You want to count the number of Unicode grapheme clusters. – peterSO Aug 30 '19 at 12:57
  • [This answer](https://stackoverflow.com/a/39425959/2896976) helps to explain the 'why' around seeing 7 runes but only 3 characters. For example, the heart uses 2 runes. The first rune is the heart's shape, and the second rune is its color (red). – Jessie Aug 30 '19 at 15:53
  • [This answer](https://stackoverflow.com/a/36258684/2896976) is likely what you are looking for. – Jessie Aug 30 '19 at 15:54
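Following up on the comments above, here is a rough standard-library sketch of the idea, not a full grapheme cluster segmenter. It skips variation selectors and nonspacing marks while counting runes, which is enough for the two strings in the question. The helper name `countVisible` is made up for illustration, and the approach is an assumption about what "character" should mean here: it will miscount ZWJ sequences, flags, skin-tone modifiers, and Hangul jamo, which need proper Unicode text segmentation (UAX #29).

```go
package main

import (
	"fmt"
	"unicode"
)

// countVisible is a hypothetical helper: it counts runes but skips
// variation selectors (such as U+FE0F) and nonspacing combining marks.
// It only approximates grapheme clusters and does not implement UAX #29.
func countVisible(s string) int {
	n := 0
	for _, r := range s {
		if unicode.Is(unicode.Variation_Selector, r) || unicode.Is(unicode.Mn, r) {
			continue // modifies the neighbouring rune, not a character on its own
		}
		n++
	}
	return n
}

func main() {
	// Same rune sequence as s in the question, written with escapes so the
	// invisible U+FE0F selectors are visible in the source.
	s := "\ufe0f\u2764\ufe0f\u2620\ufe0f\u3297\ufe0f"
	k := "チリヌルヲ"

	fmt.Println(len([]rune(s)), countVisible(s)) // 7 3
	fmt.Println(len([]rune(k)), countVisible(k)) // 5 5
}
```

For arbitrary input, a library that implements grapheme cluster segmentation is likely the safer choice than a filter like this.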
