The loop for k, v := range s {} iterates over Unicode code points. In Go they are called runes and are represented as 32-bit signed integers (rune is an alias for int32):
For a string value, the "range" clause iterates over the Unicode code points in the string starting at byte index 0. On successive iterations, the index value will be the index of the first byte of successive UTF-8-encoded code points in the string, and the second value, of type rune, will be the value of the corresponding code point. If the iteration encounters an invalid UTF-8 sequence, the second value will be 0xFFFD, the Unicode replacement character, and the next iteration will advance a single byte in the string.
Golang specification
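The invalid-sequence rule from the quote can be checked with a small sketch. The helper rangeSteps and the type step are names made up for this illustration, and the input "a\xffb" is just one convenient string that contains a byte which is never valid in UTF-8:

```go
package main

import "fmt"

// step records one iteration of a range loop over a string.
// (Hypothetical helper type, used only for this illustration.)
type step struct {
	Index int  // byte offset where the code point starts
	Rune  rune // decoded code point, or 0xFFFD for invalid input
}

// rangeSteps collects every (index, rune) pair produced by ranging over s.
func rangeSteps(s string) []step {
	var out []step
	for i, r := range s {
		out = append(out, step{Index: i, Rune: r})
	}
	return out
}

func main() {
	// The byte 0xFF cannot appear in valid UTF-8, so the loop reports
	// U+FFFD for it and then advances by exactly one byte.
	for _, st := range rangeSteps("a\xffb") {
		fmt.Printf("index %d: %U\n", st.Index, st.Rune)
	}
}
```

The middle iteration should report index 1 with U+FFFD, and the loop then continues with 'b' at index 2.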
The index expression s[k] returns the byte (uint8) at offset k in the string's underlying byte sequence, not a code point.
The difference is easy to see for multibyte scripts, such as Chinese. Try iterating over the string "給祭断情試紀脱答条証行日稿" (it is a meaningless lorem ipsum phrase in Chinese):
s[0]: 231 uint8 ç
:32102 int32 給
s[3]: 231 uint8 ç
:31085 int32 祭
s[6]: 230 uint8 æ
:26029 int32 断
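Notice that k advances by 3 on each iteration. This is because the UTF-8 encoding of each of these Chinese characters occupies 3 bytes. The byte 231 prints as ç because it is displayed on its own as if it were the code point U+00E7, rather than as the first byte of a 3-byte sequence.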
Full example: https://go.dev/play/p/-44NZMojcgq