From the Go specification, section "For statements with range clause":
For a string value, the "range" clause iterates over the Unicode code points in the string starting at byte index 0. On successive iterations, the index value will be the index of the first byte of successive UTF-8-encoded code points in the string, and the second value, of type rune, will be the value of the corresponding code point. If the iteration encounters an invalid UTF-8 sequence, the second value will be 0xFFFD, the Unicode replacement character, and the next iteration will advance a single byte in the string.
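To see the quoted rule in action, here is a minimal sketch. The helper rangeSteps and the step type are hypothetical names introduced only for this illustration; they record the byte index and rune that each range iteration yields, including the case of an invalid byte.

```go
package main

import "fmt"

// step is a hypothetical type for this illustration: one range iteration.
type step struct {
	index int  // byte index where the code point starts
	r     rune // decoded code point (0xFFFD for invalid UTF-8)
}

// rangeSteps is a hypothetical helper that records what range yields for s.
func rangeSteps(s string) []step {
	var steps []step
	for i, r := range s {
		steps = append(steps, step{index: i, r: r})
	}
	return steps
}

func main() {
	// Each of these characters occupies 3 bytes in UTF-8,
	// so the byte indexes are 0, 3 and 6.
	for _, st := range rangeSteps("日本語") {
		fmt.Printf("byte index %d: %c (U+%04X)\n", st.index, st.r, st.r)
	}

	// "\xff" is not valid UTF-8: range yields U+FFFD for it
	// and then advances by a single byte.
	for _, st := range rangeSteps("a\xffb") {
		fmt.Printf("byte index %d: U+%04X\n", st.index, st.r)
	}
}
```

Note that the index jumps by the encoded width of each character, not by one, which is why it should not be treated as a character count.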
Now let's look at the types, as declared in Go's builtin package:
// byte is an alias for uint8 and is equivalent to uint8 in all ways. It is
// used, by convention, to distinguish byte values from 8-bit unsigned
// integer values.
type byte = uint8
// rune is an alias for int32 and is equivalent to int32 in all ways. It is
// used, by convention, to distinguish character values from integer values.
type rune = int32
This explains why a rune is reported as int32 and a byte as uint8.
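Because these are aliases rather than distinct types, the names are interchangeable. The following is only a small sketch of that; sum and typeName are hypothetical helpers written for this example, not part of any library.

```go
package main

import "fmt"

// sum is a hypothetical helper that is declared to take []uint8.
func sum(values []uint8) int {
	total := 0
	for _, v := range values {
		total += int(v)
	}
	return total
}

// typeName is a hypothetical helper that reports the dynamic type of v.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	var b byte = 'A'
	var r rune = '日'

	// The alias names do not survive: the types are uint8 and int32.
	fmt.Println(typeName(b), typeName(r)) // uint8 int32

	// No conversion from []byte to []uint8 is needed; they are the same type.
	fmt.Println(sum([]byte("abc"))) // 294
}
```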
Here's some code to make the point clear. I've added some code and changed the string so that the difference between bytes and runes is visible. I hope the comments are self-explanatory. I'd also recommend reading https://blog.golang.org/strings.
package main

import (
	"fmt"
	"reflect"
)

func main() {
	// Changed the string for better understanding:
	// each character here takes more than one byte.
	s := "日本語"

	// Range over the string, where x is a rune
	for _, x := range s {
		kx := reflect.ValueOf(x).Kind()
		fmt.Printf(
			"Type of x is %v (%c)\n",
			kx,
			x, // Expected (rune)
		)
		break
	}

	// Indexing (first byte of the string)
	y := s[0]
	ky := reflect.ValueOf(y).Kind()
	fmt.Printf(
		"Type of y is %v (%c)\n",
		ky,
		y,
		/*
			Uh-oh, not expected. We get only the first byte
			of the string, not the full multi-byte character.
			But we need '日' (a 3-byte character).
		*/
	)

	// Indexing (first rune of the string)
	z := []rune(s)[0]
	kz := reflect.ValueOf(z).Kind()
	fmt.Printf(
		"Type of z is %v (%c)\n",
		kz,
		z, // Expected (rune)
	)
}
Sample output:
Type of x is int32 (日)
Type of y is uint8 (æ)
Type of z is int32 (日)
Note: If your terminal does not show the same output, its character encoding settings may be the cause. The program writes UTF-8, so changing the terminal to UTF-8 might help.
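As a possible alternative to []rune(s)[0], which converts the whole string just to read one character, the standard unicode/utf8 package can decode only the first code point. This is a sketch rather than a recommendation for every case; firstRune is a hypothetical wrapper name used here for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRune is a hypothetical wrapper around utf8.DecodeRuneInString.
// It returns the first code point of s and its width in bytes.
func firstRune(s string) (rune, int) {
	return utf8.DecodeRuneInString(s)
}

func main() {
	s := "日本語"

	r, size := firstRune(s)
	fmt.Printf("first rune %c occupies %d bytes\n", r, size)

	// len counts bytes; utf8.RuneCountInString counts code points.
	fmt.Printf("%d bytes, %d runes\n", len(s), utf8.RuneCountInString(s))
}
```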