69

I have the following code, which is supposed to convert a rune to a string and print it. However, I get undefined characters when it is printed, and I cannot figure out where the bug is:

package main

import (
    "fmt"
    "strconv"
    "strings"
    "text/scanner"
)

func main() {
    var b scanner.Scanner
    const a = `a`
    b.Init(strings.NewReader(a))
    c := b.Scan()
    fmt.Println(strconv.QuoteRune(c))
}
icza
user3551708
  • 1
    Any reason that force you to use scanner? I mean you trying to figure out how "text/scanner" work or you just simple want to convert rune to string? – nvcnvn Aug 31 '16 at 10:40
  • 2
    `string(r)` to set a rune to a string. See https://stackoverflow.com/a/62739051/12817546. `[]rune("abc") ` to set a string to a rune. See https://stackoverflow.com/a/62739051/12817546. –  Jul 10 '20 at 00:00

5 Answers

64

That's because you used Scanner.Scan() to read a rune, but it does something else. Scanner.Scan() reads tokens (or single runes, for characters that do not form a token), as controlled by the Scanner.Mode bitmask, and for recognized tokens it returns special constants from the text/scanner package, not the rune that was read.

To read a single rune use Scanner.Next() instead:

c := b.Next()
fmt.Println(c, string(c), strconv.QuoteRune(c))

Output:

97 a 'a'
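
As a minimal sketch of the same idea, the loop below keeps calling Scanner.Next() until it reports scanner.EOF and collects the runes it returned. The helper name readRunes is an assumption made for illustration; it is not part of text/scanner.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// readRunes is a hypothetical helper: it collects every rune that
// Scanner.Next returns until the scanner reports scanner.EOF.
func readRunes(src string) []rune {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))

	var out []rune
	for r := s.Next(); r != scanner.EOF; r = s.Next() {
		out = append(out, r)
	}
	return out
}

func main() {
	// Print each rune as a number and as a quoted character.
	for _, r := range readRunes("abc") {
		fmt.Printf("%d %q\n", r, r)
	}
}
```

Unlike Scan(), Next() does no tokenizing, so whitespace and punctuation come back as ordinary runes too.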

If you just want to convert a single rune to string, use a simple type conversion. rune is an alias for int32, and the Go spec says this about converting integer numbers to string:

Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer.

So:

r := rune('a')
fmt.Println(r, string(r))

Outputs:

97 a
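
If the goal is to get the string into a variable or a return value rather than print it, the same conversion works on its own. The sketch below assumes a helper named runeToString, which is hypothetical; the only real mechanism is the string(r) conversion.

```go
package main

import "fmt"

// runeToString is a hypothetical helper name; the work is done by
// the built-in conversion string(r), which UTF-8 encodes the rune.
func runeToString(r rune) string {
	return string(r)
}

func main() {
	s := runeToString('a') // s now holds the string "a"
	fmt.Println(len(s), s)
}
```

Keeping the value typed as rune (rather than int) matters in practice: as far as I know, recent versions of go vet flag string(i) when i is a plain int, because that conversion is often mistaken for a decimal formatting call.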

Also, to loop over the runes of a string value, you can simply use the for ... range construct:

for i, r := range "abc" {
    fmt.Printf("%d - %c (%v)\n", i, r, r)
}

Output:

0 - a (97)
1 - b (98)
2 - c (99)

Or you can simply convert a string value to []rune:

fmt.Println([]rune("abc")) // Output: [97 98 99]

There is also utf8.DecodeRuneInString().
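
A short sketch of how utf8.DecodeRuneInString() might be used: it returns the first rune of a string together with the number of bytes that rune occupies, so the caller can slice the string forward. The wrapper name firstRune is an assumption for illustration only.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRune is a hypothetical wrapper around utf8.DecodeRuneInString.
// It returns the first rune of s and its width in bytes.
func firstRune(s string) (rune, int) {
	return utf8.DecodeRuneInString(s)
}

func main() {
	s := "aഐbc"
	// Consume the string one rune at a time.
	for len(s) > 0 {
		r, size := firstRune(s)
		fmt.Printf("%c takes %d byte(s)\n", r, size)
		s = s[size:]
	}
}
```

For an empty string the function returns utf8.RuneError with a width of 0, which is why the loop checks len(s) first.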

Try the examples on the Go Playground.

Note:

Your original code (using Scanner.Scan()) works like this:

  1. You called Scanner.Init() which sets the Mode (b.Mode) to scanner.GoTokens.
  2. Calling Scanner.Scan() on the input (from "a") returns scanner.Ident because "a" is a valid Go identifier:

    c := b.Scan()
    if c == scanner.Ident {
        fmt.Println("Identifier:", b.TokenText())
    }
    
    // Output: "Identifier: a"
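
To see what Scan() returns for other inputs, a sketch like the following can help. It pairs scanner.TokenString() (the printable name of a token) with Scanner.TokenText() (the text that was scanned). The helper name tokenKinds and the sample input are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// tokenKinds is a hypothetical helper: it scans src with the default
// mode set by Init (scanner.GoTokens) and describes each token as
// "<token name> <token text>".
func tokenKinds(src string) []string {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))

	var kinds []string
	for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
		kinds = append(kinds, scanner.TokenString(tok)+" "+s.TokenText())
	}
	return kinds
}

func main() {
	for _, k := range tokenKinds("a 42 +") {
		fmt.Println(k)
	}
}
```

The first line printed is "Ident a", which matches the behavior described above: the value returned by Scan() identifies the kind of token, and the text comes from TokenText().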
    
icza
  • you can also use "%#U" as a format tag `fmt.Printf("%v\t%#U\t%s\n", r, r, strconv.QuoteRune(r))` prints `51 U+0033 '3' '3'` – tuxErrante May 10 '22 at 08:11
  • For newbies, need example without `fmt.Println`, just how to get value into variable or as function return. – Arthur Shlain Jul 28 '22 at 12:27
4

I know I'm a bit late to the party but here's a []rune to string function:

func runesToString(runes []rune) (outString string) {
    // don't need index so _
    for _, v := range runes {
        outString += string(v)
    }
    return
}

Yes, there is a named return, but I think it's OK in this case because it reduces the number of lines and the function is short.

MarkJL
  • 24
    `[]rune` to `string` conversion is supported natively by the spec, this is completely unnecessary and slow. Instead use a simple conversion: `outString := string(runes)` – icza Oct 12 '17 at 08:25
  • @icza enlightening comment. I quoted you here https://stackoverflow.com/a/62739051/12817546. –  Jul 07 '20 at 08:50
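
Following the comment above, here is a minimal sketch of the same function written with the built-in conversion instead of a loop. The function name is kept from the answer; the sample input is an assumption.

```go
package main

import "fmt"

// runesToString converts a rune slice to a string with the built-in
// conversion, which UTF-8 encodes all runes in one step.
func runesToString(runes []rune) string {
	return string(runes)
}

func main() {
	fmt.Println(runesToString([]rune{'a', 'b', 'c'}))
}
```

The loop version builds a new string on every += step, while the conversion allocates the result once, which is the reason the comment calls the loop slow.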
4

This simple code converts a rune to a string:

s := fmt.Sprintf("%c", r) // r is a variable of type rune
kidustiliksew
3

Since I came to this question searching for rune, string and char, I thought this might help newbies like me:

// Example call:
//   testString("aഐbc")
func testString(oneString string) {
    // string to byte slice: a plain conversion, because a string is
    // a read-only sequence of bytes (the conversion copies them).
    var twoByteArr []byte = []byte(oneString)

    // string to rune slice: also a plain conversion, but this one
    // decodes the UTF-8 bytes into Unicode code points.
    var threeRuneSlice []rune = []rune(oneString)

    // A rune slice converts back to a string (re-encoded as UTF-8).
    var thirdString string = string(threeRuneSlice)

    // The catch shows up when printing "characters" with for ... range.

    fmt.Println("Chars in oneString")
    for i, r := range oneString {
        // i is the byte offset where each rune starts, so for "aഐbc"
        // it is 0, 1, 4, 5 rather than 0, 1, 2, 3.
        // See https://blog.golang.org/strings
        fmt.Printf(" %d  %v  %c ", i, r, r)
    }

    fmt.Println("\nChars in threeRuneSlice")
    for i, r := range threeRuneSlice {
        // i = 0, 1, 2, 3: one index per rune. A rune is an int32
        // holding a code point; in UTF-8 it takes 1 to 4 bytes
        // (a byte is a uint8).
        fmt.Printf(" %d  %v  %c ", i, r, r)
    }

    fmt.Println("\nValues in oneString")
    for j := 0; j < len(oneString); j++ {
        // Indexing a string yields bytes, not runes, so this does
        // not print characters.
        fmt.Printf(" %d %v ", j, oneString[j])
    }
    fmt.Println("\nValues in twoByteArr")
    for j := 0; j < len(twoByteArr); j++ {
        fmt.Printf(" %d=%v ", j, twoByteArr[j]) // same bytes as above
    }

    fmt.Printf("\none - %s, two %s, three %s\n", oneString, twoByteArr, thirdString)
}

And some more pointless demo https://play.golang.org/p/tagRBVG8k7V adapted from https://groups.google.com/g/golang-nuts/c/84GCvDBhpbg/m/Tt6089MPFQAJ

to show that the 'characters' are encoded with one to four bytes, depending on the Unicode code point
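
As a small sketch of that last point, utf8.RuneLen() reports how many bytes a rune needs in UTF-8. The helper name encodedLen and the sample code points are assumptions picked to cover each width.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen is a hypothetical helper that reports how many bytes
// the UTF-8 encoding of r occupies.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// One sample code point for each encoded width, 1 through 4.
	for _, r := range []rune{'a', 0xE9, 0x0D10, 0x1F60A} {
		fmt.Printf("%U needs %d byte(s)\n", r, encodedLen(r))
	}
}
```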

Alex Punnen
0

Here are some simple examples that show how to do it quickly.

// rune => string
fmt.Printf("%c\n", 65) // A
fmt.Println(string(rune(0x1F60A))) // 😊
fmt.Println(string([]rune{0x1F468, 0x200D, 0x1F9B0})) // 👨‍🦰

// string => rune
fmt.Println(strconv.FormatUint(uint64([]rune("😊")[0]), 16)) // 1f60a
fmt.Printf("%U\n", '😊') // U+1F60A
fmt.Printf("%U %U %U\n", '👨', '\u200d', '🦰') // U+1F468 U+200D U+1F9B0

go playground

Carson