Go has very nice io.Reader and io.Writer interfaces that roughly correspond to the java.io.InputStream and java.io.OutputStream classes in Java (i.e. io.Reader is a stream of bytes, io.Writer is a sink for bytes).

I'm wondering whether Go also has an equivalent of the java.io.Reader (stream of characters) and java.io.Writer (sink for characters) classes.

What I want is to read and write strings from and to a stream without thinking about the encoding all the time, plus support for encodings other than UTF-8 (which the combination of io.Reader and a string conversion gives me almost for free).

Update: so I guess my description is confusing, and I should have avoided comparison to Java stuff. What I want is something like this:

Reader part: read the next 128 bytes and return a Go string by decoding them with the ISO-8859-2 charset.

stringReader := NewStringReader(reader, "iso-8859-2")
stringReader.read(128)

Writer part: convert a string to UTF-16BE bytes and write them to the writer:

stringWriter := NewStringWriter(writer, "utf16be")
stringWriter.write("馞鮂 擙樲橚 褗褆諓");
Peter Štibraný
  • 1
    java.io.InputStream is a Java class while io.Reader is a Go interface, they don't correspond to each other. You can find some utilities in package "io/ioutil". – rvignacio Sep 03 '14 at 18:57
  • Can you show an example of what you're trying to do? The comparison to Java doesn't really make sense. – JimB Sep 03 '14 at 19:21
  • @rvignacio: I have written my share of java.io.InputStream subclasses with different behaviours to know that they do in fact have very close correspondence. But that's not the point of the question. – Peter Štibraný Sep 03 '14 at 19:22
  • @JimB: I want to read characters and strings (runes/strings in Go) from Reader-like object, that wraps real Reader + takes charset to decode bytes into characters. And same for writing. – Peter Štibraný Sep 03 '14 at 19:23
  • Something different than what `bufio` provides with *Rune functions and methods? – JimB Sep 03 '14 at 19:25
  • 1
    @JimB I *think* he wants something like https://code.google.com/p/go-charset/source/browse/charset/example_test.go – OneOfOne Sep 03 '14 at 19:28
  • 1
    See [this answer](http://stackoverflow.com/a/31544542/55504) for an example of using the [`golang.org/x/text/encoding`](https://godoc.org/golang.org/x/text/encoding) package to encode-to/decode-from non-UTF-8 streams on the fly. – Dave C Jul 23 '15 at 15:08

1 Answer

I'm not familiar enough with Java but wouldn't this do the same thing:

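// w is an io.Writer
io.WriteString(w, "stuff")

// r is an io.Reader
sc := bufio.NewScanner(r)
for sc.Scan() {
    fmt.Println(sc.Text())
}
// Scan returns false on both EOF and failure, so check which one it was.
if err := sc.Err(); err != nil {
    log.Fatal(err)
}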

Strings are more or less a read-only []byte.

//edit

After reading the comments, I think you're looking for https://code.google.com/p/go-charset:

r, err := charset.NewReader("latin1", r)
if err != nil {
        log.Fatal(err)
}
result, err := ioutil.ReadAll(r)
if err != nil {
        log.Fatal(err)
}
fmt.Printf("%s\n", result)
OneOfOne
  • 1
    Thanks for understanding my confused question... this looks promising! – Peter Štibraný Sep 03 '14 at 19:34
  • Possible tricky bit: I'm not sure whether this interface promises that a multibyte character won't straddle a block; it'd be easy to test by reading a block entirely made of chars that take three UTF-8 bytes into a power-of-2-sized block. It's probably possible to write a `RuneReader` that sits on top of a utf-8 Reader, or just something that does the bookkeeping to ensure no characters straddle blocks, but neither is trivial (or free perfwise). – twotwotwo Sep 04 '14 at 01:58
  • (Neither here nor there really, but I [wrote something](http://play.golang.org/p/S99auj-ztY) to find the end of the last utf8 char in a block that begins at character start. That would be a piece, but a small one, of a Reader that knew how to ensure blocks consisted of whole utf8 characters.) – twotwotwo Sep 04 '14 at 03:54
  • 1
    The [`golang.org/x/text/encoding`](https://godoc.org/golang.org/x/text/encoding) package is superior to `go-charset` IMO. – Dave C Jul 23 '15 at 15:09