I know that I can do this, but is there a shorter way? For example, why is there no method in the Reader class hierarchy with the prototype public String readString(int len); that would do what I want in a single call, instead of the code below?

InputStream in = new FileInputStream("abc.txt");
InputStreamReader inReader = new InputStreamReader(in);
char[] foo = new char[5];
inReader.read(foo);
System.out.println(new String(foo));

// I think this is too long
// for reading a string of only 5 characters
// from an InputStream or Reader

In Python 3, I can do this very easily for UTF-8 and other files. Consider the following code:

fl = open("abc.txt", mode="r", encoding="utf-8")
fl.read(1) # returns a string of 1 character
fl.read(3) # returns a string of 3 characters

How can I do it in Java?

Thanks.

1 JustOnly 1
  • You're asking us to speculate why the API designers didn't add that specific method? I'm speculating that they didn't consider it important, since there are other ways to do it, as you found yourself. --- Also, how should it really work? Should it read 5 UTF-16 characters like your code does? Or should it read 5 Unicode characters (code points)? – Andreas Jun 28 '18 at 18:43
  • *FYI:* You're ignoring the return value of `read(foo)`. *Don't* do that!!! What if the file only had 3 characters? – Andreas Jun 28 '18 at 18:44
  • @Andreas, this is only an illustration of what I want to do. You're right. – 1 JustOnly 1 Jun 28 '18 at 18:50
  • Can you show me a very simple solution for doing this? – 1 JustOnly 1 Jun 28 '18 at 18:52
  • You already have a very simple solution. If you need to do it in multiple places, create a helper method to do it, then call the method, e.g. `public static String readFixedLengthString(Reader r, int len)` – Andreas Jun 28 '18 at 18:54
  • @Andreas, thanks. Is it the right choice to use helper methods in these cases? – 1 JustOnly 1 Jun 28 '18 at 18:58
  • Is a helper method the right choice? Of course it is. See [the DRY principle](https://www.google.com/search?q=the+DRY+principle) ("Don’t Repeat Yourself"). – Andreas Jun 28 '18 at 19:00
  • @Andreas can you examine my question again? I have edited it. – 1 JustOnly 1 Jun 29 '18 at 19:46
  • The addition to the question text doesn't change anything. Why would you think that *"but I can do it like this in Python"* has any impact on a Java solution? Java doesn't have a *built-in* `String read(int len)` method, so do it yourself, e.g. as a reusable helper method, or go find a 3rd-party library that already added such a helper method (and don't ask here for reference to such library, since that is off-topic for StackOverflow). – Andreas Jun 29 '18 at 20:47

2 Answers

How can I do it in Java ?

The way you're already doing it.

I'd recommend doing it in a reusable helper method, e.g.

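import java.io.IOException;
import java.io.Reader;

final class IOUtil {
    // Reads up to len chars with a single read call; returns null at end of stream.
    public static String read(Reader in, int len) throws IOException {
        char[] buf = new char[len];
        int charsRead = in.read(buf); // may be fewer than len
        return (charsRead == -1 ? null : new String(buf, 0, charsRead));
    }
}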

Then use it like this:

try (Reader in = Files.newBufferedReader(Paths.get("abc.txt"), StandardCharsets.UTF_8)) {
    System.out.println(IOUtil.read(in, 5));
}
Andreas
  • But what should I do when a character takes three or four bytes? How can I check how many bytes a character has? The char data type has only two bytes, and a UTF-8 character can have 3 or 4 bytes. So if I read a single three- or four-byte character into a Java char, one or two bytes will not be read, which means this helper method can't work correctly all the time. – 1 JustOnly 1 Jun 30 '18 at 00:20
  • @1JustOnly1 Sorry, missed the UTF-8 part. Answer updated. The `Reader` will convert UTF-8 *bytes* to Java *characters*. The only problem is when characters are from a [Unicode supplementary plane](https://en.wikipedia.org/wiki/Plane_(Unicode)), i.e. when it is a [surrogate pair](https://stackoverflow.com/q/5903008/5221149) of characters, the code above will split the pair. But then again, you never defined whether `len` is a `char` count or a Unicode code point count. – Andreas Jun 30 '18 at 15:01
  • But `Reader.read(char[])` doesn’t guarantee to fill the entire array. – Holger Jul 03 '18 at 12:51
  • @Holger True, but it will when reading from a file (except at end-of-file). If reading a communication stream (e.g. HTTP), it will only read what has been received so far, since it cannot know if more will arrive, but will wait for at least 1 byte. If you might use that method for comm. readers, you can add a loop to continue reading until `len` characters have been read (or end-of-data). – Andreas Jul 03 '18 at 15:49
  • Exactly. That’s what I did in my answer, but using a `CharBuffer` that saves the developers from dealing with position and remaining length manually. For uncompressed local files, a single read usually is sufficient, but it’s always worth mentioning the limitations of a solution explicitly. – Holger Jul 03 '18 at 18:31
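The comments above leave open whether `len` counts Java `char`s or Unicode code points. If code points are what is wanted, one possible sketch is the hypothetical helper `readCodePoints` below (the class and method names are illustrative, not part of the JDK). It assumes the input is well-formed UTF-16 after decoding, i.e. every high surrogate is followed by its low surrogate, and it reads one `char` at a time, so the `Reader` should be buffered.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class CodePointReadDemo {
    // Hypothetical helper: reads up to `count` Unicode code points (not chars).
    // Assumes well-formed input: a high surrogate is always followed by a low one.
    static String readCodePoints(Reader in, int count) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            int c = in.read();
            if (c == -1) {
                break; // end of stream
            }
            sb.append((char) c);
            if (Character.isHighSurrogate((char) c)) {
                // Pull in the second half of the pair so it is never split.
                int low = in.read();
                if (low == -1) {
                    break; // truncated input
                }
                sb.append((char) low);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // 'a', U+1F600 (a surrogate pair in UTF-16), 'b', 'c'
        String text = "a\uD83D\uDE00bc";
        try (Reader in = new StringReader(text)) {
            String s = readCodePoints(in, 2);
            System.out.println(s.length());                      // 3 chars
            System.out.println(s.codePointCount(0, s.length())); // 2 code points
        }
    }
}
```

With a file, the same helper could be called on the `Reader` returned by `Files.newBufferedReader(...)`, as in the usage example of the answer.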

If you want to make a best effort to read up to the specified number of characters, you may use

int len = 4;
String result;
try(Reader r = new FileReader("abc.txt")) {
    CharBuffer b = CharBuffer.allocate(len);
    do {} while(b.hasRemaining() && r.read(b) > 0);
    result = b.flip().toString();
}
System.out.println(result);

While the Reader may read fewer characters than requested (depending on the underlying stream), it will read at least one character before returning, or return -1 to signal the end of the stream. So the code above loops until it has either read the requested number of characters or reached the end of the stream.

That said, a FileReader will usually read all requested characters in one go and read fewer only when reaching the end of the file.
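To see why the loop matters, here is a minimal sketch that wraps the same `CharBuffer` loop in a hypothetical helper, `readUpTo`, and runs it against an artificial reader that hands out at most one character per `read` call. Both `readUpTo` and `trickle` are names invented for this illustration; `trickle` only simulates a stream that returns short reads.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.CharBuffer;

final class ReadFullyDemo {
    // Hypothetical helper: keeps reading until `len` chars are buffered
    // or the reader reports end of stream.
    static String readUpTo(Reader r, int len) throws IOException {
        CharBuffer b = CharBuffer.allocate(len);
        while (b.hasRemaining() && r.read(b) > 0) {
            // nothing to do here; read(b) advances the buffer position
        }
        b.flip();
        return b.toString();
    }

    // Illustration only: a reader that returns at most one char per call,
    // standing in for a stream that delivers short reads.
    static Reader trickle(String text) {
        return new FilterReader(new StringReader(text)) {
            @Override
            public int read(char[] cbuf, int off, int len) throws IOException {
                return super.read(cbuf, off, Math.min(len, 1));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        char[] buf = new char[5];
        int n = trickle("hello world").read(buf);
        System.out.println(n);                                   // 1: a single read falls short
        System.out.println(readUpTo(trickle("hello world"), 5)); // hello
        System.out.println(readUpTo(trickle("abc"), 5));         // abc (end of stream reached first)
    }
}
```

Unlike the `IOUtil.read` helper in the other answer, this sketch returns an empty string rather than null when the stream is already at its end; which of the two is preferable depends on the caller.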

Holger
  • Good use of `CharBuffer`. – Andreas Jul 03 '18 at 18:37
  • I think reading characters one by one cannot be a good solution. – 1 JustOnly 1 Jul 03 '18 at 22:57
  • @1JustOnly1 this does *not* read one by one. It always tries to fill the entire buffer, just like with the other answer. As said in the answer, in case of a `FileReader`, that’s already it. There will only be one `read` call. Certain other streams, like network streams, decrypting or uncompressing streams, may not always fulfill the entire request but use their own buffer’s size, but still, it’s usually not only one character. The documentation guarantees that *at least* one character will be read, which implies that this loop will not waste CPU cycles when no characters are available yet. – Holger Jul 04 '18 at 05:20