
Consider a competitive programming scenario in which I have to read 2*10^5 (or even more) numbers from the console. For this I use BufferedReader or, for even faster input, a custom reader class that uses DataInputStream under the hood.

A quick Internet search gave me this:

We can use java.io for streaming small amounts of data and java.nio for streaming large amounts.

So I want to try console input with java.nio and compare its performance with java.io.

  1. Is it possible to read console input using java.nio?
  2. Can I read data from System.in using java.nio?
  3. Will it be faster than the input methods that I currently have?

Any relevant information will be appreciated.

Thanks ✌️

  • By “console” you mean the thing the user types into? What relevance does “fast” have in this context? That said, a) generally, `BufferedReader` does not speed up anything if you use a sufficiently large buffer for the read in the first place. It can accelerate applications that make the mistake of reading char by char from a file, but for a console where the user truly types char by char, `BufferedReader` will make it worse. b) There is no reason why `DataInputStream` should be faster than an ordinary `InputStream`. c) You can create a `Channel` for stdin, but NIO is no magic bullet; expect the same performance. – Holger May 28 '20 at 07:24
  • By console I mean standard input, i.e. System.in, as in competitive programming, where the input test cases are large –  May 28 '20 at 07:27
  • Which format do the numbers have? – Holger May 28 '20 at 07:46
  • Mostly decimal format; the numbers are separated by whitespace –  May 28 '20 at 07:48

1 Answer


You can open a channel to stdin like

FileInputStream stdin = new FileInputStream(FileDescriptor.in);
FileChannel stdinChannel = stdin.getChannel();

When stdin has been redirected to a file, operations like querying the size, performing fast transfers to other channels and even memory mapping may work. But when the input is a real console or a pipe or you are reading character data, the performance is unlikely to differ significantly.

The performance depends on the way you read it, not the class you are using.
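The redirected-file case can be sketched as follows. This is only an illustration under assumptions: `StdinMapping` and `readAll` are hypothetical names, and the code assumes that `size()` reports 0 or throws when stdin is a console or pipe, in which case it falls back to ordinary reads.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class StdinMapping {
    /** Returns the remaining content of the channel, memory-mapped when it looks like a regular file. */
    static ByteBuffer readAll(FileChannel ch) throws IOException {
        try {
            long size = ch.size(), pos = ch.position();
            // assumption: a console or pipe reports size 0 (or throws), a file its real size
            if (size > pos && size - pos <= Integer.MAX_VALUE) {
                return ch.map(FileChannel.MapMode.READ_ONLY, pos, size - pos);
            }
        } catch (IOException e) {
            // size query or mapping not supported for this input; use plain reads instead
        }
        ByteBuffer bb = ByteBuffer.allocate(1 << 16);
        while (ch.read(bb) >= 0) {
            if (!bb.hasRemaining()) {       // buffer full: double its capacity
                ByteBuffer bigger = ByteBuffer.allocate(bb.capacity() * 2);
                bb.flip();
                bigger.put(bb);
                bb = bigger;
            }
        }
        bb.flip();                          // prepare the collected bytes for reading
        return bb;
    }
}
```

It would be called as `StdinMapping.readAll(new FileInputStream(FileDescriptor.in).getChannel())`. Whether the mapped variant is actually faster for a given input has to be measured.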

An example of code directly operating on a channel, to process white-space separated decimal numbers, is

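CharsetDecoder cs = Charset.defaultCharset().newDecoder();
ByteBuffer bb = ByteBuffer.allocate(1024);
CharBuffer cb = CharBuffer.allocate(1024);
// read bytes, decode them to chars and process every complete number
while(stdinChannel.read(bb) >= 0) {
    bb.flip();
    cs.decode(bb, cb, false);
    bb.compact(); // keep the bytes of an incomplete character
    cb.flip();
    extractDoubles(cb);
    cb.compact(); // keep the chars of an incomplete number
}
// end of input: decode whatever is left
bb.flip();
cs.decode(bb, cb, true);
cs.flush(cb);
cb.flip();
extractDoubles(cb);
// a last number that is not followed by white-space is still pending
if(cb.hasRemaining()) processDouble(Double.parseDouble(cb.toString()));

// processes all numbers that are followed by white-space and
// leaves the buffer's position at the start of an incomplete number
private static void extractDoubles(CharBuffer cb) {
    doubles: for(int p = cb.position(); p < cb.limit(); ) {
        while(p < cb.limit() && Character.isWhitespace(cb.get(p))) p++;
        cb.position(p);
        if(cb.hasRemaining()) {
            for(; p < cb.limit(); p++) {
                if(Character.isWhitespace(cb.get(p))) {
                    int oldLimit = cb.limit();
                    double d = Double.parseDouble(cb.limit(p).toString());
                    cb.limit(oldLimit);
                    processDouble(d);
                    continue doubles;
                }
            }
        }
    }
}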

This is more complicated than using java.util.Scanner or a BufferedReader’s readLine() followed by split("\\s"), but has the advantage of avoiding the complexity of the regex engine, as well as not creating String objects for the lines. When there is more than one number per line or there are empty lines, i.e. the line strings would not match the number strings, this can save the copying overhead intrinsic to string construction.
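For comparison, the simpler line-based approach might look like the following sketch. `LineBasedDoubles` and `sum` are hypothetical names, summing stands in for whatever processing is needed, and the pattern `\\s+` is used instead of `\\s` so that runs of white-space do not produce empty tokens in the middle of a line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

final class LineBasedDoubles {
    /** Sums all white-space separated decimal numbers, reading line by line. */
    static double sum(Reader in) throws IOException {
        BufferedReader br = new BufferedReader(in);
        double sum = 0;
        for (String line; (line = br.readLine()) != null; ) {
            // creates one String per line plus one String per token
            for (String token : line.split("\\s+")) {
                // leading white-space and empty lines yield an empty token
                if (!token.isEmpty()) sum += Double.parseDouble(token);
            }
        }
        return sum;
    }
}
```

For console input it would be called as `LineBasedDoubles.sum(new InputStreamReader(System.in))`.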

This code still handles arbitrary charsets. When you know the expected charset and it is ASCII based, using a lightweight transformation instead of the CharsetDecoder, as shown in this answer, can gain an additional performance increase.
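A minimal sketch of that idea, not the code of the linked answer: it assumes ASCII input and, for brevity, handles only integers (optionally with a leading minus sign) rather than general decimal numbers. `AsciiLongs` and `sum` are hypothetical names. Each byte is treated directly as a character, so neither a CharsetDecoder nor a CharBuffer is involved.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

final class AsciiLongs {
    /** Sums all white-space separated integers read from the channel (no overflow checks). */
    static long sum(ReadableByteChannel ch) throws IOException {
        ByteBuffer bb = ByteBuffer.allocate(1 << 16);
        long sum = 0, value = 0;
        boolean inNumber = false, negative = false;
        while (ch.read(bb) >= 0) {
            bb.flip();
            while (bb.hasRemaining()) {
                int c = bb.get();               // ASCII: the byte value is the char value
                if (c >= '0' && c <= '9') {
                    value = value * 10 + (c - '0');
                    inNumber = true;
                } else if (c == '-' && !inNumber) {
                    negative = true;
                } else {                        // any other byte ends the current number
                    if (inNumber) sum += negative ? -value : value;
                    value = 0;
                    inNumber = false;
                    negative = false;
                }
            }
            bb.clear();                         // parser state lives in the variables, nothing to keep
        }
        if (inNumber) sum += negative ? -value : value; // last number without trailing white-space
        return sum;
    }
}
```

Because the state of a partially read number is kept in local variables, a number that spans two reads needs no special handling. For stdin, the channel would be `new FileInputStream(FileDescriptor.in).getChannel()`.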

Holger
  • Either you read into a `ByteBuffer` and decode yourself, or you use [`Channels.newReader(…)`](https://docs.oracle.com/javase/8/docs/api/java/nio/channels/Channels.html#newReader-java.nio.channels.ReadableByteChannel-java.nio.charset.CharsetDecoder-int-) to end up with a `Reader` again. As said, the fundamental operations do not change, especially when you want to read character data. Since you said you want to read mostly decimal numbers separated by white-space, you should try `java.util.Scanner`. – Holger May 28 '20 at 08:02
  • Scanner is very slow https://stackoverflow.com/questions/7049011/whats-the-fastest-way-to-read-from-system-in-in-java –  May 28 '20 at 08:04
  • 1) That linked answer only addresses reading integers, not general decimal numbers. 2) It does not document the testing method. 3) It is nine years old. So it does not provide a general statement regarding performance. But if you think you can do better without `Scanner`, feel free to do it. 4) This answer addresses the question as it has been asked. There is no need to update it. – Holger May 28 '20 at 08:14
  • You only added how to create the channel; it would be better if you added an example that demonstrates how to read the data –  May 28 '20 at 08:28
  • [Pattern matching in Thousands of files](https://stackoverflow.com/a/52062570/2711488) contains some pointers for doing such operations really fast. – Holger May 28 '20 at 08:52
  • I expanded the answer with an example. – Holger May 28 '20 at 10:08
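
The `Channels.newReader(…)` route mentioned in the comments, combined with `java.util.Scanner`, could be sketched like this. `ChannelScanner` and `sum` are hypothetical names, the default charset is an assumption carried over from the answer's example, and `Locale.ROOT` is set so that `.` is accepted as the decimal separator regardless of the system locale.

```java
import java.io.Reader;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Scanner;

final class ChannelScanner {
    /** Sums the leading run of white-space separated decimal numbers read from the channel. */
    static double sum(ReadableByteChannel ch) {
        // -1 lets the implementation choose the buffer capacity
        Reader reader = Channels.newReader(ch, Charset.defaultCharset().newDecoder(), -1);
        Scanner sc = new Scanner(reader);
        sc.useLocale(Locale.ROOT);
        double sum = 0;
        while (sc.hasNextDouble()) {    // stops at the first token that is not a number
            sum += sc.nextDouble();
        }
        return sum;
    }
}
```

This ends up at a `Reader` again, so, as said in the comments, it should not be expected to be faster than wrapping `System.in` directly.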