1

I'm faced with the problem of sorting a sequence of 32-bit signed integers (-2,147,483,648 to 2,147,483, 647) with input size n from 0 to 1m. The input numbers are separated by a new line and the first number in std in represents the number of inputs / numbers. I first read in the input using a buffered input reader

BufferedReader bufferedInput = new BufferedReader(new InputStreamReader(System.in));
int NOI = 0; // Number of inputs
try{
    NOI = Integer.parseInt(bufferedInput.readLine());
} 
catch(IOException e) {
}
int[] a = new int[NOI];
try{
    for(int i = 0; i < NOI; i++){
         a[i] = Integer.parseInt(bufferedInput.readLine());
    }
} 
catch(IOException e){
}

then I preform a radix sort (in place) on my array a, I then proceed to print my sorted array (ascending) to the std output with each number separated by a new line.

for(int i = 0; i < NOI; i++){
    System.out.println(Integer.toString(a[i]));
}

This is an optimization problem, I tested my code on ideone.com using input from appincredible.com/online/random-number-generator/ and found that for an input size of 1000 numbers my run times to 3sf for each step were as follows:

Read input - 14.5 ms

Sorting - 1.28 ms

Printing Output - 56.2 ms

I'm looking to make my solution to this problem more efficient and from the little bit of testing I've done I've noticed that the majority of runtime is on the output phase. How can I then, iterate through my array printing each element to a new line in std out more efficiently? Looking at What is the most efficient way to convert an int to a String? and Most efficient way of converting String to Integer in java suggest that the Integer.toString() is the most efficient for converting my integers to a string for output.

Any pointers? This is my first question here and I would love some help, I've looked around and can't find much like it. Possibly my issue lies in the data structure I use to store my array? I'm not too sure Thanks :)

Community
  • 1
  • 1
esal284
  • 23
  • 4
  • You can try to append all the output to a StringBuilder instance and print it with a single `System.out.print(sb.toString())` statement. – Eran May 08 '16 at 07:56
  • You should warm up your code for any microbenchmark and ignore at least the first two seconds. Just because 1K took 56 ms, it doesn't mean that 100K will take more than double that. – Peter Lawrey May 08 '16 at 08:39
  • Just as side remarks: you should only use variable names in capital letters for `static final` (constant) ones. `NOI` should become `noi`. The second remark is that you have to remaing that parsing an integer can throw a `NumberFormatException`. Maybe you catched it at a higher level though – C.Champagne May 08 '16 at 08:40

1 Answers1

1

I'm assuming this is some kind of programming competition. Otherwise, the answer is to not serialize this data as text in the first place. I would also be really careful introducing those optimizations in any "real world" program.

In such tight loops everything becomes expensive. You need to worry about things you would normally take for granted:

Avoid system calls

System.out.println will print each line right away. This requires a call to operating system - which are kinda expensive. It's much more efficient to gather a batch of output and write it all at once. You can do it manually using StringBuilder or use BufferedWriter:

BufferedWriter out = new BufferedWriter(new OutputStreamWriter(System.out));
for (int i = 0; i < n; i++) {
    out.write(a[i]+"\n");
}
out.flush();

Avoid object allocations

Each Integer.toString call creates new String object. Since the rest of this loop is very simple, this can take significant amount of time. StringBuilder.append(int) will append integer to its buffer without allocating a String:

StringBuilder sb = new StringBuilder();
for(int i = 0; i < n; i++){
    sb.append(a[i]);
    sb.append('\n');
}
System.out.print(sb)

StringBuilder will reallocate its internal buffer when it runs out of space. We know that n numbers won't take more than n*11 characters. We can reduce the number of allocations even further by allocating enough space from the beginning.

Use bytes instead of chars

The output will end up as bytes anyway. By converting numbers directly to bytes we are saving memory and skipping one extra step. Notice the output is written in reverse order to simplify integer conversion:

OutputStream out = System.out;
byte[] buffer = new byte[n*11];
int position = buffer.length;
for (int i = n - 1; i >= 0; i--) {
    int number = a[i];
    buffer[--position] = '\n';
    do {
        buffer[--position] = (byte) ('0' + number % 10);
        number /= 10;
    } while (number != 0);
}
out.write(buffer, position, buffer.length - position);

Before applying any of those ideas make sure to benchmark them first on real data. The most important part of optimizing is testing and making sure your assumptions match reality.

Here are my results for 1k, 10k and 100k lines of input:

                   1k      10k     100k
original           18ms    46ms    117ms
BufferedWriter     6ms     11ms    32ms
StringBuilder      3ms     6ms     23ms
bytes              1ms     2ms     8ms

Last but not least, make sure those optimizations are actually worth the added complexity and possible bugs. In particular, notice the last version doesn't handle negative numbers.

Piotr Praszmo
  • 17,928
  • 1
  • 57
  • 65