
I wrote a program in both C++ and Java to print "Hello World" 100,000 times, and I noticed that the C++ version takes much longer than the Java version: on average, the Java code takes about 6 seconds and the C++ code about 18 seconds, both run from the command line. Can someone please explain why? Thanks.

The programs are named first.java and first.cpp for Java and C++ respectively. I ran them from the command line with java first.java and first.exe.

g++ --version: g++ (Rev6, Built by MSYS2 project) 11.2.0

java --version: java 13.0.2, 2020-01-14

Java Code

class first {
    public static void main(String... args) {
        long start = System.currentTimeMillis();

        for (int i = 0; i < 100000; i++) {
            System.out.println("Hello World");
        }

        long end = System.currentTimeMillis();

        long dur = end - start;
        System.out.println(dur / 1000);
    }

}

C++ Code

#include <iostream>
#include <string>
#include <chrono>

using namespace std;

int main()
{
    auto start = std::chrono::system_clock::now();
    for (int i = 0; i < 100000; i++)
    {
        cout << "Hello World" << endl;
    }
    auto end = std::chrono::system_clock::now();

    std::chrono::duration<double> elapsed_seconds = end - start;
    cout << elapsed_seconds.count() << endl;
}
Jason
Atolz
  • 2
    The two programs are *not* doing the same thing. If you want them to be closer equivalents, add a call to `System.out.flush();` to the loop in your Java code. – Konrad Rudolph Dec 07 '22 at 09:58
  • 1
    You may also be interested by [this](https://en.cppreference.com/w/cpp/io/ios_base/sync_with_stdio#:~:text=If%20the%20synchronization%20is%20turned%20off%2C%20the%20C%2B%2B%20standard%20streams%20are%20allowed%20to%20buffer%20their%20I/O%20independently%2C%20which%20may%20be%20considerably%20faster%20in%20some%20cases.) which could significantly increase the speed of IO operations. – Fareanor Dec 07 '22 at 10:01
  • 3
    @KonradRudolph IIRC, `System.out` is line-buffered by default, so `System.out.println()` will flush the buffer on every call. – Peter Dec 07 '22 at 10:01
  • @463035818_is_not_a_number please how would I do that, thanks – Atolz Dec 07 '22 at 10:03
  • 1
    @Peter Yeah that's true. Still, there's an extra function call in the C++ code which is absent in the Java code. Admittedly with IO bound code this should be completely negligible but if OP *does* see a difference it must come from *somewhere* (well, C++ IO also does some additional unnecessary things to sync with C IO). – Konrad Rudolph Dec 07 '22 at 10:04
  • 2
    Replacing `endl` with `\n` may increase the performance because `endl` flushes the buffer. – Arman Oganesyan Dec 07 '22 at 10:09
  • Does this answer your question? ['printf' vs. 'cout' in C++](https://stackoverflow.com/questions/2872543/printf-vs-cout-in-c) – sinclair Dec 07 '22 at 10:09
  • @ArmanOganesyan Already discussed in the comments above. – Konrad Rudolph Dec 07 '22 at 10:10
  • @Fareanor wow, I tried it (`std::ios::sync_with_stdio(false)`) and it works, thanks. Can you explain why it helps? From the docs, does the synchronization mean that the C++ output stream calls the C output stream, thus adding overhead? The code now runs in 7 seconds. – Atolz Dec 07 '22 at 10:26
  • @KonradRudolph yeah, it seems some syncing is causing the slowdown – Atolz Dec 07 '22 at 10:29
  • 1
    @Atolz You should read the [given answer](https://stackoverflow.com/a/74715010/11455384), it's pretty good and clear :) – Fareanor Dec 07 '22 at 10:31
  • @ArmanOganesyan wow, thanks, using `\n` instead of `endl` improved it further; it now runs in 1.5 seconds. What an improvement! Turning off synchronization and substituting `\n` did it. Thank you all. – Atolz Dec 07 '22 at 10:39
  • @Atolz -- it's unfortunate that `std::endl` has become the default line-ending for many programmers. It's almost never appropriate. As you've seen, `'\n'` ends a line. – Pete Becker Dec 07 '22 at 14:26

1 Answer


There are several relevant differences between your C++ and Java code:

  1. By default, C++ IO streams synchronise their state with the underlying C streams. This takes time. To avoid this (which you can do only if you know that your code does not mix C and C++ IO operations!), add the following at the beginning of your `main` function:

    std::ios_base::sync_with_stdio(false);
    
  2. `cout << endl;` is equivalent to `cout << "\n" << flush;` (which, in turn, is equivalent to `cout << "\n"; cout.flush();`). The `flush` call is absent from your Java code. You could add it to your Java code or, better, remove it from your C++ code: you almost never need to use `endl`/`flush`. Instead, just use

    cout << "Hello World\n";
    

    As noted by Peter in the comments, most systems flush the stdout stream on newline anyway (at least when attached to a terminal) so one might expect this not to make a difference. However, it does make a (substantial!) difference e.g. when piping the output to a file.

  3. Your Java benchmark code truncates fractional seconds. To show those fractions of seconds (relevant since the code runs in <1s!), change the relevant line to

    System.out.println(dur / 1000.0);
    
  4. Be sure to compile your C++ code with optimisations enabled; with GCC/clang/ICC, you do this by passing `-O2`. MSVC has a similar flag, `/O2` (there are higher optimisation levels but they have particular issues; `-O2` is pretty much the default setting people use).

  5. Conversely, `java first.java` will first compile the code every time you invoke it. To make the comparison fair, be sure to run `javac first.java` ahead of time, and then execute the code via `java first`.

Making these changes causes the C++ code to overtake the Java code on my system. This is most noticeable when increasing the loop size from 100,000 to 1,000,000: the C++ code now runs in milliseconds, while the Java code takes several seconds (be sure to pipe the output to a file! Otherwise you will be purely measuring the latency/rendering speed of your terminal, not the performance of the code).

Konrad Rudolph
  • 1
    If you want to show off the capabilities of an ahead-of-time compiler, `-O3 -march=native` will auto-vectorize loops where that looks useful, and use ISA extensions supported by your CPU. Of course this code probably only needs `-O1` since it spends all its time calling a library function, `iostream::operator<<(const char*)`; a tight loop around that is trivial for a compiler to optimize well enough. But yeah, `-O2` is a reasonable minimum for benchmarking, and somewhat realistic since a lot of code does get built with that option and without any `-march` beyond baseline x86-64 or AArch64. – Peter Cordes Dec 07 '22 at 15:02