I have a 2 GB file. The average line is 15 characters long (max 50). When using:

#include <iostream>
#include <fstream>
#include <string>

void main()
{
    std::ifstream input("myfile.txt");
    std::string line;
    while (std::getline(input, line))
    {
    }
    return;
}

it takes ~320 seconds, whereas this Python code:

with open('myfile.txt', mode='r') as f:
    for l in f:
        semicolonpresent = ';' in l        # do anything here, not important

takes less than one minute.

What is wrong in the C++ version I'm using?

Note: I've tried both many times, each one after a fresh reboot or after many previous runs (so the file might be in the I/O cache), but I always get timings of this order of magnitude.

Note2: I compiled the C++ code on Windows 7/64, using:

call "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" x86
cl main.cpp
Basj
  • Did you compile with optimization turned on? – Galik Jan 07 '18 at 16:06
  • @Galik I edited question with compilation lines – Basj Jan 07 '18 at 16:07
  • Well, your Python code has very little similarity with the C++ code. If you want to compare performance, I recommend having them do the same thing. Also, using the C functions may be faster – JoshKisb Jan 07 '18 at 16:10
  • It is `int main() {....return 0;}` – Killzone Kid Jan 07 '18 at 16:12
  • @JoshKisb Why little similarity? I used in both cases the natural way to read a file line-by-line. `with open('myfile.txt', 'r') as f: for l in f: ...` for Python, and this for C++: https://stackoverflow.com/a/7868998/1422096 – Basj Jan 07 '18 at 16:13
  • `cl main.cpp` lacks most of the fancy arguments the compiler accepts. You'd better create the program using the default VS console application template. And don't forget to select the `Release` configuration when building. – user7860670 Jan 07 '18 at 16:15
  • I can't reproduce these results using `Linux` and `GCC 7.2`; in my tests `C++` is much faster (about 5 times faster over approximately `1GB`) – Galik Jan 07 '18 at 16:22
  • Did you try comparing the times just opening the large file in C++ and Python and not reading the lines ? – StPiere Jan 07 '18 at 16:22
  • @VTT I usually work with the command line and a text editor only, so I'm not familiar with VS or these `cl` arguments. Which ones are usually used? – Basj Jan 07 '18 at 16:29
  • I recall a very similar question about ruby in the last few days, it turned out to be a difference between optimized and non-optimized build. As soon as the c++ compiler was set to optimize the difference went away. – SoronelHaetir Jan 07 '18 at 16:29
  • With more accurate timing and programming on my system `C++` is about `3` times faster over `1GB` but over `7GB` they are about the same (with `C++` in the lead by a few seconds). – Galik Jan 07 '18 at 16:37
  • Actually this task is not that good for performance comparison because it is mostly io-bound. python technically can call the same C functions as C++ to perform actual reads, while nobody really uses iostreams to read files in C++. – user7860670 Jan 07 '18 at 16:48
  • @VTT `nobody really uses iostreams to read files in C++` then I was misinterpreting https://stackoverflow.com/questions/7868936/read-file-line-by-line/7868998#7868998. How would you read a big file line by line in C++? – Basj Jan 07 '18 at 16:50
  • I mean that nobody really uses them in real-life applications, not that they cannot be used at all. See [What serious alternatives exist for the IOStream library](https://stackoverflow.com/questions/6171360/what-serious-alternatives-exist-for-the-iostream-library-besides-cstdio) for example. The lack of proper I/O routines in C++ is a serious language pitfall that forces devs to look for alternatives. Unfortunately the standard committee is too busy adding stuff like emoji syntax or Riemann zeta functions. I would use home-brewed mmap or async reading wrappers. – user7860670 Jan 07 '18 at 16:59
  • @VTT We'd love to prove Riemann hypothesis with C++ code :) – Basj Jan 07 '18 at 17:02

3 Answers

In my tests C++ is not slower than Python. Over a large binary file they are about the same.

I modified your code a little to provide what I feel should be accurate timings.

Python code:

import sys
import time 

start = time.time()
sum = 0
with open(sys.argv[1], mode='r') as f:
    for l in f:
        sum = sum + 1
end = time.time()

print "(", sum, ")", (end - start), "secs"

C++ code:

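#include <chrono>
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char* argv[])
{
    if (argc < 2)
        return 1;   // expects the file name as the first argument

    auto start = std::chrono::steady_clock::now();

    std::ifstream ifs(argv[1]);
    std::string line;

    unsigned sum = 0;
    while(std::getline(ifs, line))
        ++sum;

    auto end = std::chrono::steady_clock::now();

    auto time = double(std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count());

    // same output format as the Python version
    std::cout << "( " << sum << " ) " << (time / 1000.0) << " secs\n";
}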

Processing a 7.1 GiB binary video file gives:

Python: ( 32547618 ) 62.9722070694 secs
C++   : ( 32547618 ) 63.368 secs

The C++ code was compiled on GCC v7.2 using the following flags:

g++ -std=c++17 -O3 ...

By contrast, the unoptimized build ran in 68.541 secs.

Galik
  • Thanks for your answer. Can you add details about how you compiled (compilation parameters, optimization, etc.)? I noticed it changed the timing a lot. – Basj Jan 07 '18 at 19:10
  • @Basj Sure, I added the optimization level. In my tests it didn't make much difference. It seems the larger the file, the smaller the difference between the methods. I would expect this because most of the time is spent waiting for the disk to spin :) – Galik Jan 07 '18 at 19:20

`void main` isn't valid C++. Neither is an empty `return` in `main()`. Also, with optimizations turned on, your `while` loop will be optimized away. With that said...

You should notice a speed improvement if you turn optimizations on. However, streams in C++ are slow. This is not news to the C++ world. So why is Python much faster? The two are not really equivalent. Python is written in C and internally delegates to C library functions. Further, it's likely that Python is allocating a buffer in advance and copying the buffer in one go. If you wanted to do something similar in C++:

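#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// Open at the end of the file so that tellg() reports its size.
std::ifstream ifs(filename, std::ios::binary | std::ios::ate);

auto end = ifs.tellg();
ifs.seekg(0, std::ios::beg);
auto beg = ifs.tellg();

std::string buffer;
buffer.reserve(end - beg);
// reserve() only sets capacity, so append with back_inserter
// instead of writing through buffer.begin().
std::copy(std::istreambuf_iterator<char>(ifs),
          std::istreambuf_iterator<char>(),
          std::back_inserter(buffer));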
  • How to turn optimization on with `cl helloworld1.cpp` ? – Basj Jan 07 '18 at 16:45
  • Doing `buffer.reserve(end - beg);` might not end well if there isn't enough spare ram to allocate solid 2 GB block. – user7860670 Jan 07 '18 at 16:45
  • @Basj [`cl` optimization flags](https://learn.microsoft.com/en-us/cpp/build/reference/o-options-optimize-code). A basic one would be `cl /O2 helloworld1.cpp`, though by default I'm pretty sure it already uses `/Ot`, which uses a set of speed optimizations. – wkl Jan 07 '18 at 16:45
  • In my tests streams are not slow, they perform basically the same as the underlying `OS` calls (`Linux` systems). – Galik Jan 07 '18 at 16:48
  • @VTT The worst that can happen is an exception is thrown. – user9183966 Jan 07 '18 at 16:48
  • @user9183966 The idea of reading line by line (instead of reading the whole file into memory) is to avoid taking up a lot of RAM. Let's say each line just updates an `int` state variable in the loop. Then the whole thing shouldn't use more than 1 MB of RAM, even if the file is 2 GB. So I don't really see why you're using a buffer, reserve, etc. – Basj Jan 07 '18 at 16:52
  • @Galik *In my tests streams are not slow, they perform basically the same as the underlying OS calls (Linux systems)* Get faster disks and a system with more memory bandwidth. If you're running commodity hardware, disk IO seek times are likely very poor, and memory bandwidth is likely limited. That can hide the overhead of streams processing. Most computer buyers don't look past the CPU and the amount and speed of system RAM. Even for buyers who look at the drive(s), very few look at the disk controller. Streams processing likely stays entirely in the CPU and the on-board cache, masking it. – Andrew Henle Jan 07 '18 at 17:02

(Partial) answer:

As mentioned by many people in comments, cl optimization helped:

cl /Ox helloworld1.cpp

divided the runtime by 6!
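
To compare two builds of the same program, timing the read loop from inside the program keeps process start-up out of the measurement. A minimal sketch, assuming a hypothetical helper named `seconds_taken` (not part of any library):

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename Work>
double seconds_taken(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

The `std::getline` loop can be wrapped in a lambda and passed to `seconds_taken`, and the result printed once for each build.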

Basj