
I just wrote a C++ program that lists all the directories in a folder recursively. I'm using Boost.Filesystem, and I have built the following code with static linking:

#include <iostream>
#include <vector>
#include <boost/filesystem.hpp>
using namespace boost::filesystem;
using namespace std;

int main(int argc, char* argv[]) {
    const path current_file_path(current_path());

    try {
        std::vector<path> directories_vector;
        for (auto&& x : recursive_directory_iterator(current_file_path))
            if (is_directory(x.path()))
                directories_vector.push_back(x.path());
        /* Reaching this point takes longer than the equivalent Python or Ruby code */

        for (auto&& x : directories_vector) {
            cout << x << '\n';
        }
    }
    catch (const filesystem_error& ex) {
        cout << ex.what() << '\n';
    }

    cin.get();
    return 0;
}

I wanted to see how fast this code runs compared with Python and Ruby. I know I/O-bound work is not a good basis for evaluating code performance, but the C++ executable takes nearly 3 seconds for 15+ recursive folders, while the following Ruby and Python snippets finish almost instantly:

Ruby:

Dir.glob("**/*/")

Python:

[x[0] for x in os.walk(directory)]

All of the code runs on an SSD. I'm using Visual Studio 2017, Python 3.5.2 and Ruby 2.4 on Windows. The C++ code is built in Release/x64 mode with optimization set to Maximum Optimization (Favor Speed) (/O2).

Why is the C++ code slower when faced with lots of recursive folders?

Cypher
  • Are you using an optimized build in C++? I have seen cases in Visual Studio where debug builds took 100 times longer than release builds (a few seconds in Release versus over a day in Debug). – drescherjm Nov 02 '18 at 15:21
  • @drescherjm It's in Release/x64 mode & Optimization is set to : Maximum Optimization (Favor Speed) (/O2) – Cypher Nov 02 '18 at 15:22
  • Is this repeatable? – drescherjm Nov 02 '18 at 15:23
  • why are you calling `cin.get`? – Sam Mason Nov 02 '18 at 15:24
  • @SamMason Just so that I can get one last look at the results before the console closes. – Cypher Nov 02 '18 at 15:24
  • @drescherjm I don't quite get what you mean by repeatable. – Cypher Nov 02 '18 at 15:26
  • @Cypher OK, just making sure you're not including that in your timing… – Sam Mason Nov 02 '18 at 15:27
  • ***I don't quite get what you mean by repeatable.*** Does it happen more than once in a row? I was worried that the results could be skewed by the OS cache or by antivirus interference, for example if you ran the `c++` version first, timed it once, and then ran the other two. – drescherjm Nov 02 '18 at 15:28
  • @drescherjm Yes... I had this suspicion, but no matter how many times you run the C++ code, it always takes nearly the same amount of time as it did the first time. I'm guessing some bootstrapping and initialization is taking the time. – Cypher Nov 02 '18 at 15:30
  • I think it would be interesting if we had full minimal examples (with timing code) for `c++` and at least one of the other 2. – drescherjm Nov 02 '18 at 15:36
  • @drescherjm I'm new to C++ and I'm not familiar with the best approach to timing the loop. Any recommendations? – Cypher Nov 02 '18 at 15:49
  • Your `c++` code looks fine to me. Not sure why it is slower. – drescherjm Nov 02 '18 at 15:50
  • @drescherjm thanks for your persistence, at least I know the code isn't badly written. – Cypher Nov 02 '18 at 15:55
  • Here is another similar question (it does not answer the question however it does give some advice): https://stackoverflow.com/questions/13897401/boost-filesystem-incredibly-slow – drescherjm Nov 02 '18 at 16:08
  • By the way, this question shows again why statements such as "language X is slower than language Y" are complete and utter nonsense. – Jörg W Mittag Nov 02 '18 at 16:44
  • FYI. I tried it on Ubuntu on the Ruby source tree with 5600 files: C++ version 0.045s, Ruby version 0.070s. – Casper Nov 02 '18 at 17:32

1 Answer


By running both the C++ version and the Ruby version under strace, we can get some clues as to why the C++ version is slower.

Using the Linux source code for testing (65000 files):

strace -o '|wc' cpp_recursion
  86417  518501 9463879

strace -o '|wc' ruby -e 'Dir.glob("**/*")' 
  30563  180115 1827588

We see that the C++ version makes almost 3x as many system calls as the Ruby version (about 86,000 lines of strace output versus about 31,000).

Looking more closely at the strace output you will find that both programs use getdents to retrieve directory entries, but the C++ version runs lstat on every single file, while the Ruby version does not.
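To illustrate how a traversal can avoid calling lstat on every entry, here is a minimal POSIX-only sketch. It is not how Boost or Ruby is implemented (presumably Ruby does something along these lines, but that is an assumption); `list_directories` is a hypothetical helper name. It relies on the `d_type` field that readdir fills in on Linux and the BSDs, and falls back to lstat only when the filesystem reports `DT_UNKNOWN`.

```cpp
#include <dirent.h>    // opendir, readdir, closedir (POSIX)
#include <sys/stat.h>  // lstat, S_ISDIR
#include <cstring>     // std::strcmp
#include <string>
#include <vector>

// Appends every directory found below `root` to `out`.
// Hypothetical helper for illustration; symlinks are not followed.
void list_directories(const std::string& root, std::vector<std::string>& out) {
    DIR* dir = opendir(root.c_str());
    if (dir == nullptr) return;  // missing or unreadable directory: skip it

    while (dirent* entry = readdir(dir)) {
        const char* name = entry->d_name;
        if (std::strcmp(name, ".") == 0 || std::strcmp(name, "..") == 0) continue;

        const std::string child = root + "/" + name;
        bool is_dir = false;
        if (entry->d_type == DT_DIR) {
            is_dir = true;  // type came with the directory entry, no extra syscall
        } else if (entry->d_type == DT_UNKNOWN) {
            // Some filesystems do not report the type; only then pay for lstat.
            struct stat st;
            is_dir = lstat(child.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
        }

        if (is_dir) {
            out.push_back(child);
            list_directories(child, out);  // recurse into the subdirectory
        }
    }
    closedir(dir);
}
```

Running a program built around this under strace should show getdents calls with few or no lstat calls on filesystems that report `d_type`. This does not carry over to Windows, where directory enumeration uses a different API.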

I can only conclude that the C++ version is not implemented as efficiently as the Ruby version (or that it possibly serves a different purpose). The speed difference is not a language issue, but an implementation issue.

N.B. The C++ version built with -O optimization runs in 0.347s, while the Ruby version runs in 0.304s. At least on Linux, lstat does not seem to incur much overhead. Perhaps the situation is different on Windows.
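If you want to time only the traversal rather than the whole process (the comments on the question ask how), one option is a small wrapper around `std::chrono::steady_clock`. This is a sketch, not the method used for the numbers above; `elapsed_ms` is a hypothetical helper name.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
// Hypothetical helper for illustration; not part of Boost or the standard library.
template <typename Work>
double elapsed_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

In the program from the question, the first loop could be wrapped in a lambda passed to `elapsed_ms`, with the result printed before the call to `cin.get()`, so that neither console output nor waiting for a key press is included in the measurement.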

Casper
  • ***Perhaps the situation is different on Windows.*** I think the overhead is larger on Windows than on Linux. – drescherjm Nov 02 '18 at 18:50
  • @drescherjm Yeah, seems like a good guess. OP's "nearly instant" vs 3 seconds is quite a large difference, which is hard to explain otherwise. – Casper Nov 02 '18 at 18:59
  • @Casper Thanks so much for your thorough and detailed answer. By the way, is there any other library than boost that can give higher speeds? – Cypher Nov 02 '18 at 19:07
  • Maybe try this with Visual Studio 2017: https://stackoverflow.com/questions/50668814/vs2017-e0135-namespace-std-has-no-member-filesystem – drescherjm Nov 02 '18 at 19:26
  • @Cypher I'm not sure. You might look through the answers here: https://stackoverflow.com/questions/67273/how-do-you-iterate-through-every-file-directory-recursively-in-standard-c/32889434 . Seems there is a way to do it with the std library now too, but have no info on its speed. – Casper Nov 02 '18 at 19:42
  • @drescherjm I rewrote the code with C++17 and std::filesystem (which has the same API as Boost) and the results are the same as before. It really seems that the Windows implementation is doing something slow compared to Linux. Thanks again. – Cypher Nov 02 '18 at 19:48
  • ***the results are the same*** I was hoping that Microsoft found an optimization. – drescherjm Nov 02 '18 at 20:00