2

I have created Andorid project to measure some equivalent pices of code and this one had me stuck on why? Why is C++ 3x slower here?

I already made some tweaks to C++ as immediate try to emplace and then push_back was slower than current approach, but still. Complexities here should be same, right?

Kotlin: 139,945,691 ns (139 ms)

C++:   347,100,764 ns (347 ms)

Kotlin:

data class Record(
    val year: Int,
    val month: Int, 
    val day: Int, 
    var temperature: Double
)

val records = ArrayList<Record>()
val map = HashMap<String, ArrayList<Record>>()

records.forEach { map.getOrPut("${it.year}-${it.month}") { ArrayList() }.add(it) }

C++

typedef struct record {
    int year;
    int month;
    int day;
    double temperature;
} Record;

std::vector<Record> records;
std::unordered_map<std::string, std::vector<Record>> map;

for (const auto &record : records) {
    const std::string & key = std::to_string(record.year) + " " + std::to_string(record.month);
    const auto & it = map.find(key);

    if (it == map.end()) {
        map.emplace_hint(it, key, std::vector<Record>())->second.push_back(record);
    } else {
        it->second.push_back(record);
    }
}

// Edit

Broader C++ code: https://pastebin.com/KqD02pSD

Broader Kotlin code: https://pastebin.com/iG7hCqHT


Important Edit

I have changed key of maps to Int - [year * 100 + month]. And results are still similar; 3x slower.

Simson
  • 3,373
  • 2
  • 24
  • 38
Majkeee
  • 960
  • 1
  • 8
  • 28
  • 4
    Which compiler settings did you use for optimizing the c++ code? – πάντα ῥεῖ Mar 16 '19 at 13:47
  • 1
    For the C++ snippet, the loop body can be reduced to a single line: `map[std::to_string(record.year) + " " + std::to_string(record.month)].push_back(record);`. – HolyBlackCat Mar 16 '19 at 13:53
  • @πάνταῥεῖ -0fast, but compared to (default?) -O0 optimization there was very little difference. – Majkeee Mar 16 '19 at 13:54
  • @HolyBlackCat thanks, didnt know [] would insert default value. But sadly no, its actually a little slower. – Majkeee Mar 16 '19 at 13:59
  • 1
    Huh. Then can you post complete benchmarks (with all includes, with time measuring functions added, and so on), so we can reproduce your results? – HolyBlackCat Mar 16 '19 at 14:03
  • @HolyBlackCat cannot give everything, but i edited question with hopefully enough code. – Majkeee Mar 16 '19 at 14:16
  • Now, change the unordered map's key to `std::tuple` (with the appropriate adjustments to the rest of the code, with [a simple custom hash function](https://stackoverflow.com/questions/7110301/generic-hash-for-tuples-in-unordered-map-unordered-set)) and try again. The shown code is torturing precious electrons by useless integer to string conversions, and even more useless string concatenations, that don't accomplish anything useful, except to produce a string for comparison purposes; which is something that can be easily accomplished by keeping everything as original `int`s. – Sam Varshavchik Mar 16 '19 at 14:19
  • 2
    @SamVarshavchik while what you say isn't wrong were this about c++ only; this is about the comparison between java and c++; and the key should be the same for both maps; else you're comparing apples and eggs. – UKMonkey Mar 16 '19 at 14:26
  • 2
    No @UKMonkey , it's comparing literal identical code that's the apples vs oranges comparison. Java and C++ are fundamentally different languages. They are optimized in different ways, and with different techniques. Hobbling native C++ and torturing it into hobbling along like Java proves nothing. Take a native C++ implementation, that takes full advantages of the unique features of the C++ language, and try to cobble together a Java program that works literally the same way, and then evaluate how meaningful the results are. – Sam Varshavchik Mar 16 '19 at 14:30
  • @SamVarshavchik I will do it later for sure. But it is not purpose of this question nor test. Yes, it makes no sense to have exactly same code for Java and C++ as C++ can do stuff much better with lower API, however, in this simple scenario I am just asking what did I miss that Java is faster? – Majkeee Mar 16 '19 at 14:32
  • Java trades away bigger runtime memory footprint for performance. It preallocates dynamic memory like crazy. When it sees that a string concatenation is on the horizon, it estimates and allocates enough memory it thinks the concatenated string will need, hopefully just once. The C++ equivalent does piecewise dynamic heap allocation, by each overloaded `+` operator. Now, try declaring a single `std::string`, having it `reserve()` enough memory for the concatenated string, then manually `append()` each component of the aggregate string key into the string. ***That***'s the logical equivalent. – Sam Varshavchik Mar 16 '19 at 14:38
  • @SamVarshavchik Alright, please see my edit then. – Majkeee Mar 16 '19 at 15:00
  • 1
    There's nothing to see here. All questions on stackoverflow.com must include all pertinent information ***in the question itself, as plain text***. Dodgy links to suspicious external web sites, that can stop working and make the question meaningless, are off-topic. – Sam Varshavchik Mar 16 '19 at 15:05
  • @SamVarshavchik i meant the last one, that i removed all string, so all your points are invalid. – Majkeee Mar 16 '19 at 15:21
  • Two suspects: 1) memory allocation speed 2) C++'s hashtable is usually implemented in a not-too-good way: bucket index is calculated by a modulo (instead of an `and`). – geza Mar 16 '19 at 15:32
  • To reiterate, please post _all_ relevant code here, including how you are doing microbenchmarks. The question becomes pointless without the links. – Passer By Mar 16 '19 at 15:53
  • "The last one" shown in this question still shows all the same, useless string operations. I'm definitely wasting my time on this question. Good luck with Java, hope it works out for you. – Sam Varshavchik Mar 16 '19 at 16:00
  • Where is the data so we can compare? – Martin York Mar 16 '19 at 18:49

1 Answers1

1

It's always hard to say why exactly a certain piece of code performs the way it does without properly profiling. Especially when talking about C++ where there's no standard implementation and, thus, the quality of your compiler and library implementation can vary significantly depending on which toolset you're using (examples: 1, 2). Furthermore, the interface that the standard requires for unordered_map, unfortunately, severely limits what the implementation underneath can do (check out this talk for more about that). It is generally known that std::unordered_map is, unfortunately, not necessarily the best hash table one could hope for (some more on that).

Apart from the performance issues of std::unordered_map itself, there's a few minor (most likely insignificant) things that may be putting your C++ code at an additional disadvantage here. First of all, you're building the key string and then copy it into the map. It would likely be more efficient to move the string. Also, std::to_string is potentially more costly than the conversion performed by Kotlin string interpolation because std::to_string is forced to observe the current locale while Kotlin string interpolation converts to a fixed format. It generally seems rather wasteful to use strings as keys here, as also already pointed out in the comments to your question.

I would suggest to use map.try_emplace() rather than map.emplace_hint() and std::to_chars() instead of std::to_string(). Apart from that, I wouldn't be surprised if the Kotlin HashTable is just a generally more efficient container than std::unordered_map could ever be due to the limitations set by its interface…

All that being said, I'm not sure what exactly you're attempting to achieve with this benchmark. At the end of the day, what you're doing here is compare the performance of two randomly chosen hash table implementations. You will never be able to pick one over the other as they exist in two completely separate ecosystems, so whatever the results of this benchmark are, they don't seem very…useful!?

Michael Kenzel
  • 15,508
  • 2
  • 30
  • 39
  • I changed the key to Int (year * 100 + month) and got same results. So if nobody has any other idea I will have to acknowledge that unordered_map is not as good as Kotlin's HashTable. – Majkeee Mar 16 '19 at 14:43
  • It likely isn't due to the reasons given in my answer above. I'm not sure what exactly that would show, but yes… – Michael Kenzel Mar 16 '19 at 14:46
  • @MichaelKenzel All you can "acknowledge" is that you are comparing badly written C++ against optimal Java. We can guess at why you don't get an optimal C++ result but unless we have all the information to reproduce your results then all we can do is guess. – Martin York Mar 16 '19 at 18:56
  • @MartinYork Huh? I am comparing fundamental basics of programming in two languages. Hash table in Java and C++. We all know C++ is by nature faster, so the question was what's wrong with std's hash table. All the relevant code is in question and even more code is in linked pastebin. – Majkeee Mar 16 '19 at 19:58