2

This is a follow up question to Char* vs String Speed in C++. I have declared the following variables:

std::vector<std::string> siteNames_;
std::vector<unsigned int> ids_;
std::vector<std::string> names_;

I call this function tens of thousands of times and is a major bottleneck. Is there a more efficient way to compare strings? The answer must be cross-platform compatible.

unsigned int converter::initilizeSiteId(unsigned int siteNumber){
    unsigned int siteId = 0;
    for (unsigned int i = 0; i < ids_.size(); i ++){
        if (siteNames_[siteNumber].compare(names_[i]) == 0){
            siteId = ids_[i];
            break; // Once found, will stop searching and break out of for loop
        }
    }
    if (siteId == 0)
        std::cerr << "Could not find ID for site number " << siteNumber << std::endl;

    return siteId;
}
Community
  • 1
  • 1
Elpezmuerto
  • 5,391
  • 20
  • 65
  • 79
  • 1
    I don't see a definition for `names_`. In any case, if you're doing lookup why don't you use a `std::map` or `std::/boost::/tr1::unordered_map` instead of a linear search? That's what they're for. – GManNickG Oct 21 '10 at 22:34
  • @GMan, typo of the declarations sorry. I have never used `std::map` or that particular boost function. I will look into those. Would you be able to provide an example please? – Elpezmuerto Oct 21 '10 at 22:35
  • How big is siteNames_? If you can guarantee that it will always be in sorted order, then you could try a binary search. And to really optimize, store siteNames_[siteNumber] in a temporary variable to prevent looking it up constantly. EDIT: To echo the other comments, a map would handle binary search for you. – chrisaycock Oct 21 '10 at 22:38
  • @Chrisaycock, unfortunately siteNames_ is not in sorted order – Elpezmuerto Oct 21 '10 at 22:49

2 Answers2

6

Use a map or unordered map instead. Then you can do this:

std::map<string, int>names_;
// ...

unsigned int converter::initilizeSiteId(unsigned int siteNumber){
    unsigned int siteId = 0;
    std::map<string, int>::iterator i = names_.find(siteNames_[siteNumber]);
    if (i != names_.end()){
        siteId = i->second;
    }
    else (siteId == 0)
        std::cerr << "Could not find ID for site number " << siteNumber << std::endl;

    return siteId;
}

This will perform in O(log n) time rather than the O(n) you had before.

There are other options if you have a sorted list, such as binary search.

JoshD
  • 12,490
  • 3
  • 42
  • 53
  • I am getting errors when I try to compile this code, I also have tried `std::map::iterator i` and that fails. What is mapiterator? – Elpezmuerto Oct 21 '10 at 22:47
  • @Elpezmuerto: I've updated it a bit. You need template parameters, and `names_` needs to be a map, and it should match your names to the ids. So `names_["bob"] = 2` for example. – JoshD Oct 21 '10 at 22:49
  • @Elpezmuerto: Well, what are the errors and what is your code? Also, this should be covered in a C++ book, [do you have one](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list)? – GManNickG Oct 21 '10 at 22:50
  • @GMan, I tend to favor C++ Primer, 5th Edition http://www.amazon.com/Primer-Plus-5th-Stephen-Prata/dp/0672326973 – Elpezmuerto Oct 21 '10 at 22:54
  • @Elpezmuerto: well surely that book has a section about maps. Reading through that will give you a lot of very useful knowledge. I highly recommend it. – JoshD Oct 21 '10 at 22:56
  • Actually maps are barely covered, they only have two pages in Appendix G and thats it, hence probably why I am stuck now :p – Elpezmuerto Oct 21 '10 at 22:59
  • @JoshD, I know this is asking alot, but can you provide a map solution to http://stackoverflow.com/questions/3989111/char-vs-string-speed-in-c. `names_.insert(std::pair(name,id));` isn't working and not sure why – Elpezmuerto Oct 21 '10 at 23:00
  • @Elpez: Ah, you need to be more careful. The title is C++ Primer Plus, not C++ Primer. It matters, especially in this case since C++ Primer is it's own book. In that book (the one you linked), turn to page 1101. Unfortunately, that's a pretty bad book, which is why we don't have it on the list. It's always a bad thing having a bad book. :( – GManNickG Oct 21 '10 at 23:01
  • @Gman, haha that it is, I will have to get some of the recommended books from SO. I tend to use www.cplusplus.com before anything else – Elpezmuerto Oct 21 '10 at 23:02
  • @Elpezmuerto: if that snippet of code is accurate, then you misspelled `int`. – JoshD Oct 21 '10 at 23:05
  • @JoshD, hahaah that was it! I feel very relieved that I am not completely worthless. Thanks so much, I have seen a significant boost in performance! – Elpezmuerto Oct 21 '10 at 23:08
0

If you often look up just a few different siteNumber and call it enough times it could be worthwile to implement a cache to store the latest siteNumber:s. Although since you're only working in memory and not to/from disk I doubt it.

dutt
  • 7,909
  • 11
  • 52
  • 85