1

I have some string comparison logic in my program, e.g:

std::unordered_set<std::string> relational_operators{
    "==",
    "!=",
    ">",
    "<",
    ">=",
    "<="
};

bool is_relational(std::string token) {
  relational_operators.contains(token);
}

if (is_relational(token)) {
  // ...do stuff
}

All values of set are known at compile time, but checks will be done on user input. How such strings are usually stored in c++? I don't know if storing a set like this is a good idea, probably not because it can possibly throw an error when allocation happens (IDE warnings).

So for example if i had another set of strings (supported operators):

std::unordered_set<std::string> supported_operators {
  // ...
};

Support for new operators will be added over time. So i just want to add a new operator to a set. So basically i want to avoid situation like this:

bool is_supported_op(std::string token) {
  return token == "<" || token == ">" || token == "!="; // ... many more ||
}

east1000
  • 1,240
  • 1
  • 10
  • 30
senloa
  • 181
  • 1
  • 8
  • Does this answer your question? [How to deal with static storage duration warnings?](https://stackoverflow.com/questions/48938090/how-to-deal-with-static-storage-duration-warnings) – Nolan Sep 10 '21 at 06:31

1 Answers1

4

Give that you're apparently not planning to modify the set of strings at run-time, I'd probably use an std::array<std::string, N> to hold them, then use std::binary_search to do the search.

From a theoretical viewpoint, you get O(log N) lookups either way--but in reality, the array is likely to give enough better cache locality to improve performance quite a bit (especially if you're using a modern implementation of std::string that implements the short string optimization).

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • `std::binary_search` needs a sorted array, so do we need to manually keep the array sorted? – prehistoricpenguin Sep 10 '21 at 06:55
  • 1
    @prehistoricpenguin: You could either keep it sorted manually, or use a call to `std::sort` during initialization. I'd usually prefer the latter--more proof against boneheaded programmers (of which I'm often one). Theoretically wastes some time during startup, but given the number of items likely to be involved, not enough to notice. – Jerry Coffin Sep 10 '21 at 06:59
  • @康桓瑋: Yes, as long as you're sure it'll be small, `find` should be fine--but part of the question says "many more"--which might mean a dozen or might mean hundreds. If it's dozens, go with `find`. If it's hundreds or thousands, `binary_search` is probably better. – Jerry Coffin Sep 10 '21 at 06:59
  • 3
    If performance becomes critical, you can likely apply perfect hashing - statically generate a hash function and a lookup table, generating a .cpp file on the fly. It could also make sense to determine the maximum length, and skip a lookup for entries that are too long. (especially here, where operators are probably no longer than 3 characters and likely mixed with longer identifiers) – MSalters Sep 10 '21 at 07:13
  • @MSalters There exists a repo https://github.com/serge-sans-paille/frozen, to get constexpr unordered_map – prehistoricpenguin Sep 10 '21 at 07:23