13

Let's say you have an std::unordered_set<std::string> .

You have an std::string_view object that you want to search for in the container. Problem is, you don't want to create a std::string from your std::string_view, as this kind of defeats the purpose of using std::string_view in the first place.

However, it seems that std::string_view should be usable as a key; there should be some way to compare std::string_view and std::string , as they both basically represent the same thing. But there isn't, not in the STL anyway.

This is an impasse, am I forced to write my own comparison object for std::string_view and std::string objects to use with my std::unordered_set ?

edit: This question is specific to string_view objects. The 'duplicate' question is not relevant. I received a unique answer to a unique question, as expected.

  • 1
    The property you are looking for is commonly called "transparent lookup", and I fear nothing has changed since that answer. :( – Max Langhof Aug 08 '18 at 14:45
  • 1
    @MaxLanghof: [That question](https://stackoverflow.com/q/49709548/364696) was limited to C++11, and covered working with a `char*` and a length. `std::string_view` is a C++17 feature, so even if there were solutions for that case, that question is unlikely to attract them. One would hope the invention of `string_view` included addressing this sort of use case (though I wouldn't be surprised if it didn't). – ShadowRanger Aug 08 '18 at 14:53
  • 1
    @ShadowRanger Unless `std::unordered_map/set` were specialized for a key type of `std::string`, the answer given there quite clearly says that heterogeneous lookup for these types is (and will, for the near future, be) unavailable. I concede that the former condition is not clearly stated in the duplicate, but I had no trouble finding the information I would need in order to conclude as much from simply googling the question title. – Max Langhof Aug 08 '18 at 15:02
  • 1
    @ShadowRanger More to the point, the best answer I could give to the question here would be a verbatim copy of the one in the duplicate (well, its first two paragraphs). – Max Langhof Aug 08 '18 at 15:08
  • 1
    @MaxLanghof: Wait, I'm confused. I thought C++14's heterogeneous lookup was designed for almost exactly this particular case? Why doesn't that work here? https://stackoverflow.com/questions/31923715/how-can-i-search-an-stdmap-using-a-key-of-a-different-type – Mooing Duck Aug 08 '18 at 16:47
  • 1
    @MooingDuck I initially wrote an answer that demonstrated just that, but then I realized the question is about **unordered** (i.e. hash) maps/sets. Those do not have that feature, sadly. – Max Langhof Aug 08 '18 at 17:01
  • @ShadowRanger the fact that `string_view` is a C++17 feature is irrelevant, the other question is about heterogeneous lookup in general, not just for `string` vs `string_view`. The standard library aims for generic solutions, not adding a special case just for strings in maps. Everything that answer says is still relevant: it works for ordered maps and sets in C++14, (e.g. with `experimental::string_view` and `string`) but doesn't work with unordered containers. `boost::unordered_map` supports it, so OP doesn't need to write their own type. – Jonathan Wakely Aug 10 '18 at 12:58

1 Answers1

5

I don't have a nice solution, but a possible workaround with minimal custom code, at the expense of increased memory usage, would be to replace your std::unordered_set<std::string> with a std::unordered_map that has views for keys, and strings for values (which back the views).

Unfortunately, thanks to the small string optimization, we can't rely on std::move preserving the original address of the underlying string's data, so something like:

std::string to_insert(...);
mymap.try_emplace(to_insert, std::move(to_insert));

isn't going to work.

Instead, it would have to be a std::unordered_map<std::string_view, std::unique_ptr<std::string>> so we could preserve the unique address of the string's characters, making the code more like:

auto to_insert = std::make_unique<std::string>(...);
mymap.try_emplace(*to_insert, std::move(to_insert));

While insertion would be kinda ugly, simple membership testing would remain simple, since std::string defines an implicit operator std::string_view, and std::string_view has an implicit constructor for char*, so membership testing remains a simple:

if (mymap.count(some_string)) { ... }

whether some_string is a char*, std::string_view or std::string.

Note: I'm not going to swear the two-liner try_emplace-based insertion code is legal, as I'm a bit out of practice on C++, and quite leery about using a unique_ptr in the same expression I'm moveing from it; on g++ 7.2 it seems to work, and I think the fact that the key argument to try_emplace is constructed immediately, while the arguments to construct the value are forwarded makes this safe, but I'll acknowledge that my understanding of C++ evaluation order (or lack thereof) is not perfect. If I'm doing something illegal, not just ugly, then fixing it would require the slightly uglier (but definitely sequenced):

auto to_insert = std::make_unique<std::string>(...);
std::string_view key{*to_insert};
mymap.try_emplace(std::move(key), std::move(to_insert));

Additional note: Only the emplace/emplace_hint/try_emplace functions can be safely used to update the entries in mymap in this design. Using mymap[key] = std::move(to_insert); or insert_or_assign breaks if the same key is encountered twice while building the map, as the original string_view (referencing the original string's data) would be preserved, while the value would be replaced with the new string, invalidating the string_view's pointer. While insert doesn't replace values, I believe using it would require a design more like the three-liner with try_emplace, as making the std::pair to insert would be unsequenced if you tried to construct both view and unique_ptr as part of pair construction.

ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
  • Note: Having read the responses to [my question](https://stackoverflow.com/q/51752143/364696) and thought about it some more, while `mymap.try_emplace(*to_insert, std::move(to_insert));` is definitely legal/safe (thanks to `try_emplace`s eager evaluation of the key, but lazy construction of the value), [it's code smell](https://stackoverflow.com/questions/51752143/is-stdmove-safe-in-an-arguments-list-when-the-argument-is-forwarded-not-move#comment90464630_51752360), so I'd recommend using the solution that splits `string_view` conversion from `try_emplace`, even if it's trivially slower. – ShadowRanger Aug 08 '18 at 17:55
  • In my last comment, I suggested splitting `string_view` conversion from `try_emplace` might be "trivially slower". I tested this, and it turns out the output produced by `g++` 7.2.0 for my test case was identical under `-Os`, and almost identical (to the point where I'm fairly sure it would perform identically, since the modified code paths looked to be involved in exception handling and never run on success) under `-O3`. Point is, at the expense of one additional line of code, and no cost to performance, you can avoid code smell, so do it; your maintainer will thank you for it. :-) – ShadowRanger Aug 08 '18 at 18:58