29

Problem description:
Consider some structure having an std::string name member. For clearness let's suppose that it's a struct Human, representing information about people. Besides the name it can also have many other data members.
Let there be a container std::vector<Human> vec, where the objects are already sorted by name. Also for clearness suppose that all the names are unique.
The problem is: having some string nameToFind find out if there exists an element in the array having such name.

Solution and my progress:
The obvious and natural solution seems to perform a binary search using the std::binary_search function. But there is a problem: the type of the element being searched (std::string) is different from the type of the elements in the container (Human), and std::binary_search needs a rule to compare these elements. I tried to solve this in three ways, described below. First two are provided just to illustrate the evolution of my solution and the problems which I came across. My main question refers to the third one.

Attempt 1: convert std::string to Human.

Write a comparing function:

bool compareHumansByNames( const Human& lhs, const Human& rhs )
{
   return lhs.name < rhs.name;
}

Then add a constructor which constructs a Human object from std::string:

struct Human
{
   Human( const std::string& s );
   //... other methods

   std::string name;
   //... other members
};

and use the binary_search in following form:

std::binary_search( vec.begin(), vec.end(), nameToFind, compareHumansByNames );

Seems working, but turns up two big problems:
First, how to initialize other data members but Human::name, especially in the case when they don't have a default constructor ? setting magic values may lead to creation of an object which is semantically illegal.
Second, we have to declare this constructor as non explicit to allow implicit conversions during the algorithm. The bad consequences of this are well known.
Also, such a temporary Human object will be constructed at each iteration, which can turn out to be quite expensive.

Attempt 2: convert Human to std::string.

We can try to add an operator string () to the Human class which returns it's name, and then use the comparsion for two std::strings. However, this approach is also inconvenient by the following reasons:

First, the code will not compile at once because of the problem discussed here. We will have to work a bit more to make the compiler use the appropriate operator <.
Second, what does mean "convert a Human to string" ? Existence of such conversion can lead to semantically wrong usage of class Human, which is undesirable.

Attempt 3: compare without conversions.

The best solution I got so far is to create a

struct Comparator
{
   bool operator() ( const Human& lhs, const std::string& rhs )
   {
      return lhs.name < rhs;
   }
   bool operator() ( const std::string& lhs, const Human& rhs )
   {
      return lhs < rhs.name;
   }
};

and use binary search as

binary_search( vec.begin(), vec.end(), nameToFind, Comparator() );

This compiles and executes correctly, everything seems to be ok, but here is where the interesting part begins:

Have a look at http://www.sgi.com/tech/stl/binary_search.html. It's said here that "ForwardIterator's value type is the same type as T.". Quite confusing restriction, and my last solution breaks it. Let's see what does the C++ standard say about it:


25.3.3.4 binary_search

template<class ForwardIterator, class T>
bool binary_search(ForwardIterator first, ForwardIterator last,
const T& value);

template<class ForwardIterator, class T, class Compare>
bool binary_search(ForwardIterator first, ForwardIterator last,
const T& value, Compare comp);

Requires: Type T is LessThanComparable (20.1.2).


Nothing is explicitly said about ForwardIterator's type. But, in definition of LessThanComparable given in 20.1.2 it is said about comparsion of two elements of the same type. And here is what I do not understand. Does it indeed mean that the type of the object being searched and the type of the container's objects must be the same, and my solution breaks this restriction ? Or it does not refer to the case when the comp comparator is used, and only is about the case when the default operator < is used for comparsion ? In first case, I'm confused about how to use std::binary_search to solve this without coming across the problems mentioned above.

Thanks in advance for help and finding time to read my question.

Note: I understand that writing a binary search by hand takes no time and will solve the problem instantly, but to avoid re-inventing a wheel I want to use the std::binary_search. Also it's very interesting to me to find out about existence of such restriction according to standard.

Community
  • 1
  • 1
Grigor Gevorgyan
  • 6,753
  • 4
  • 35
  • 64
  • 1
    I don't have a copy of the C++98 or 03 standard, but my version of the C++0x standard says nothing about them needing to be LessThanComparable. Only that `operator<` be available for comparison on both sides (or in the case of the comparison functor version, `operator()`). So C++0x seems to have cleared up the ambiguity, and your `binary_search` implementation is ahead of the curve. – Nicol Bolas Aug 02 '11 at 22:42
  • Interesting question. Out of curiosity (and for posterity), when you said "This compiles and executes correctly, everything is ok.", what implementation did you test with? – André Caron Aug 02 '11 at 22:49
  • I think the C++0x FDIS doesn't say anything restrictive of that sort and only speaks about of ordering requiremets for `e < value` or `comp(e, value)`, where `e` is in the range and `value` is the findee. (I guess that's what Nicol already said.) – Kerrek SB Aug 02 '11 at 23:01
  • @André Caron: Both on g++ and MSVS. Actually I doubt there exists an implementation where this wouldn't work, but I want to find out about standard's position in this question. – Grigor Gevorgyan Aug 02 '11 at 23:04
  • @Grigor: I also doubt you'll find an implementation for which it doesn't work, but I still think it's relevant to say which implementation is referred to when something is claimed to work. – André Caron Aug 02 '11 at 23:08
  • @André Caron: well, if it's turns out to be restricted by standard, I will change it regardless of implementation. Just want to find out another truth about C++ :) – Grigor Gevorgyan Aug 02 '11 at 23:12
  • @Grigor: why the rollback on the edits? There were several typo and code fixes. – André Caron Aug 02 '11 at 23:16
  • @André Caron: Some of them were unnecessary, for example changing the `compareHumansByNames` to a struct to use a functor object - the function itself can be used as well. – Grigor Gevorgyan Aug 02 '11 at 23:20
  • Is `std::map` not an option? That seems more straightforward. – Mark B Aug 02 '11 at 23:40
  • @Mark: If that were possible, then `std::set` plus a suitable `operator<` would be even simpler. I guess the OP's question is more out of curious interest than practicality. – Kerrek SB Aug 02 '11 at 23:48
  • @Mark: It is an option, but it stores each `name` value twice - as map key and as data member.
    @Kerrek: In your case you have to search for a `Human` object in the set. If you try to construct it from the given string, and it leads to problems described in `Attempt 2`.
    – Grigor Gevorgyan Aug 03 '11 at 05:19
  • @Kerrek: Yes, the question is more about interest, but I also want to know if the code I written is legal according to standard ) – Grigor Gevorgyan Aug 03 '11 at 06:47
  • You may find this Q/A interesting: http://stackoverflow.com/questions/5838287/how-do-i-c-stl-binary-search-for-abstract-classes – Ben Jackson Aug 03 '11 at 07:23

4 Answers4

12

If your goal is to find if there is a Human with a given name, then the following should work for sure:

const std::string& get_name(const Human& h)
{
    return h.name;
}

...

bool result = std::binary_search(
    boost::make_transform_iterator(v.begin(), &get_name),
    boost::make_transform_iterator(v.end(), &get_name),
    name_to_check_against);
Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • Well, standard compliant so long as boost is available, that is. :) Note that you might be able to replace `std::ptr_fun` and the seperate method with `std::mem_fun_ref(&Human::GetName)` or similar depending on `Human` having the accessor of course. – Billy ONeal Aug 02 '11 at 22:50
  • This is great. Does this preserve the random-access nature of the iterators so that the binary search is still fast? – Kerrek SB Aug 02 '11 at 22:51
  • @Kerrek: [yes it does](http://www.boost.org/doc/libs/1_47_0/libs/iterator/doc/transform_iterator.html#transform-iterator-models). – Alexandre C. Aug 02 '11 at 22:52
  • @Billy: yep, indeed. Also you may have a `ptr_mem` class helper to automatically manufacture such a unary functor handy. This is something I have always found missing in the standard library. – Alexandre C. Aug 02 '11 at 22:54
  • 1
    I like this solution! The problem with the `binary_search` and `lower_bound` algorithms is that you can only search the findee by value, so if the value type of the container is expensive or impossible to construct for finding purposes, then changing the iterators seems to be the only sensible solution. – Kerrek SB Aug 02 '11 at 22:55
  • My goal is to find out about the existence of the mentioned restriction on the std::binary_search. But your solution is nice ) – Grigor Gevorgyan Aug 02 '11 at 22:55
  • @Grigor: I hear you, but this was too verbose to be put as a comment to the question... – Alexandre C. Aug 02 '11 at 23:00
  • I retract my comment - you can actually use the algorithms to search for _any_ value, unrelated to the iterator's value -- in C++0x this seems to be entirely unproblematic. Nonetheless I like this very neat and symmetric solution since it emphasizes the search value and doesn't require the comparator class. – Kerrek SB Aug 02 '11 at 23:25
  • @Alexandre: Does the transform iterator also preserve constness and movability? E.g. a const_iterator would make the transform-iterator call the function with const-reference, a non-const iterator with a non-const reference, and a move-iterator with an rvalue reference? That'd be pretty cool. – Kerrek SB Aug 02 '11 at 23:49
  • @Kerrek SB: the boost people are the one behind C++0x iterator changes, so I suspect the answer is yes. – Alexandre C. Aug 03 '11 at 07:09
5

[Complete rewrite; disregard the comments]

The wording has been changed from C++03 to C++0x. In the latter, there is no more requirement for T to be less-than comparable, presumably to alleviate this unnecessary restriction.

The new standard only requires that comp(e, value) implies !comp(value, e). So as long as your comparator implements both directions, you should be able to legally search a string as the value with a comparator functor that implements both asymmetric comparisons (i.e. your "Attempt 3").

Kerrek SB
  • 464,522
  • 92
  • 875
  • 1,084
  • @Billy: Oh crap, you're right. I had this written with `find_if` at first, but that didn't have the same complexity... – Kerrek SB Aug 02 '11 at 22:39
  • This does not answer the question. The question was about the legality to use a searching value of a type different than the type of objects in the container. In your case these types are NameFinder and T. – Grigor Gevorgyan Aug 02 '11 at 22:41
  • @Grigor: Well, in my case the types are `T` and `std::string`, non? But I know that this is wrong, I'm trying to figure out a way to fix it. – Kerrek SB Aug 02 '11 at 22:43
  • 1
    not `T` and `std::string` but `T` and `NameFinder` - this is the type you've used in the binary_search as third argument. – Grigor Gevorgyan Aug 02 '11 at 22:46
  • @Kerrek: It doesn't work with `find_if` either -- `find_if` expects an equality comparator; you've supplied a less-than comparator. – Billy ONeal Aug 02 '11 at 22:48
  • @Grigor: Oh, yes, I see -- I had the `lower_bound` thing wrong. I removed it and put up a version with `find_if`. It doesn't answer the question, though, I'm still thinking about that. – Kerrek SB Aug 02 '11 at 22:48
  • @Billy: Ugh, OK, this isn't working. I'll delete this for now until I got something better. Sorry! – Kerrek SB Aug 02 '11 at 22:49
0

I think what the standard is saying here is that the expression fucntor(a, b) needs to be a valid strict weak ordering, no matter if the algorithm decides to do something like functor(*begin, *(begin + 1)). Therefore, I think your comparator would need to supply an overload of operator()(Human, Human) in order to be conforming.

That said, I think this is one of those things not explicitly allowed by the standard, but for which few or no implementations exist which take advantage of the latitude offered by the standard.

Community
  • 1
  • 1
Billy ONeal
  • 104,103
  • 58
  • 317
  • 552
0

I don't see it requiring anywhere in the standard that the types of the values passed to the comparison function (or to the < operator) by binary_search must be the same. So, formally I think it is perfectly fine to use a comparator that works with two differently types values.

AnT stands with Russia
  • 312,472
  • 42
  • 525
  • 765
  • Yes, it's not said explicitly, but it may follow from the line `Requires: Type T is LessThanComparable (20.1.2).` That is what my question is about. – Grigor Gevorgyan Aug 02 '11 at 22:53
  • I think `binary_search` can do a comparison between two `Human`s directly though, in which case he would need one more overload. – Billy ONeal Aug 02 '11 at 22:54
  • The point is that the searching type (`std::string`) is different from the type of the container's elements (`Human`), and here is where the requirement about `LessThanComparable` creates problems, because there is no single type `T`, but two different types are present. – Grigor Gevorgyan Aug 02 '11 at 23:10
  • @Grigor Gevorgyan: Yes, but I don't see the reason to take it for more than what it says. It requires type `T` to be LessThanComparable? Great. Let's make sure that that requirement is met. Still, I don't see how it can prohibit mixed comparisons. – AnT stands with Russia Aug 03 '11 at 14:31
  • @AndreyT: That what was my confusion was about - if `T` has to be `LessThanComparable` that implies that objects of type `T` are supposed to be compared, doesn't it ? if so, this can be interpreted as the type of the searched object and the type of container elements are supposed to be the same. – Grigor Gevorgyan Aug 03 '11 at 14:34