13

I am working with data that shouldn't pop up twice. If it does, it should detect it and invoke a function handling that.

Currently, I am pushing some data to a vector and before insertion, it should check if the data is already contained in that vector. At the moment, this is not very effective, e.g

for (int i = 0; i < myVector.size() ; i++) 
{
  if ( myVector[i] == data ) 
  {
             // invoke function
             return false;
  }
}

I know set is a special kind of vector which allows only unique data.

Is there another way to detect duplicate data being added (or at least attempting to add it) to the set?

zx485
  • 28,498
  • 28
  • 50
  • 59
Darlyn
  • 4,715
  • 12
  • 40
  • 90
  • Is there any reason that you are using a vector? – Ed Heal Mar 21 '16 at 15:51
  • I am returning vector from function , which is more optimalized than returning an array from it ( thats what i was told here ). In other part of the code , i use vector of structs , but the idea is the same – Darlyn Mar 21 '16 at 15:53
  • 3
    Your question is not clear, you asked for duplicates using a vector or a set ? – Jean-Baptiste Yunès Mar 21 '16 at 15:54
  • 1
    I gave an example what i am using now which is ineffective , and asked about possible easier and better solution using set – Darlyn Mar 21 '16 at 15:56
  • When you *insert* into a set the returned value indicates if it was a duplicate or not: http://en.cppreference.com/w/cpp/container/set/insert – Galik Mar 21 '16 at 15:58
  • @trolkura : Your question that relates to the very fundamental concept of hashing. I suggest you first read up about it as it seems you are unaware of it. – therainmaker Mar 21 '16 at 15:58
  • @therainmaker -- while knowing about fundamentals such as hashing is always great -- then using `sets` and other structure does _typically_ not require the users to know about such things -- they are hidden inside the implementation – Soren Mar 21 '16 at 16:03
  • @Soren : While I agree with you on this, I merely meant it as a suggestion which I felt would be useful to the OP. Hashing is a very basic concept which I feel every programmer should know. Note that I'm not talking about the underlying implementation of a `set` or any other container. – therainmaker Mar 21 '16 at 16:06

4 Answers4

30

First let's just be clear that a set is not a special kind of vector. It's a kind of container, orthogonal to vector, that happens to prevent duplicates.

You can detect duplicates by checking the return value from insert:

if(my_set.insert("value").second == false) { do_something_for_duplicate(); }
Mark B
  • 95,107
  • 10
  • 109
  • 188
7

std::set returns std::pair<iterator, bool>, where the bool is false when the insertion failed (by adding duplicate values for example).

Example:

std::set<int> set{ 1, 2, 3 };
auto result = set.insert(1);
if (!result.second)
    std::cout << "Failed to insert element!" << std::endl;
Rakete1111
  • 47,013
  • 16
  • 123
  • 162
6

A std::set or std::unordered_set is another container from standard c++ library but is not a vector... They obey different rules:

  • a vector is more or less a growable array: no control on duplicates, but respect insertion order
  • a set requires an order on the data it contains and allows to browse its data according to that order. It automatically reject duplicates at insertion time
  • an unordered_set also reject duplicates at insertion time, but the browse order is kind of random (not exactly, it is even perfectly deterministic but depends on the hash function being used)

For a vector, a simple way to see if it already contains a value is (ref):

std::find(vector.begin(), vector.end(), item) != vector.end()

For a set of unordered_set, the insert method returns a pair iterator pointing to the element - boolean indicating where it was added or not for being already there

if (! my_set.insert(data).second) {
    // invoke function
    return false;
}
Community
  • 1
  • 1
Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
1

You can use an std::unordered_set. There is a method insert which, depending on library version returns you information about the insertion (either a pair with a bool that is true if insertion was effective and false if already in), or an iterator, etc. Find the documentation of your lib.

Jean-Baptiste Yunès
  • 34,548
  • 4
  • 48
  • 69