I know this question has been asked a lot, but I could not find the best (the most efficient) way to remove duplicated members (type double
) from vector while maintaining 1 copy and order of the original vector.

- 1,638
- 1
- 15
- 39
-
6Please tell us what you have tried? – Kris Vandermotten Oct 22 '13 at 10:11
-
I have a vector of point (defined type) in which probably some points have been inserted twice during process. I want to delete/remove the copied points (only the second copy, I mean I want to have the first copy). and also I want to keep the order of original vector. – H'H Oct 22 '13 at 10:19
-
many thanks @Johnsyweb is it the the most efficient way? – H'H Oct 22 '13 at 10:28
-
@AliAlamiri: The OP didn't state if they had control over how and where the vector is set up. Otherwise, a set is at least semantically correct. – thokra Oct 22 '13 at 10:30
-
@H'H that is what you want. But what have you tried? Did you try anything? Or did you just look at the problem, shrug, and ask here? – Yakk - Adam Nevraumont Oct 22 '13 at 10:47
-
@Yakk for sure I tried it before asking, but since I have some constraints (e.g. keeping original vector order), still am not sure what is the best. I also need to work with the indices of the original vector members afterwords. – H'H Oct 22 '13 at 10:57
2 Answers
If your data was not double
s, simply doing a pass with an unordered_set
keeping track of what you have already seen via the remove_if
-erase
idiom would work.
double
s are, however, bad news when checking for equality: two derivations that you may think should produce the same value may generate different results. A set
would allow looking for nearby values. Simply use equal_range
with a plus minus epsilon instead of find
to see if there is another value approximetally equal to yours earlier in the vector
, and use the same remove
erase
idiom.
The remove
erase
idiom looks like:
vec.erase( std::remove_if( vec.begin(), vec.end(), [&](double x)->bool{
// return true if you want to remove the double x
}, vec.end());
in C++03 it cannot be done inline.
The lambda above will be called for each element in order, like the body of a loop.

- 262,606
- 27
- 330
- 524
If you have to/wish to use a vector* then it might be easiest to trap duplicates on insertion - if the point to be inserted is already there, bin it.
Another approach for a really large collection would be to do a sort and search for duplicates after every N insertions, where N is the perfect number of insertions to wait before doing a sort and search for duplicates. (Calculating N is left as an exercise for the reader.)
Your approach, and the value of N if relevant, depends on the number of elements, how often the array is changed, how often the contents are examined, and the likelihood of duplicates occuring.
(*apparently, vectors are great as their disadvantages lie where modern computers tend to kick butt so hard it doesn't matter, and are blisteringly fast with linear searches. At least I think that's what Bjarn's saying here comparing a vector to a linked list.)

- 1,778
- 11
- 29