0

I have following code which calculates unique values in vector

bool uniqueCompressVectorCompareFunction (unsigned int num1, unsigned int num2) {
  return (num1 == num2);
}


// redirect input from stream to file.
std::ifstream inputFile("testinput.txt");
std::streambuf* pcurrRdBuf = std::cin.rdbuf();
std::cin.set_rdbuf(inputFile.rdbuf());

unsigned int uiNoOfFishes = 0;
std::cin >> uiNoOfFishes;
std::vector<unsigned int> vecOfLenOfFishes(uiNoOfFishes);
std::vector<unsigned int> vecOfTimeHeadOfFishes(uiNoOfFishes);

// get length of fishes
for(unsigned int uiIdx = 0; uiIdx < uiNoOfFishes; uiIdx++) {
    std::cin >> vecOfLenOfFishes[uiIdx];
}

// get time head of fishes
for(unsigned int uiIdx = 0; uiIdx < uiNoOfFishes; uiIdx++) {
    std::cin >> vecOfTimeHeadOfFishes[uiIdx];
}

std::cout << "Actual input length of fishes: " << std::endl;
for(unsigned int uiIdx = 0; uiIdx <vecOfLenOfFishes.size(); uiIdx++) {
    std::cout << vecOfLenOfFishes[uiIdx] << "   ";
}
std::cout << std::endl;

std::cout << "Actual input time head of fishes: " << std::endl;
for(unsigned int uiIdx = 0; uiIdx <vecOfTimeHeadOfFishes.size(); uiIdx++) {
    std::cout << vecOfTimeHeadOfFishes[uiIdx] << "   ";
}
std::cout << std::endl;


std::vector<unsigned int> vecUniqueInputValues;
// copy length of fishes.
unsigned int uiUniqueVecIdx = 0;
for(; uiUniqueVecIdx < uiNoOfFishes; uiUniqueVecIdx++) {
    vecUniqueInputValues.push_back(vecOfLenOfFishes[uiUniqueVecIdx]);
}

// copy time head of fishes.
for(unsigned int uiIdx = 0; uiIdx < uiNoOfFishes; uiIdx++, uiUniqueVecIdx++) {
    vecUniqueInputValues.push_back( vecOfTimeHeadOfFishes[uiIdx]);
}

// using predicate comparison:
std::unique (vecUniqueInputValues.begin(), vecUniqueInputValues.end(), uniqueCompressVectorCompareFunction); 



std::cout << "compressInputData unique values sorted: " << std::endl;
for(unsigned int uiIdx = 0; uiIdx <vecUniqueInputValues.size(); uiIdx++) {
    std::cout << vecUniqueInputValues[uiIdx] << "   ";
}
std::cout << std::endl;

// contents in testinput.txt is shonw below

5
2 4 4 2 4
1 4 1 6 4

Output is shown as below

Actual input length of fishes:
2   4   4   2   4
Actual input time head of fishes:
1   4   1   6   4
compressInputData total values:
2   4   4   2   4   1   4   1   6   4
compressInputData unique values sorted:
2   4   2   4   1   4   1   6   4   4

I am expecting compress input unique values should be 1 2 4 6

What is bug in my code?

James Z
  • 12,209
  • 10
  • 24
  • 44
venkysmarty
  • 11,099
  • 25
  • 101
  • 184
  • You have to sort the vector before the std::unique: See the example: http://en.cppreference.com/w/cpp/algorithm/unique Otherwise, you could use a std::unordered_set to keep unique values – Alex Placet May 18 '18 at 09:46
  • It's probably due to your reduction of the actual code, but the function `uniqueCompressVectorCompareFunction` should never exist. Instead, just leave out the functor argument if you actually just do an equality comparison. – rubenvb May 18 '18 at 09:50
  • A similar question was answered here: https://stackoverflow.com/questions/1041620/whats-the-most-efficient-way-to-erase-duplicates-and-sort-a-vector – algoquant Jan 13 '20 at 14:57

3 Answers3

3

According to the documentation of std::unique it

[e]liminates all but the first element from every consecutive group of equivalent elements

(emphasis mine).

So it doesn't remove all recurring values, just the ones that are adjacent, reducing for example [1 2 2 3 3 3 1 1 3 3 2 2] to [1 2 3 1 3 2].

If you only want to keep the overall unique elements, you should sort the vector first:

std::sort(vecUniqueInputValues.begin(), vecUniqueInputValues.end());

// Vector is now [1, 1, 2, 2, .., 2, 4, .., 4, 5, ... ]

std::unique (vecUniqueInputValues.begin(), vecUniqueInputValues.end(), 
  uniqueCompressVectorCompareFunction); 

Some minor points

The comparison function seems a bit redundant -- if you don't specify it

[e]lements are compared using operator==.

Also note that std::unique really only shuffles the elements around so that all the unique consecutive elements are at the start of the vector - you still see the value 4 twice at the end of your output. You should grab the result of std::unique - it is an iterator pointing one past the end of the unique range:

auto endOfUniqueRange = std::unique(...);

std::cout << "compressInputData unique values sorted: " << std::endl;
for(auto& uiIt = vecUniqueInputValues.cbegin(); uiIt != endOfUniqueRange ; ++uiIt) {
    std::cout << *uiIt << "   ";
}
std::cout << std::endl;

Actually I recommend that you use iterators and range-based for loops anyway: instead of

for(auto i = 0; i < myVec.size(); ++i) {
  // Do something with myVec[i]
}

write

for(auto it = myVec.cbegin(); it != myVec.cend(); ++it) {
  // Do something with *it
}

or, to iterate the whole vector,

for(unsigned int val : myVec) { 
  // Do something with val
}
CompuChip
  • 9,143
  • 4
  • 24
  • 48
3

From the docs:

std::unique Eliminates all but the first element from every consecutive group of equivalent elements from the range [first, last) and returns a past-the-end iterator for the new logical end of the range.

see the word consecutive, this means that it would only remove the duplicates if the vector is sorted, and the repetitions are in consecutive order.

Sort the vector before calling std::unique

rubenvb
  • 74,642
  • 33
  • 187
  • 332
Tomaz Canabrava
  • 2,320
  • 15
  • 20
0

As the other answers point out, std::unique removes consecutive duplicates. This allows it to be a single pass algorithm.

If you want to remove non-consecutive duplicates without changing the order of remaining elements, you'll have to write a different function to do it. This will generally require multiple passes over the input range.

Here is a simple implementation, to match the signature of std::unique

template <typename ForwardIterator>
ForwardIterator multipass_unique(ForwardIterator first, ForwardIterator last)
{
    for (; first != last; ++first)
    {
        // Search the whole range for values that are equal to the current
        // Any values removed shorten the range to search
        last = std::remove(std::next(first), last, *first);
    }
    return last;
}

And a variant that accepts a BinaryPredicate

template <typename ForwardIterator, typename BinaryPredicate>
ForwardIterator multipass_unique(ForwardIterator first, ForwardIterator last, BinaryPredicate pred)
{
    for (; first != last; ++first)
    {
        // Search the whole range for values that satisfy the predicate
        // Any values removed shorten the range to search
        last = std::remove_if(std::next(first), last, [=](auto val){ return pred(*first, val); });
    }
    return last;
}
Caleth
  • 52,200
  • 2
  • 44
  • 75
  • You can still do it in a single pass. For example, you could add all the elements you've seen to a hash set. While scanning the vector, you either encounter an element already in the set and you remove it, or you encounter an element that you haven't seen: leave it and add it to the set. Not O(1), but single pass. – CompuChip May 18 '18 at 22:48
  • @CompuChip that requires copyable values and additional allocation – Caleth May 18 '18 at 22:51