1

I have K objects (K is small, e.g. 2 or 5) and I need to iterate over them N times in random order where N may be large. I need to iterate in a foreach loop and for this I should provide an iterator.

So far I created a std::vector of my K objects copied accordingly, so the size of vector is N and now I use begin() and end() provided by that vector. I use std::shuffle() to randomize the vector and this takes up to 20% of running time. I think it would be better (and more elegant, anyways) to write a custom iterator that returns one of my object in random order without creating the helping vector of size N. But how to do this?

tnorgd
  • 1,580
  • 2
  • 14
  • 24

2 Answers2

3

It is obvious that your iterator must:

  1. Store pointer to original vector or array: m_pSource
  2. Store the count of requests (to be able to stop): m_nOutputCount
  3. Use random number generator (see random): m_generator
  4. Some iterator must be treated as end iterator: m_nOutputCount == 0

I've made an example for type int:

#include <iostream>
#include <random>

class RandomIterator: public std::iterator<std::forward_iterator_tag, int>
{
public:
    //Creates "end" iterator
    RandomIterator() : m_pSource(nullptr), m_nOutputCount(0), m_nCurValue(0) {}

    //Creates random "start" iterator
    RandomIterator(const std::vector<int> &source, int nOutputCount) :
        m_pSource(&source), m_nOutputCount(nOutputCount + 1), 
        m_distribution(0, source.size() - 1)
    {
        operator++(); //make new random value
    }

    int operator* () const
    {
        return m_nCurValue;
    }

    RandomIterator operator++()
    {
        if (m_nOutputCount == 0)
            return *this;
        --m_nOutputCount;

        static std::default_random_engine generator;
        static bool bWasGeneratorInitialized = false;
        if (!bWasGeneratorInitialized)
        {
            std::random_device rd; //expensive calls
            generator.seed(rd());
            bWasGeneratorInitialized = true;
        }       

        m_nCurValue = m_pSource->at(m_distribution(generator));
        return *this;
    }

    RandomIterator operator++(int)
    {   //postincrement
        RandomIterator tmp = *this;
        ++*this;
        return tmp;
    }

    int operator== (const RandomIterator& other) const
    {
        if (other.m_nOutputCount == 0)
            return m_nOutputCount == 0; //"end" iterator
        return m_pSource == other.m_pSource;
    }

    int operator!= (const RandomIterator& other) const
    {
        return !(*this == other);
    }
private:
    const std::vector<int> *m_pSource; 
    int m_nOutputCount;
    int m_nCurValue;

    std::uniform_int_distribution<std::vector<int>::size_type> m_distribution;
};

int main()
{
    std::vector<int> arrTest{ 1, 2, 3, 4, 5 };

    std::cout << "Original =";
    for (auto it = arrTest.cbegin(); it != arrTest.cend(); ++it)
        std::cout << " " << *it;
    std::cout << std::endl;

    RandomIterator rndEnd;

    std::cout << "Random =";
    for (RandomIterator it(arrTest, 15); it != rndEnd; ++it)
        std::cout << " " << *it;

    std::cout << std::endl;
}

The output is:

Original = 1 2 3 4 5
Random = 1 4 1 3 2 4 5 4 2 3 4 3 1 3 4

You can easily convert it into a template. And make it to accept any random access iterator.

Dmitriy Zapevalov
  • 1,357
  • 8
  • 13
  • If you want random output each programm run. Don't forget to initialize `m_generator` with some seed (for example using `std::random_device`)! Also `m_generator` must be static. – Dmitriy Zapevalov Aug 07 '16 at 13:18
1

I just want to increment Dmitriy answer, because reading your question, it seems that you want that every time that you iterate your newly-created-and-shuffled collection the items should not repeat and Dmitryi´s answer does have repetition. So both iterators are useful.

template <typename T>
struct  RandomIterator : public std::iterator<std::forward_iterator_tag, typename T::value_type>
{
    RandomIterator() : Data(nullptr)
    {
    }

    template <typename G>
    RandomIterator(const T &source, G& g) : Data(&source)
    {
        Order = std::vector<int>(source.size());
        std::iota(begin(Order), end(Order), 0);
        std::shuffle(begin(Order), end(Order), g);
        OrderIterator = begin(Order);
        OrderIteratorEnd = end(Order);
    }

    const typename T::value_type& operator* () const noexcept
    {
        return (*Data)[*OrderIterator];
    }

    RandomIterator<T>& operator++() noexcept
    {
        ++OrderIterator;
        return *this;
    }

    int operator== (const RandomIterator<T>& other) const noexcept
    {
        if (Data == nullptr && other.Data == nullptr)
        {
            return 1;
        }
        else if ((OrderIterator == OrderIteratorEnd) && (other.Data == nullptr))
        {
            return 1;
        }
        return 0;
    }

    int operator!= (const RandomIterator<T>& other) const noexcept
    {
        return !(*this == other);
    }
private:
    const T *Data;
    std::vector<int> Order;
    std::vector<int>::iterator OrderIterator;
    std::vector<int>::iterator OrderIteratorEnd;
};

template <typename T, typename G>
RandomIterator<T> random_begin(const T& v, G& g) noexcept
{
    return RandomIterator<T>(v, g);
}

template <typename T>
RandomIterator<T> random_end(const T& v) noexcept
{
    return RandomIterator<T>();
}

whole code at http://coliru.stacked-crooked.com/a/df6ce482bbcbafcf or https://github.com/xunilrj/sandbox/blob/master/sources/random_iterator/source/random_iterator.cpp

Implementing custom iterators can be very tricky so I tried to follow some tutorials, but please let me know if something have passed:

http://web.stanford.edu/class/cs107l/handouts/04-Custom-Iterators.pdf https://codereview.stackexchange.com/questions/74609/custom-iterator-for-a-linked-list-class Operator overloading

I think that the performance is satisfactory: On the Coliru:

<size>:<time for 10 iterations>
1:0.000126582
10:3.5179e-05
100:0.000185914
1000:0.00160409
10000:0.0161338
100000:0.180089
1000000:2.28161

Off course it has the price to allocate a whole vector with the orders, that is the same size of the original vector. An improvement would be to pre-allocate the Order vector if for some reason you have to random iterate very often and allow the iterator to use this pre-allocated vector, or some form of reset() in the iterator.

Community
  • 1
  • 1