11

With C++11, the STL has now a std::iota function (see a reference). In contrast to std::fill_n, std::generate_n, there is no std::iota_n, however. What would be a good implementation for that? A direct loop (alternative 1) or delegation to std::generate_n with a simple lambda expression (alternative 2)?

Alternative 1)

template<class OutputIterator, class Size, class T>
OutputIterator iota_n(OutputIterator first, Size n, T value)
{
        while (n--)
                *first++ = value++;
        return first;
}

Alternative 2)

template<class OutputIterator, class Size, class T>
OutputIterator iota_n(OutputIterator first, Size n, T value)
{
        return std::generate_n(first, n, [&](){ return value++; });
}    

Would both alternatives generate equivalent code with optimizing compilers?

UPDATE: incorporated the excellent point of @Marc Mutz to also return the iterator at its destination point. This is also how std::generate_n got updated in C++11 compared to C++98.

TemplateRex
  • 69,038
  • 19
  • 164
  • 304
  • 2
    I think this question focuses on something too specific for something more general: the different loop constructs. –  Aug 01 '12 at 21:04
  • 2
    Why don't you try it and compare the assembler? – Kerrek SB Aug 01 '12 at 21:04
  • 1
    @KerrekSB Not that much of an expert on grokking assembly output. I'm interested in hearing from people with such expertise if STL oneliners with lambdas will normally be optimized to straight loops. If it is the case, this would be a bigger incentive to write more variations on STL algorithms, rather than thinking hard about intricate loops. – TemplateRex Aug 01 '12 at 21:07
  • 1
    If you wanted to work with random access iterators only, you could simply do `std::iota(start, start + n, value);`. Also, I would change `i != n` to `i < n` for the first alternative. – Jesse Good Aug 01 '12 at 21:18

4 Answers4

10

As a random example, I compiled the following code with g++ -S -O2 -masm=intel (GCC 4.7.1, x86_32):

void fill_it_up(int n, int * p, int val)
{
    asm volatile("DEBUG1");
    iota_n(p, n, val);
    asm volatile("DEBUG2");
    iota_m(p, n, val);
    asm volatile("DEBUG3");
    for (int i = 0; i != n; ++i) { *p++ = val++; }
    asm volatile("DEBUG4");
}

Here iota_n is the first version and iota_m the second. The assembly is in all three cases this:

    test    edi, edi
    jle .L4
    mov edx, eax
    neg edx
    lea ebx, [esi+edx*4]
    mov edx, eax
    lea ebp, [edi+eax]
    .p2align 4,,7
    .p2align 3
.L9:
    lea ecx, [edx+1]
    cmp ecx, ebp
    mov DWORD PTR [ebx-4+ecx*4], edx
    mov edx, ecx
    jne .L9

With -O3, the three versions are also very similar, but a lot longer (using conditional moves and punpcklqdq and such like).

Kerrek SB
  • 464,522
  • 92
  • 875
  • 1,084
  • 1
    Thanks, that's a great answer. Regardless of what `punpcklqdq` does, (I checked on MSDN), it's reassuring to know that there is hardly any abstraction penalty from calling `std::generate_n` + a lambda. – TemplateRex Aug 01 '12 at 21:37
3

You're so focused on code generation that you forgot to get the interface right.

You correctly require OutputIterator, but what happens if you want to call it a second time?

list<double> list(2 * N);
iota_n(list.begin(), N, 0);
// umm...
iota_n(list.begin() + N, N, 0); // doesn't compile!
iota_n(list.rbegin(), N, 0); // works, but create 0..N,N-1..0, not 0..N,0..N
auto it = list.begin();
std::advance(it, N);
iota_n(it, N, 0); // works, but ... yuck and ... slow (O(N))

inside iota_n, you still know where you are, but you've thrown that information away, so the caller cannot get at it in constant time.

General principle: don't throw away useful information.

template <typename OutputIterator, typename SizeType, typename ValueType>
auto iota_n(OutputIterator dest, SizeType N, ValueType value) {
    while (N) {
        *dest = value;
        ++dest;
        ++value;
        --N;
    }
    // now, what do we know that the caller might not know?
    // N? No, it's zero.
    // value? Maybe, but it's just his value + his N
    // dest? Definitely. Caller cannot easily compute his dest + his N (O(N))
    //       So, return it:
    return dest;
}

With this definition, the above example becomes simply:

list<double> list(2 * N);
auto it = iota_n(list.begin(), N, 0);
auto end = iota_n(it, N, 0);
assert(end == list.end());
Marc Mutz - mmutz
  • 24,485
  • 12
  • 80
  • 90
  • 1
    Nice point, I agree that both the existing `iota` and the putative `iota_n` should return the destination. – TemplateRex Mar 09 '15 at 17:45
  • 1
    @TemplateRex: `iota` doesn't return the destination because it's equal to its second argument. But `iota_n` is different, the end point is implicit, and thus `iota_n` does need to return the destination. – Marc Mutz - mmutz Mar 10 '15 at 16:29
  • 1
    Ah thanks, now I see. I updated the answer with your info. Note that `std::generate_n` also got this upgrade in C++11. – TemplateRex Mar 10 '15 at 19:20
  • 1
    There's one thing I don't like here, that `N`, which is by convention a macro name, is used as a variable. The rest is very good. In particular, there's no point in obfuscating code using postfix increment magic. Spelling out the assignment and increment operations doesn't hurt. – Ulrich Eckhardt Mar 10 '15 at 19:33
1

A hypothetical iota_n

std::iota_n(first, count, value)

can be replaced by a one liner.

std::generate_n(first, count, [v=value]()mutable{return v++;})

I prefer this to have a lingering function that is not in the standard. Having said that, I think a std::iota_n should be in the standard.

alfC
  • 14,261
  • 4
  • 67
  • 118
0

Probably use back_insertor along with std::generate_n so that pre-allociton of collections can be avoided.

vector<int> v3;
generate_n(back_inserter(v3),10,[i=1]() mutable{
    return i++;
});

We can swap this for iota_n till we get iota_n :)