4

I couldn't find a better title for this issue, please update it if needed as per the below question.

Consider an iterator - iter, pointing to elements in a std::vector<int>. I'm inserting the value of *iter at the current iterator position using:

iter = target.insert(iter, *iter);

I understand that insert will return the iterator to the newly inserted element. Now I changed the above statement to:

iter = target.insert(iter, *iter++);

I knew that was going to behave awkwardly, and it resulted in Segmentation fault. What I can't understand is, how the above assignment is evaluated? I mean, to which element will iter point to after that assignment. As far as I understand, since I've used a post-increment operator, it should behave in similar way as the first statement, which I am inferring from the below sample assignment:

i = i++;

Since the above assignment doesn't affect the value of i, same should be true in case of *iter++.

What's the actual behaviour behind the scene?


This is the real code:

std::vector<int>::iterator begin = target.begin();

while (begin != target.end()) {
    if (*begin % 2 == 0) {
        begin = target.erase(begin);
    } else {
        begin = target.insert(begin, *begin); // I changed this line
        begin += 2;
    }
}

Basically the above code is removing the even element from vector, and duplicating the odd element.

Edit:

Ok, if I change the second line to ++begin, in addition to the previous change - it works in the same way as current code. So, I replaced two lines in the else block, with these lines:

begin = target.insert(begin, *begin++);
++begin;

So, it seems like it is assigning the iterator one past than that in the previous case.

So, this is what I understood from this behaviour:

  • *begin++ first de-references the begin to get the value at current location.
  • begin is post-incremented
  • The insert now inserts the de-referenced value before the iterator after the post-increment. So, it's not inserting before the original begin, rather, before the ++begin. And then returns the value ++begin, where it inserted.

Have I interpreted it correctly?

Rohit Jain
  • 209,639
  • 45
  • 409
  • 525
  • @n.m. Only if `i` is a pointer. – R. Martinho Fernandes Jul 10 '13 at 15:42
  • Calc is done from left to right? Then inserts (iter,*iter) then increments iter so you are not assigning to original iter. Why dont you look at disassembly of it ? – huseyin tugrul buyukisik Jul 10 '13 at 15:42
  • @hmjd. And what will be final value of `iter`? – Rohit Jain Jul 10 '13 at 15:44
  • @R.MartinhoFernandes. Umm. Actually that example is not considering `i` as a pointer. I've just given an example considering `i` to be an `int`. – Rohit Jain Jul 10 '13 at 15:45
  • Ok, let me add the real code, so that you can help better. – Rohit Jain Jul 10 '13 at 15:47
  • I've added the code. Also, the change which I did in the code to see the original behaviour. – Rohit Jain Jul 10 '13 at 15:55
  • I'm not sure I understand your question. The code you've posted contains `insert(begin, *begin);`, not `insert(begin, *begin++);` With the latter, the insertion could happen at `begin` or `begin + 1` because the compiler is free to evaluate function call arguments in any order. Also, `i = i++;` is [undefined behavior](http://stackoverflow.com/questions/4968854/is-i-i-truly-a-undefined-behavior). – Praetorian Jul 10 '13 at 15:56
  • @Praetorian. Yeah the original code did contain that line. I was understanding the behaviour by changing it to later. Ah! about `i = i++`, I didn't knew it is UB in C++. It does work in Java that way. – Rohit Jain Jul 10 '13 at 15:58
  • The precise title is what brought me to this question. Keep it. :-) – Martin Hennings Mar 02 '18 at 10:48

3 Answers3

3

I couldn't quite understand your question, but to respond to the understanding you listed: Your understanding may be correct on a particular compiler/day/build. It is unspecified whether you will insert at begin or begin + 1 (which then results in the value returned from insert being one of two different possible values) because the compiler may evaluate the arguments to insert in any order it likes. Instead of trying to work out exactly what will happen, split the code into appropriate pieces the make it perfectly clear what you're trying to do and let the optimizer take care of things for you. Almost certainly the reason it core dumped is because when you increment by two you skip over end and never detect that you passed the end of the container.

I feel I should also point out that "luckily" there's a sequence point after the evaluation of function arguments, or begin = target.insert(begin, *begin++); would be undefined behavior with two writes to begin and no intervening sequence point (for example i = i++ is UB).

Mark B
  • 95,107
  • 10
  • 109
  • 188
  • Ok, all that you wrote I understood. As for UB, will the UB behave the same way for a compiler it is behaving now? Or it might behave differently for the same compiler on two different days? – Rohit Jain Jul 10 '13 at 16:12
  • UB can change at any time, even without rebuilding your application (although a compiler that behaved in such a way might be somewhat unusual). – Mark B Jul 10 '13 at 16:18
  • UB stands for Undefined Behaviour. The word Undefined is there for a reason. If they wanted us to be able to tell what it actually is, they would have used a different word. – n. m. could be an AI Jul 10 '13 at 16:27
  • @MarkB I'm not sure that you made it clear about the sequence point issue. The function which ensures that there is no UB (just unspecified results) is the `operator++` function of the iterator. If the iterator is a typedef to a pointer (as it was in some early implementations), then there is undefined behavior. – James Kanze Jul 10 '13 at 16:48
1

AFAIK you construct target.insert(iter, *iter++) is not legal c++, as it leads to 'undefined behavior'. *iter++ is legal, and is evaluated left-to-right, but order in which agruments of a function call are evaluated is not specified. Writing such an expression is a programming error.

to be precise, your code

target.insert(iter, *iter++)

given aliases

std::vector<int>::iterator cur = iter;
std::vector<int>::iterator next = iter + 1;

can be evaluated in these 2 ways

targets.insert(next, *cur);
++iter;

or

targets.insert(cur, *cur);
++iter;

the exact version is a subject to compiler, os, other code etc and is generally not specified.

Peter K
  • 1,787
  • 13
  • 15
  • 1
    Unlike `i = i++` this isn't undefined behavior. It is however unspecified. – Mark B Jul 10 '13 at 16:06
  • @MarkB: Actually, I believe it is really **undefined behavior**. There is no sequence point between the evaluations of arguments to a function call, and the compiler can actually interleave the evaluations however it sees fit (provided the ordering constraints within each argument are met). The `insert` call is performing a read on `iter` (first argument) and a modify on `iter` (second argument) at the same time, which I read as undefined behavior. (A read and a modify on the same object without a sequence point is only allowed if the read is used to determine the value for the modify.) – Adam H. Peterson Jul 10 '13 at 16:43
  • @AdamH.Peterson You're forgetting the sequence points due to the call of `std::vector<>::iterator::operator++`. – James Kanze Jul 10 '13 at 16:49
  • 1
    @JamesKanze, yes, I think you may be right. However, it would be UB if `iter` were simply a pointer into an array, yes? Is it permissible for a standard library to provide `vector::iterator` as a typedef to `T*`? If so, I think it would be UB for `vector::iterator` as well, since the standard wouldn't guarantee a sequence point (even though some/most implementations would provide one). – Adam H. Peterson Jul 10 '13 at 16:55
  • @AdamH.Peterson If `iter` is a pointer, it's undefined behavior, because you don't have the sequence points from the function call (to `operator++`). And `T*` meets the standard's requirements for a random-access iterator, provided the container maintains its elements in contiguous memory; in fact, most of the early implementations of `std::vector` did use `T*`. (As you say, the standard doesn't require sequence points around `iter ++`.) – James Kanze Jul 10 '13 at 17:04
  • @JamesKanze, do you know if a standard-conforming implementation of `vector::iterator` may legally be a naked pointer? I always assumed it could be, and that some implementations would actually do that (although I never actually looked it up in the standard). **EDIT** It looks like you answered my question in your comment; perhaps you edited your comment after I started writing my reply. – Adam H. Peterson Jul 10 '13 at 17:11
0

Although this is a very old question, i want to share my view and welcome to discuss.

target.insert(iter, *iter++)

this code will have some effects

  • *iter++: get the value *iter and will invoke iter++ in the future
  • insert *iter to vector.

But insert value to vector may have different effects depends on size is equal to capacity or not

if size == capacity, vector need malloc new memory and copy to new memory, so the iterator need update

if size < capacity, vector insert item directly.

so back and focus to the code iter = target.insert(iter, *iter++);

if insert cause new memory, so using iter++ will access to invalid address and result in Segmentation fault.

if you write code like this: iter = target.insert(iter, *iter);, iter will always update,so it is correct

So if you are definitely sure size always less than capacity after insert, you can use this code but not suggestion

Noah Gao
  • 51
  • 3