4

I wrote a custom container with its custom iterator. Due to the specific features of the container the iterator must be evaluated lazily. For the sake of the question the relevant part of the code is the dereferencing operator of the iterator which is implemented this way

template<typename T>
struct Container
{
  vector<T> m_Inner;

  // This should calculate the appropriate value.
  // In this example is taken from a vec but in 
  //the real use-case is calculated on request
  T Value(int N)
  { m_Inner.at(N); }
}

template<typename T>
struct Lazy_Iterator
{
  mutable pair<int, T> m_Current;
  int Index
  Container<T>* C

  Lazy_Iterator(const Container& Cont, int N):
    m_Current{Index, T{}}, Index{N}, C{&Cont}
  {      }

  pair<int, T>&
  operator*() const // __attribute__((noinline)) (this cures the symptom)
  {
      m_Current.first = Index; /// Optimized out
      m_Current.second = C->Value(Index); /// Optimized out
      return m_Current;
  }

}

Because the iterator itself is a template, its functions can be freely inlined by the compiler.

When I compile the code without optimizations the returned value is updated as expected. When I use the release compiler optimization (-O2 in GCC 4.9) in some cases the compiler optimizes out the lines that I marked as Optimized out even though the m_Current member is marked as mutable. As a consequence the return value does not match the value that the iterator should be pointing at.

Is this an expected behavior? Do you know any portable way to specify that the content of that function should be evaluated even if it's marked const?

I hope the question is exhaustive enough to be useful. Please advice if more details could be helpful in this case.

Edit:

To answer one comment, this is a potential usage taken from a small test program:

Container<double> myC;
Lazy_Iterator<double> It{myC, 0}
cout << "Creation: " << it->first << " , " << it->second << endl;

auto it2 = it;
cout << "Copy: "<<  it2->first << " , " << it2->second << endl;

cout << "Pre-increment: " << (it++)->first << " , " << it->second << endl;
cout << "Post-increment: " << (++it)->first << " , " << it->second << endl;
cout << "Pre-decrement: " << (it--)->first << " , " << it->second << endl;
cout << "Post-decrement: " << (--it)->first << " , " << it->second << endl;
cout << "Iterator addition: " << (it+2)->first << " , " << (it+2)->second << endl;
cout << "Iterator subtraction: "<< (it-2)->first << " , " << (it-2)->second << endl;

reverse_iterator<Lazy_Iterator> rit{it};
cout << "Reverse Iterator: " << rit->first << " , " << rit->second << endl;

auto rit2 = rit;
cout << "Reverse Iterator copy: " << rit2->first << " , " << rit2->second << endl;

cout << "Rev Pre-increment: " << (rit++)->first << " , " << rit->second << endl;
cout << "Rev Post-increment: " << (++rit)->first << " , " << rit->second << endl;
cout << "Rev Pre-decrement: " << (rit--)->first << " , " << rit->second << endl;
cout << "Rev Post-decrement: " << (--rit)->first << " , " << rit->second << endl;
cout << "Rev Iterator addition: " << (rit+2)->first << " , " << (rit+2)->second << endl;
cout << "Rev Iterator subtraction: "<< (rit-2)->first << " , " << (rit-2)->second << endl;

The test results are as expected for all tests except the last two lines

The last two lines of the test break down when the optimization is turned on.

The system actually work well and is not so more dangerous than any other iterator. Of course it will fail if the container gets deleted under his nose and it's probably safer to use the returned value by copy and not just keep the reference around, but this is off-topic

Triskeldeian
  • 590
  • 3
  • 18

4 Answers4

2

"Optimized out even though the m_Current member is marked as mutable"

This tells me that you're assuming the optimizer cares about mutable. It doesn't. const and mutable have been stripped by an earlier phase of compilation.

Why then does the optimzier remove the two statements, if they're inlined ? I suspect that after inlining, the optimizer can prove that the two writes are a no-op, either as the m_Current variable must hold the right value already, or because the subsequent usage of m_Current makes it moot. Trivially the following case make those writes a no-op:

Lazy_Iterator LI = foo(); // Theoretically writes
*LI = bar(); // Overwrites the previous value.
MSalters
  • 173,980
  • 10
  • 155
  • 350
  • If during optimization the compiler ignores the fact that the value stored in the class could be changed because in the meantime there were only const member function that were called, then it could make sense, but that would assume that the optimizer doesn't know of the constness of a data member but knows about the constness of a method. Also I tried removing the const to the operators and the mutable from the member but there is no change – Triskeldeian Mar 03 '16 at 12:46
  • @Triskeldeian: Again, an optimizer doesn't even have that `const member` information anymore, nor would it help. Your `T const* this` may be unable to modify member variables, but a global `T* singleton` may exist which aliases that very same object in a non-const way. Or there could be a global `int&` that aliases `this`. The optimizer has to **prove** there's no possible concurrent access. – MSalters Mar 03 '16 at 14:32
  • That's what I actually meant in my comment. Anyway as it happens the problem was not the optimization itself but the fact that the GCC reverse_iterators are incompatible with a lazy iterator implementation. Apparently the optimization just rearranged the stack memory so that the error became apparent – Triskeldeian Mar 03 '16 at 14:46
2

There is a problem with difference between physical iterator held by reverse_iterator (what is returned by .base()) and logical value it points to: they are off-by-one. reverse_iterator might do return *(--internal_iterator); on dereference, which leaves you with dangling reference to internals of destroyed function-local temporary.

After another reading of standard, I found out that it has additional requirements to avoid such scenario, read note.

Also I found that GCC 4.9 standard library is non-compliant. It uses a temporary. So, I think it is a GCC bug.

Edit: Standard quote

24.5.1.3.4 operator*     [reverse.iter.op.star]

reference operator*() const;

1 Effects:

deref_tmp = current;  
--deref_tmp; 
return *deref_tmp;

2 [ Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid returning a reference that persists beyond the lifetime of its associated iterator. (See 24.2.) —end note ]

Follow-up reading: Library Defect Report 198.

And it seems that it is returned to old behaviour.

Late edit: P0031 was voted in C++17 Working Draft. It states that reverse_iterator uses temporary, not member to hold intermediate value.

Revolver_Ocelot
  • 8,609
  • 3
  • 30
  • 48
  • This looks very interesting. But wouldn't a dangling reference cause a an access violation? And why this only appears when the optimization is switched on? – Triskeldeian Mar 03 '16 at 12:47
  • @Triskeldeian It is UB, but it usually does not lead to crash. It still points to some memory owned by program, but who knows what is written in this memory. As local variables are placed on stack which is used by virtually anything, memory is overwritten as soon as function exits – Revolver_Ocelot Mar 03 '16 at 12:51
  • Thank you very much. I think you hit the bullseye. Until know I was always testing with T being some numerical type. I switched to std::string to give it a try and now, even with the optimization off the return value of the dereferencing of the reverse iterator is gibberish. Does that mean that I should write my own reverse iterator or do you know some good technique to solve this problem? – Triskeldeian Mar 03 '16 at 12:57
  • 1
    @Triskeldeian I have updated my post. Looks like GCC bug. – Revolver_Ocelot Mar 03 '16 at 13:02
  • In fact I spotted the same thing just when you were updating your post and I posted it as an answer to my question – Triskeldeian Mar 03 '16 at 13:08
1

Provided you have to post a compilable snippet that reproduces that problem (actually I was not able to reproduce it with GCC 4.9) I think you have undefined behaviour and that is triggered by O2 (O2 enables optimizations that can break undefined behaviours). You should have a pointer to

Container<T> 

inside the iterator.

Anyway be aware that a lazy iterator breaks the contract of std's iterators, I think a better alternative is to make a regular container of lazy values, you could this way skip to create a custom container and iterator altogether ;) (look to proxy pattern).

CoffeDeveloper
  • 7,961
  • 3
  • 35
  • 69
  • I did forget the template parameter. I corrected it. I also added more line in the test snippet. The tests that fail if the optimization is on are the last two, the arithmetic on the reverse iterator. In that case a copy of the forward iterator which is make inside of the reverse one should be dereferenced and during that dereferencing the update of the content of the iterator seems to be deleted – Triskeldeian Mar 03 '16 at 12:37
0

After a very profitable round of discussions the answer of Revolver_Ocelot pointed me to look further to the implementation of the reverse_iterators. According to his quote from the standards:

24.5.1.3.4 operator* [reverse.iter.op.star]

reference operator*() const;

1 Effects:

deref_tmp = current;  
--deref_tmp;  
return *deref_tmp;

2 [ Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid returning a reference that persists beyond the lifetime of its associated iterator. (See 24.2.) —end note ]

Looking inside of the header stl_iterator.c of the standard library as implemented by GCC 4.9 in Debian 8:

  /**
   *  @return  A reference to the value at @c --current
   *
   *  This requires that @c --current is dereferenceable.
   *
   *  @warning This implementation requires that for an iterator of the
   *           underlying iterator type, @c x, a reference obtained by
   *           @c *x remains valid after @c x has been modified or
   *           destroyed. This is a bug: http://gcc.gnu.org/PR51823
  */
  reference
  operator*() const
  {
_Iterator __tmp = current;
return *--__tmp;
  }

Notice the warning:

Warning: This implementation requires that for an iterator of the underlying iterator type, @c x, a reference obtained by @c *x remains valid after @c x has been modified or destroyed. This is a bug: http://gcc.gnu.org/PR51823

Triskeldeian
  • 590
  • 3
  • 18