
Some (relevant?) background: I have a class that provides iterator-like functionality in C++. It is used to decompress a large dataset. Behind the scenes, I have a CompressedData object. The iterator class iterates over the compressed data and returns uncompressed Data objects:

Data ExpandingIterator::operator*() const;

The dereference operator returns a Data and not a Data& because there is no storage anywhere for an uncompressed Data object to persist. For example, the compressed data could contain the following three entries:

  • 5 '3's,
  • 3 '1's,
  • 6 '0's

Then when you iterated over this collection with the ExpandingIterator, you would get:

3, 3, 3, 3, 3, 1, 1, 1, 0, 0, 0, 0, 0, 0

The 3s and 1s and 0s never all exist somewhere in an array at the same time. They're expanded one-at-a-time from the compressed data. Now, the real data is more complicated than simple ints. There's data, timestamps, and other methods to turn the data into other forms of data.
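
To make the setup concrete, here is a minimal sketch of such an iterator over run-length data. The layout of CompressedData (a vector of count/value pairs), the single `value` field of Data, and the helper `expandAll` are assumptions made for illustration; the real classes are richer than this.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in: the real Data carries timestamps and methods.
struct Data {
    int value;
};

// Assumed layout: each run is (repeat count, value), e.g. {5, 3} is five 3s.
using CompressedData = std::vector<std::pair<std::size_t, int>>;

class ExpandingIterator {
public:
    ExpandingIterator(const CompressedData& runs, std::size_t run)
        : runs_(&runs), run_(run), offset_(0) { skipEmptyRuns(); }

    // Returns by value: the expanded element is built on demand and is
    // not stored anywhere, so there is nothing to return a reference to.
    Data operator*() const {
        assert(run_ < runs_->size());  // must not dereference the end iterator
        return Data{(*runs_)[run_].second};
    }

    ExpandingIterator& operator++() {
        assert(run_ < runs_->size());
        if (++offset_ >= (*runs_)[run_].first) {  // current run is used up
            ++run_;
            offset_ = 0;
            skipEmptyRuns();
        }
        return *this;
    }

    bool operator!=(const ExpandingIterator& other) const {
        return run_ != other.run_ || offset_ != other.offset_;
    }

private:
    // Runs with a count of zero contribute no elements.
    void skipEmptyRuns() {
        while (run_ < runs_->size() && (*runs_)[run_].first == 0) ++run_;
    }

    const CompressedData* runs_;
    std::size_t run_;     // index of the current run
    std::size_t offset_;  // position inside the current run
};

// Expands every element, one at a time, into a vector of plain ints.
std::vector<int> expandAll(const CompressedData& runs) {
    std::vector<int> out;
    const ExpandingIterator end(runs, runs.size());
    for (ExpandingIterator it(runs, 0); it != end; ++it) {
        out.push_back((*it).value);
    }
    return out;
}
```

Note that the loop body has to spell out (*it).value, which is exactly the syntax the question is about.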

The Question: I have successfully implemented the dereference operator mentioned above. The problem is that when I use the ExpandingIterator and want to access one of the methods on the Data object, I end up writing

(*iterator).Method()

I would prefer to write

iterator->Method();

The solution to this problem seems obvious thus far: I need to implement the -> operator. This initially seemed straightforward, but I'm afraid that I'm missing something (probably something quite obvious).

At first, I tried:

Data ExpandingIterator::operator->() const{
    return **this; //Just call the regular dereference operator
}

But that left me with the error message: type 'Data' does not have an overloaded member 'operator ->'. A bit more research, and based on this answer here I believe there are a few possible solutions:

  1. Give 'Data' a -> operator too. Unfortunately I think this just pushes my misunderstanding of the problem down into the Data object where I'll run into the exact same problem I'm having here
  2. Return a pointer to a 'Data' object. Unfortunately, the Data object doesn't really exist in a location that can be pointed to so this seems like a non-starter
  3. Give Data an implicit object to pointer conversion - this seems nonsensical to me.
  4. Give up and go home. Just do (*iterator).Method() and be bested by C++.

Can someone please help clear up my misunderstanding (or lack of understanding) of how the -> operator is supposed to be implemented?

Pete Baughman
  • `operator ->` should return a pointer to the instance you wish to apply the `->`, or a (reference to) object that implements the `->` overload. – jxh Sep 16 '13 at 21:51
  • Couldn't you just store an instance of Data in the iterator class and return a pointer to that instance? It seems like that might be a good idea anyways as that instance will basically serve as a cache if the iterator is ever accessed more than once. If your decompression step is expensive, that might matter. – dsharlet Sep 16 '13 at 21:54
  • @dsharlet That's not a terrible idea. The decompression step is not expensive, it just relies on keeping track of how many decompressed items you've already iterated over. Having said that, it wouldn't really hurt to keep the current data item around somewhere – Pete Baughman Sep 16 '13 at 21:59
  • @jxh The tricky part is that I'm not sure where the instance that I want to return a pointer to should reside. – Pete Baughman Sep 16 '13 at 22:04

2 Answers


To overload operator->() you eventually need to arrive at an actual pointer, e.g., a T const* for a suitable type T. When operator->() returns something other than a pointer, the compiler calls operator->() on the returned object, repeating until it gets a pointer.
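
A small sketch of that chaining rule follows. The types `Outer` and `Inner`, the method `Doubled`, and the function `doubledThroughChain` are all hypothetical names invented for this illustration; they are not part of the question's code.

```cpp
#include <cassert>

// Hypothetical payload type with one method to call through the chain.
struct Data {
    int value;
    int Doubled() const { return value * 2; }
};

// Innermost wrapper: its operator-> returns a raw pointer, which ends
// the chain.
struct Inner {
    Data data;
    const Data* operator->() const { return &data; }
};

// Outer wrapper: its operator-> returns a class type rather than a
// pointer, so the compiler applies operator-> again to the result.
struct Outer {
    Data data;
    Inner operator->() const { return Inner{data}; }
};

int doubledThroughChain(int v) {
    Outer outer{Data{v}};
    // The short form expands to the explicit chain spelled out on the right.
    assert(outer->Doubled() ==
           outer.operator->().operator->()->Doubled());
    return outer->Doubled();
}
```

The Inner temporary created by Outer::operator->() lives until the end of the full expression, so the call to Doubled() is safe, but a pointer obtained this way should not be stored for later.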

From the sound of it, it may be reasonable in your case to return a Data object from ExpandingIterator::operator->(), filled with the data for the current position. Of course, Data would then need its own operator->(), which could return, e.g., a pointer to itself (i.e., this), assuming that Data has the member functions you actually want to call.

In principle, you could have multiple indirections before you eventually arrive at an object you actually want to return a pointer to. However, once you have returned a Data object, you have in hand the entity you want to call a member function on, so returning a pointer to it should work. Note that the rules for temporaries still apply: if you return a Data object, it will disappear at the end of the full expression.
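
As a rough sketch of this approach, assuming a Data class with a single Method() and an iterator whose state is reduced to one int (both simplifications invented for the example), it could look like this:

```cpp
#include <cassert>

// Hypothetical Data with one method; the real class has more.
class Data {
public:
    explicit Data(int v) : value_(v) {}
    int Method() const { return value_ + 1; }
    // Ends the operator-> chain by pointing at this very object.
    const Data* operator->() const { return this; }
private:
    int value_;
};

class ExpandingIterator {
public:
    explicit ExpandingIterator(int current) : current_(current) {}
    Data operator*() const { return Data(current_); }
    // Returns a temporary Data; the compiler then calls
    // Data::operator->() on it to obtain a real pointer.
    Data operator->() const { return **this; }
private:
    int current_;  // stands in for the real decompression state
};

int callThroughArrow(const ExpandingIterator& it) {
    // The temporary Data lives until the end of this full expression,
    // which is long enough for the call to Method().
    const int result = it->Method();
    assert(result == (*it).Method());
    return result;
}
```

Because the Data object is a temporary, the pointer produced by the chain is only valid inside the expression that uses it; keeping it around afterwards would leave it dangling.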

Dietmar Kühl
  • Sounds like a vote for #1. It's probably OK for the data object to disappear at the end of the expression: it's immutable. Most of the methods either compute some value based on the data item, or convert it into some other type of object. – Pete Baughman Sep 16 '13 at 22:01

You can define a helper class within your iterator that can delegate the -> operation to a Data pointer.

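class ExpandingIterator {
    //...
    // Proxy that holds the expanded Data so operator-> has something to point at.
    struct DataPtr {
        Data data;
        DataPtr () {}
        DataPtr (const Data &d) : data(d) {}
        Data * operator -> () { return &data; }
    };
    //...
    // The compiler applies DataPtr::operator-> to the returned temporary.
    DataPtr operator -> () const { return **this; }
    //...
};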
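
For reference, a self-contained variant that compiles on its own might look like the sketch below. The Data stand-in, its Method(), the int-based iterator state, and the helper `methodViaArrow` are assumptions added for illustration, and the proxy here hands out a const pointer since the expanded data is treated as read-only.

```cpp
#include <cassert>

// Hypothetical stand-in for the real Data class.
struct Data {
    int value;
    int Method() const { return value * 10; }
};

class ExpandingIterator {
    // Proxy that owns a copy of the expanded element, giving
    // operator-> an object with an address to point at.
    struct DataPtr {
        Data data;
        DataPtr(const Data& d) : data(d) {}
        const Data* operator->() const { return &data; }
    };

public:
    explicit ExpandingIterator(int current) : current_(current) {}

    Data operator*() const { return Data{current_}; }

    // The Data returned by operator* converts implicitly to DataPtr;
    // the compiler then applies DataPtr::operator-> to that temporary.
    DataPtr operator->() const { return **this; }

private:
    int current_;  // stands in for the real decompression state
};

int methodViaArrow(int v) {
    ExpandingIterator it(v);
    // Both spellings reach the same member function.
    assert(it->Method() == (*it).Method());
    return it->Method();
}
```

Callers never need to name DataPtr, so it can stay a private implementation detail of the iterator, and Data itself is left unchanged.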
jxh
  • Does this buy anything that adding the -> operator to the Data class itself doesn't? – Pete Baughman Sep 16 '13 at 22:02
  • It limits the scope of changes to the iterator, you don't have to add silliness to your `Data` class. – jxh Sep 16 '13 at 22:03
  • If you are wondering where the `Data` instance should reside, it is a member embedded within the `DataPtr` temporary, as I illustrated. – jxh Sep 16 '13 at 22:05
  • This is sensible - and with a normal compiler it'll give identical code to adding the method to the Data class. – Nicholas Wilson Sep 16 '13 at 22:07