4

I am practicing implementing containers. My goal is to define the iterators begin() and end() so that I can have loops in the form of for(auto x : v). My container looks like this:

class Vector{
public:
    Vector(initializer_list<double> numbers){
        sz = numbers.size();
        elem = new double[sz];
        int i = 0;
        for (auto it = numbers.begin(); it!=numbers.end(); ++it)
            elem[i++] = *it;
    }
    ~Vector(){delete [] elem;}
    double* begin();
    double* end();
private:
    double* elem;
    int sz;
    
};

Option 1

This is how I have defined the iterators (and they work perfectly fine in my test cases)

double* Vector::begin(){
    return elem;
}

double* Vector::end(){
    return &elem[sz];
}

Option 2

This is how they are defined in A Tour of C++

double* Vector::begin(){
    return &elem[0];
}

double* Vector::end(){
    return &elem[0]+sz;
}

My question

As far as I can see both options work fine (assuming the container is non-empty). Does Option 2 have any advantages compared to Option 1 (and vice versa)? I appreciate any suggestions.

iamvegan
  • 482
  • 4
  • 15
  • 3
    `double* Vector::end() { return elem + sz; }`? – Ted Lyngmo Jun 23 '20 at 19:06
  • 2
    On a side note, your `Vector` class is violating the [Rule of 3/5/0](https://en.cppreference.com/w/cpp/language/rule_of_three) by not implementing/deleting a copy constructor and copy assignment operator, and a move constructor and move assignment operator. Other than as a learning experience, there is no real benefit to implementing a custom `Vector` class rather than using `std::vector`. – Remy Lebeau Jun 23 '20 at 19:14

2 Answers2

9

While &elem[sz] and &elem[0]+sz will wind up giving you the same result on most/all systems, the first is actually undefined behavior. When you do

&elem[sz]

you are actually doing

&*(elem +sz)

and that *, the dereference, is to an element that doesn't exist. That is undefined behavior per the C++ standard.

With

&elem[0]+sz

you get a pointer to the first element which is legal provided the pointer points to an actual array, and then you advance it to be one past the end. This is a legal and a correct way to get the end iterator provided elem is not null and points to a valid array.


Another way to do this is to just use

return elem + sz;

as it doesn't require any dereferencing.

cigien
  • 57,834
  • 11
  • 73
  • 112
NathanOliver
  • 171,901
  • 28
  • 288
  • 402
  • 1
    "*With `&elem[0]+sz` you get a pointer to the first element which is legal*" - provided `elem` is not null and `sz` is not 0, otherwise `elem[0]` aka `*elem` would be dereferencing a non-existing element. `return elem + sz;` is preferred in this situation, no dereferencing occurs at all. – Remy Lebeau Jun 23 '20 at 19:10
  • I may be mistaken about this, but I seem to remember reading somewhere that the spec specifically says that `&*` is a no-op. Though perhaps that's C rather than C++? – templatetypedef Jun 23 '20 at 19:10
  • @templatetypedef I haven't seen anything in the C++ standard about it being a non-op. A goo compiler will make it one, but I don't believe it is guaranteed. – NathanOliver Jun 23 '20 at 19:12
  • @NathanOliver I'd expect even a bad compiler to figure that one out. But I wouldn't consider it guaranteed unless someone can quote the standard. – Mark Ransom Jun 23 '20 at 19:14
  • @RemyLebeau Goood points. I've updated the answer. – NathanOliver Jun 23 '20 at 19:15
  • 1
    @templatetypedef [This](https://timsong-cpp.github.io/cppwp/expr.unary.op) is the section that covers `*` and `&` for the purposes of this question and no where does it say doing `&*` is a non-op so at best it is a compiler optimization. – NathanOliver Jun 23 '20 at 19:18
  • 1
    There was also a [discussion](https://stackoverflow.com/a/61238068/7582247) touching this and a [follow-up question](https://stackoverflow.com/questions/61238781/forcing-a-decay-of-the-array-for-the-lack-of-better-title) regarding whether `double& endref = &elem[sz];` is actually UB if `endref` is never dereferenced but only has its address taken. Neither got a definite answer. – Ted Lyngmo Jun 23 '20 at 19:31
  • 1
    @TedLyngmo Interesting. The more I look at the text, the more it puzzles me as well. `*` says it needs to be a pointer to an object type, not a object. The only other thing I can find is *Otherwise, if the operand is an lvalue of type T* from [expr.unary.op] ad I'm not sure if *is an lvalue of type T* applies to the hypothetical end object. – NathanOliver Jun 23 '20 at 19:49
  • @NathanOliver Yes, I found it very interesting too. I read about [Access outside of lifetime](https://en.cppreference.com/w/cpp/language/lifetime#Access_outside_of_lifetime) to see if there is some support for creating a reference to an "object" before the storage of the object has been allocated ... but I failed. The very first sentence seems to imply that storage must be allocated first. – Ted Lyngmo Jun 23 '20 at 20:01
3

Both of these options work, and I would be astonished if the compiler didn't generate equivalent code for each version.

I actually prefer your implementation of begin, since if elem is a pointer, then &elem[0] is redundant compared to elem.

Another option: for end, you could do something like

return begin() + size();

assuming that you have a size() member function. This doesn't require any addresses to be taken and more directly says "the position that's size() steps down from where begin() points." But that's just my opinion. :-)

Hope this helps!

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • Inside of `end()`, `return elem + sz;` would be cleaner, and avoid any unnecessary function calls. – Remy Lebeau Jun 23 '20 at 19:12
  • The advice I've gotten over the years has been to use public interfaces when possible in class implementations, with the idea of making the class easier to edit / modify in the future and to clarify intent. (I think I originally saw that in "Effective C++.") But perhaps this isn't as standard as I thought it was? – templatetypedef Jun 23 '20 at 19:15