Can raw pointers be used instead of iterators with STL algorithms for containers with linear storage?

Question

I have a custom vector container that internally stores item a linear array. Last night, I was trying to implement custom iterators for my class to be able to use them with STL algorithms. I have had some success that you can see in here:

Live example with custom iterators

While doing so, I discovered I can merely pass raw pointers to STL algorithm and they just seem to work fine. Here's the example without any iterators:

#include <cstddef>
#include <iostream>
#include <iterator>
#include <algorithm>

template<typename T>
class my_array{
    T* data_;
    std::size_t size_;

public:

    my_array()
        : data_(NULL), size_(0)
    {}
    my_array(std::size_t size)
        : data_(new T[size]), size_(size)
    {}
    my_array(const my_array<T>& other){
        size_ = other.size_;
        data_ = new T[size_];
        for (std::size_t i = 0; i<size_; i++)
            data_[i] = other.data_[i];
    }
    my_array(const T* first, const T* last){
        size_ = last - first;
        data_ = new T[size_];

        for (std::size_t i = 0; i<size_; i++)
            data_[i] = first[i];
    }

    ~my_array(){
        delete [] data_;
    }
    const my_array<T>& operator=(const my_array<T>& other){
        size_ = other.size_;
        data_ = new T[size_];
        for (std::size_t i = 0; i<size_; i++)
            data_[i] = other.data_[i];
        return other;
    }
    const T& operator[](std::size_t idx) const {return data_[idx];}
    T& operator[](std::size_t& idx) {return data_[idx];}
    std::size_t size(){return size_;}

    T* begin(){return data_;}
    T* end(){return data_+size_;}
};

template<typename T>
void print(T t) {
    std::cout << t << std::endl;
}

int main(){


    typedef float scalar_t;
    scalar_t list [] = {1, 3, 5, 2, 4, 3, 5, 10, 10};
    my_array<scalar_t> a(list, list+sizeof(list)/sizeof(scalar_t));

    // works!
    for (scalar_t* it = a.begin(), *end = a.end();
         it != end; ++it)
        std::cout << ' ' << *it;
    std::cout << std::endl;

    // works!
    std::for_each(a.begin(), a.end(), print<scalar_t>);
    std::cout << std::endl;

    // works!
    my_array<int> b(a.size());
    std::copy(a.begin(), a.end(), b.begin());

    // works!
    scalar_t* end = std::remove(a.begin(), a.end(), 5);
    std::for_each(a.begin(), end, print<scalar_t>);
    std::cout << std::endl;

    // works!
    std::random_shuffle(a.begin(), end);
    std::for_each(a.begin(), end, print<scalar_t>);
    std::cout << std::endl;

    // works!
    std::cout << "Counts of 3 in array = " << std::count(a.begin(), end, 3) << std::endl << std::endl;

    // works!
    std::sort(a.begin(), end);
    std::for_each(a.begin(), end, print<scalar_t>);
    std::cout << std::endl;

    // works!
    if (!std::binary_search(a.begin(), a.end(), 5))
        std::cout << "Removed!" << std::endl;

    return 0;
}

Live example without iterators

My questions here are the following:

Does this always work for containers that have linear storage? I know that this would not work for linked-lists for example.
If they do work in this situation, why should I ever go through the hassle of implementing iterators anyway? I know how iterators generalize my code and whatnot, but if this simple array is all I ever need then I don't see the point.
What are the negative issues of what I'm doing if this approach would always work? For one thing, I can see I'm breaking data encapsulation.

score 23 · Accepted Answer · edited Feb 14 '14 at 15:30

23

One of the features of iterators being based on operator-overloading, is that pointers are already random-access iterators. This was a big design win in the early days of STL, as it made it easier to use algorithms with existing code (as well as making the interface more familiar to programmers). It's perfectly reasonable to wrap an array, add typedef T* iterator; typedef const T* const_iterator, return &array[0] from your begin() and &array[size] from your end(), and then use your container with any iterator-based algorithm. As you already realise, this will work for any container where elements are contiguous in memory (such as an array).

You might implement 'real' iterators if:

You have a different-shaped container (such as a tree or list);
You want to be able to resize the array without invalidating the iterators;
You want to add debugging checks to your iterator use, for example to check if the iterator is used after being invalidated or after the container has been deleted, or to bounds-check;
You want to introduce type-safety, and make sure people can't accidentally assign an arbitrary T* to a my_array::iterator.

I'd say this last advantage alone is well worth writing a trivial wrapper class for. If you don't take advantage of C++'s type system by making different kinds of thing have different types, you might as well switch to Javascript :-)

edited Feb 14 '14 at 15:30

yuri kilochek

12,709
2
32
59

answered May 08 '13 at 17:03

Dan Hulme

14,779
3
46
95

1

careful with `&array[size]` it has much more depth to it than one would think : http://stackoverflow.com/q/26420348/893406 – v.oddou Jul 14 '15 at 03:55
6

@v.oddou No, `&array[size]` is always valid (so long as you don't dereference it), as the answers in your link explain. You need to reason about what is and isn't correct code: you can't just use mysticism. – Dan Hulme Jul 14 '15 at 12:53
According to the whole discussion in the page I link, it is sufficient to load an address in a pointer register to cause a crash, if this address is unmapped; on some architectures. So it is not only a question of not dereferencing it. There is no cargo cult here. – v.oddou Jul 15 '15 at 01:28
5

@v.oddou That's true of invalid pointers, but the whole point of the question you linked is that the C++ standard defines that a pointer to one past the end of an array is valid. The question is about the effect that has on implementations; e.g. on a machine such as you describe, the runtime would have to ensure that there is one extra space in the memory that holds the array. The very first line of the question is, "C++ standard (and C for that matter) allows to create (not dereference though) a pointer to one element past the end of the array." – Dan Hulme Jul 15 '15 at 06:59
3

@v.oddou To avoid the worry about `&array[size]`, you can always just use `array + size` – Caleth Jan 08 '19 at 15:33

score 9 · Answer 2 · answered May 08 '13 at 16:59

Yes. See Effective STL, Item 16, which demonstrates with linear storage containers you can simply take the address of an item and work with that pointer as if it pointed to a simple array.
I think you answered your own question – you probably shouldn't, if you know the simple array is all you'll ever need.
Probably the biggest issue is just that – breaking data encapsulation. Consider whether or not an abstraction such as an explicit iterator type would buy you anything versus the cost.

"if you know the simple array is all you'll ever need" -> too much constraint. "is all you need NOW" is largely sufficient. Respect YAGNI and KISS for god's sake. Otherwise upvoted. — v.oddou, Jul 15 '15 at 01:31

Mark B · Answer 3 · 2013-05-08T18:06:21.950

4

It happens that pointers provide the interface required of random access iterators (dereference, increment, addition, difference, etc) and can be treated just like iterators.

It should always work for containers with contiguous storage.
You might wish to create your own iterators for the same reason you use methods instead of all public data in your classes: To encapsulate what's happening with an interface you can modify if you need to. As long as you typedef your T* to an iterator type this is probably not a significant issue. Additionally some algorithms may benefit from an iterator that's tagged with the iterator type, which you can't do for simple pointer types.

edited May 08 '13 at 18:06

answered May 08 '13 at 16:57

Mark B

95,107
10
109
188

2

A pointer is actually a perfectly valid iterator, and provides not just most of, but the *entire* interface required. The trick is that the functionality not provided directly by a pointer type (such as defining a `value_type` typedef) are provided by a specialization of the `std::iterator_traits` class template. Just FYI :) – jalf May 08 '13 at 17:13

score 4 · Answer 4 · answered May 08 '13 at 17:11

Does this always work for containers that have linear storage?

Yes, the iterator concepts were designed so that pointers could act as iterators over arrays.

If they do work in this situation, why should I ever go through the hassle of implementing iterators anyway?

There's no good reason to define your own iterator type in this situation, unless you want to do something like bounds-checking which can't be done with a simple pointer.

One slight benefit would be that you could include nested typedefs for the iterator's traits, as some of the standard iterator types do; but using pointers these are available from std::iterator_traits<T*> anyway.

What are the negative issues of what I'm doing if this approach would always work? For one thing, I can see I'm breaking data encapsulation.

To make the interface more consistent with STL-style containers, you should define iterator and const_iterator types (typedef aliases for the pointers), and provide const overloads of begin and end; and perhaps cbegin and cend for C++11 compatiblity.

There are various other requirements that you might want to conform to; see section 23.2 of the C++ standard for the gory details. But in general, it's more important to make iterators conform to their requirements, since STL-style algorithms work with iterators rather than containers, and by using pointers you already conform to those requirements.

Can raw pointers be used instead of iterators with STL algorithms for containers with linear storage?

4 Answers4

Linked

Related