
Short version

Can I reinterpret_cast a std::vector<void*>* to a std::vector<double*>*?

What about with other STL containers?

Long version

I have a function to recast a vector of void pointers to a datatype specified by a template argument:

template <typename T>
std::vector<T*> recastPtrs(std::vector<void*> const& x) {
    std::vector<T*> y(x.size());
    std::transform(x.begin(), x.end(), y.begin(),
            [](void *a) { return static_cast<T*>(a); } );
    return y;
}

But I was thinking that copying the vector contents isn't really necessary, since we're really just reinterpreting what's being pointed to.

After some tinkering, I came up with this:

template <typename T>
std::vector<T*> recastPtrs(std::vector<void*>&& x) {
    auto xPtr = reinterpret_cast<std::vector<T*>*>(&x);
    return std::vector<T*>(std::move(*xPtr));
}

So my questions are:

  • Is it safe to reinterpret_cast an entire vector like this?
  • What if it was a different kind of container (like a std::list or std::map)? To be clear, I mean casting a std::list<void*> to std::list<T*>, not casting between STL container types.
  • I'm still trying to wrap my head around move semantics. Am I doing it right?

And one follow-up question: What would be the best way to generate a const version without code duplication? i.e. to define

std::vector<T const*> recastPtrs(std::vector<void const*> const&);
std::vector<T const*> recastPtrs(std::vector<void const*>&&);

MWE

#include <vector>
#include <algorithm>
#include <iostream>

template <typename T>
std::vector<T*> recastPtrs(std::vector<void*> const& x) {
    std::vector<T*> y(x.size());
    std::transform(x.begin(), x.end(), y.begin(),
            [](void *a) { return static_cast<T*>(a); } );
    return y;
}

template <typename T>
std::vector<T*> recastPtrs(std::vector<void*>&& x) {
    auto xPtr = reinterpret_cast<std::vector<T*>*>(&x);
    return std::vector<T*>(std::move(*xPtr));
}

template <typename T>
void printVectorAddr(std::vector<T> const& vec) {
    std::cout<<"  vector object at "<<&vec<<", data()="<<vec.data()<<std::endl;
}

int main(void) {
    std::cout<<"Original void pointers"<<std::endl;
    std::vector<void*> voidPtrs(100);
    printVectorAddr(voidPtrs);

    std::cout<<"Elementwise static_cast"<<std::endl;
    auto dblPtrs = recastPtrs<double>(voidPtrs);
    printVectorAddr(dblPtrs);

    std::cout<<"reinterpret_cast entire vector, then move ctor"<<std::endl;
    auto dblPtrs2 = recastPtrs<double>(std::move(voidPtrs));
    printVectorAddr(dblPtrs2);
}

Example output:

Original void pointers
  vector object at 0x7ffe230b1cb0, data()=0x21de030
Elementwise static_cast
  vector object at 0x7ffe230b1cd0, data()=0x21de360
reinterpret_cast entire vector, then move ctor
  vector object at 0x7ffe230b1cf0, data()=0x21de030

Note that the reinterpret_cast version reuses the underlying data structure.

Previously-asked questions that didn't seem relevant

These are the questions that came up when I searched for this:

reinterpret_cast vector of class A to vector of class B

reinterpret_cast vector of derived class to vector of base class

reinterpret_cast-ing vector of one type to a vector of another type which is of the same type

And the answer to these was a unanimous NO, with reference to the strict aliasing rule. But I figure that doesn't apply to my case, since the vector being recast is an rvalue, so there's no opportunity for aliasing.

Why I'm trying to do this

I'm interfacing with a MATLAB library that gives me data pointers as void* along with a variable indicating the datatype. I have one function that validates the inputs and collects these pointers into a vector:

void parseInputs(int argc, mxArray* inputs[], std::vector<void*> &dataPtrs, mxClassID &numericType);

I can't templatize this part since the type is not known until runtime. On the other side, I have numeric routines to operate on vectors of a known datatype:

template <typename T>
void processData(std::vector<T*> const& dataPtrs);

So I'm just trying to connect one to the other:

void processData(std::vector<void*>&& voidPtrs, mxClassID numericType) {
    switch (numericType) {
        case mxDOUBLE_CLASS:
            processData(recastPtrs<double>(std::move(voidPtrs)));
            break;
        case mxSINGLE_CLASS:
            processData(recastPtrs<float>(std::move(voidPtrs)));
            break;
        default:
            assert(0 && "Unsupported datatype");
            break;
    }
}
KQS
  • `processData` could still accept the `vector<void *>` and convert each `void *` to `float *` at the point where it is needed. Also consider using begin-end ranges instead of vectors for `processData` (although that doesn't solve the problem) – M.M Oct 26 '17 at 02:15

2 Answers


Given the comment that you're receiving the void * from a C library (something like malloc), it seems like we can probably narrow the problem down quite a bit.

In particular, I'd guess you're really dealing with something that's more like an array_view than a vector. That is, you want something that lets you access some data cleanly. You might change individual items in that collection, but you'll never change the collection as a whole (e.g., you won't try to do a push_back that could need to expand the memory allocation).

For such a case, you can pretty easily create a wrapper of your own that gives you vector-like access to the data--defines an iterator type, has a begin() and end() (and if you want, the others like rbegin()/rend(), cbegin()/cend() and crbegin()/crend()), as well as an at() that does range-checked indexing, and so on.

So a fairly minimal version could look something like this:

#pragma once
#include <cstddef>
#include <stdexcept>
#include <cstdlib>
#include <functional> // std::less, used by the reverse_iterator comparisons
#include <iterator>

template <class T> // note: no allocator, since we don't do allocation
class array_view {
    T *data;
    std::size_t size_;
public:
    // static_cast is sufficient (and idiomatic) for converting void* to T*
    array_view(void *data, std::size_t size_) : data(static_cast<T *>(data)), size_(size_) {}

    T &operator[](std::size_t index) { return data[index]; }
    T &at(std::size_t index) { 
        if (index >= size_) throw std::out_of_range("Index out of range"); // index == size_ is already past the end

        return data[index];
    }


    std::size_t size() const { return size_; }

    typedef T *iterator;
    typedef T const *const_iterator; // a pointer to const, not a reference
    typedef T value_type;
    typedef T &reference;

    iterator begin() { return data; }
    iterator end() { return data + size_; }

    const_iterator cbegin() const { return data; }
    const_iterator cend() const { return data + size_; }

    class reverse_iterator {
        T *it;
    public:
        reverse_iterator(T *it) : it(it) {}

        using iterator_category = std::random_access_iterator_tag;
        using difference_type = std::ptrdiff_t;
        using value_type = T;
        using pointer = T *;
        using reference = T &;

        reverse_iterator &operator++() { 
            --it;
            return *this;
        }

        reverse_iterator &operator--() {
            ++it;
            return *this;
        }

        reverse_iterator operator+(std::size_t size) const {
            return reverse_iterator(it - size);
        }

        reverse_iterator operator-(std::size_t size) const {
            return reverse_iterator(it + size);
        }

        // Distance in reverse order, so rend() - rbegin() == size()
        difference_type operator-(reverse_iterator const &r) const {
            return r.it - it;
        }

        bool operator==(reverse_iterator const &r) const { return it == r.it; }
        bool operator!=(reverse_iterator const &r) const { return it != r.it; }
        // std::less is a function object: construct it first, then call it
        bool operator<(reverse_iterator const &r) const { return std::less<T*>()(r.it, it); }
        bool operator>(reverse_iterator const &r) const { return std::less<T*>()(it, r.it); }

        // `it` points one past the element this iterator refers to
        T &operator *() const { return *(it-1); }
    };

    reverse_iterator rbegin() { return data + size_; }
    reverse_iterator rend() { return data; }    
};

I've tried to show enough that it should be fairly apparent how to add most of the missing functionality (e.g., crbegin()/crend()), but I haven't worked really hard at including everything here, since much of what's left is more repetitive and tedious than educational.
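As a possible shortcut for the reverse iterators, the standard `std::reverse_iterator` adaptor can wrap the plain pointer iterators, which also gives `crbegin()`/`crend()` almost for free. The sketch below is only an illustration: `mini_view` and `reversed_copy` are hypothetical names, and `mini_view` is a trimmed-down stand-in rather than the `array_view` class above.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical, trimmed-down non-owning view (stand-in for array_view)
template <class T>
class mini_view {
    T *data_;
    std::size_t size_;
public:
    mini_view(T *data, std::size_t size) : data_(data), size_(size) {}

    using iterator = T *;
    using const_iterator = T const *;
    // The standard adaptor supplies ++, --, +, -, comparisons and operator*
    using reverse_iterator = std::reverse_iterator<iterator>;
    using const_reverse_iterator = std::reverse_iterator<const_iterator>;

    iterator begin() { return data_; }
    iterator end() { return data_ + size_; }
    const_iterator cbegin() const { return data_; }
    const_iterator cend() const { return data_ + size_; }

    // A reverse range starts at end() and stops at begin()
    reverse_iterator rbegin() { return reverse_iterator(end()); }
    reverse_iterator rend() { return reverse_iterator(begin()); }
    const_reverse_iterator crbegin() const { return const_reverse_iterator(cend()); }
    const_reverse_iterator crend() const { return const_reverse_iterator(cbegin()); }
};

// Hypothetical helper: copies the viewed elements in reverse order
inline std::vector<int> reversed_copy(mini_view<int> v) {
    return std::vector<int>(v.rbegin(), v.rend());
}
```

For example, given `int raw[] = {1, 2, 3, 4};`, calling `reversed_copy(mini_view<int>(raw, 4))` walks the same storage backwards without any hand-written iterator class.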

This is enough to use the array_view in most of the typical vector-like ways. For example:

#include "array_view"
#include <iostream>
#include <iterator>

int main() { 
    void *raw = malloc(16 * sizeof(int));

    array_view<int> data(raw, 16);

    std::cout << "Range based:\n";
    for (auto & i : data)
        i = rand();

    for (auto const &i : data)
        std::cout << i << '\n';

    std::cout << "\niterator-based, reverse:\n";
    auto end = data.rend();
    for (auto d = data.rbegin(); d != end; ++d)
        std::cout << *d << '\n';

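    std::cout << "\nForward, counted:\n";
    for (std::size_t i=0; i<data.size(); i++) { // size() is unsigned
        data[i] += 10;
        std::cout << data[i] << '\n';
    }

    std::free(raw); // the view does not own the memory, so release it here
}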

Note that this doesn't attempt to deal with copy/move construction at all, nor with destruction. At least as I've formulated it, the array_view is a non-owning view into some existing data. It's up to you (or at least something outside of the array_view) to destroy the data when appropriate. Since we're not destroying the data, we can use the compiler-generated copy and move constructors without any problem. We won't get a double-delete from doing a shallow copy of the pointer, because we don't do any delete when the array_view is destroyed.
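To make the ownership point concrete, here is a minimal sketch under some assumptions: `tiny_view`, `free_deleter`, `malloc_buffer` and `copies_share_storage` are hypothetical names introduced only for this illustration, and the buffer is owned by a `std::unique_ptr` with a `free`-based deleter rather than by the view. Copying the view copies just the pointer and the size, so both copies see the same elements, and the memory is released exactly once by the owner.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical minimal non-owning view: no destructor, so copies are shallow
template <class T>
struct tiny_view {
    T *data;
    std::size_t size;
    T &operator[](std::size_t i) const { return data[i]; }
};

// Deleter so that unique_ptr releases malloc'd memory with free
struct free_deleter {
    void operator()(void *p) const { std::free(p); }
};
using malloc_buffer = std::unique_ptr<void, free_deleter>;

// Returns true if a write through a copied view is visible through the original
inline bool copies_share_storage() {
    malloc_buffer owner(std::malloc(4 * sizeof(int)));
    if (!owner) return false; // allocation failed

    tiny_view<int> a{static_cast<int *>(owner.get()), 4};
    for (std::size_t i = 0; i < a.size; ++i) a[i] = 0;

    tiny_view<int> b = a; // compiler-generated copy: same pointer, same size
    b[2] = 42;

    return a[2] == 42 && a.data == b.data;
    // owner frees the buffer once here; neither view attempts to
}
```

The design choice is simply to keep ownership in one place (here the `unique_ptr`) and let any number of views come and go.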

Jerry Coffin
  • In my case, I'm not reinterpreting a `void *` as an `int[]`, but rather converting a container of `void *` to a container of `int *`. For my use case (given a `std::vector<void*> voidPtrs;`) would it be fair to construct an `array_view<int*> typedPtrs(voidPtrs.data(), voidPtrs.size())` (with caveats about destructing the original `voidPtrs` vector)? – KQS Oct 26 '17 at 16:41
  • @KQS: Chances are that you can get away with it in a fair number of cases, but it's not really kosher, so to speak. It's entirely allowable for `std::vector<T>` for some particular `T` to be a specialization that works entirely differently from a normal `vector`. The obvious example is `std::vector<bool>`, which is (normally) entirely different from `std::vector<int>`, even when `bool` and `int` are the same size. – Jerry Coffin Oct 26 '17 at 16:44
  • P.S. Do you think this would be an appropriate solution to this problem: https://stackoverflow.com/questions/2434196/how-to-initialize-stdvector-from-c-style-array ? – KQS Oct 26 '17 at 16:45
  • @KQS: Maybe. If he really just wants read-only access to the original data, then yes this could work (and would clearly reduce overhead compared to almost anything that copies the data into a vector). If he wants to use the result as a real vector where he might (for example) use `push_back` or `resize` to add more elements to the collection, then he's kind of stuck with using a real vector (or some hybrid that stores a pointer to existing data, space for new elements, and an index to track what's where--maybe reasonable if elements are large/expensive to copy). – Jerry Coffin Oct 26 '17 at 16:55

No, you cannot do anything like this in Standard C++.

The strict aliasing rule says that to access an object of type T, you must use an expression of type T; with a very short list of exceptions to that.

Accessing a double * via a void * expression is not such an exception; let alone a vector of each. Nor is it an exception if you accessed the object of type T via an rvalue.
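What stays well-defined is converting the pointers one at a time with `static_cast`, as the first version in the question already does. As a rough sketch of how the const and non-const variants asked about could share one definition, the source pointee type can be deduced as a template parameter. `castPtrs` is a hypothetical name, and this version still copies the pointers into a new vector:

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper. Void is deduced as either `void` or `void const`;
// T is given explicitly, e.g. castPtrs<double>(v) or castPtrs<double const>(cv).
template <typename T, typename Void>
std::vector<T *> castPtrs(std::vector<Void *> const &src) {
    static_assert(std::is_void<Void>::value,
                  "source must hold (possibly const) void pointers");
    static_assert(std::is_const<T>::value || !std::is_const<Void>::value,
                  "cannot cast away const");

    std::vector<T *> out(src.size());
    // Each element is converted individually; no vector object is reinterpreted
    std::transform(src.begin(), src.end(), out.begin(),
                   [](Void *p) { return static_cast<T *>(p); });
    return out;
}
```

The copy costs one pointer per element, which is usually small next to the numeric work done on the data being pointed to.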

M.M
  • I'm getting these `void*` from a C library (something like `malloc`); what would be the recommended way to work with them then? – KQS Oct 26 '17 at 01:33
  • @KQS there's not really enough information in your question to answer that. Maybe you don't need to actually create the vector of `double *` – M.M Oct 26 '17 at 01:35
  • I don't need to maintain vectors of both types (that was the point I was trying to make about the rvalues). Once I've cast them from `void*` to `double*` I'm done with the `void*` vector – KQS Oct 26 '17 at 01:35
  • @KQS a C library would not return a `std::vector` in the first place – M.M Oct 26 '17 at 01:37
  • I've added more information to my question to explain the motivation behind this. – KQS Oct 26 '17 at 02:05