2

I have a complex class, that holds a big block of double[2]-type data managed by a smart pointer like: std::unique_ptr<double[2]> m_data; I cannot change the type of the data structure.

I am using a library that gives me a function with the following signature: bool func_in_lib(std::vector<double>& data, double& res). I cannot change the signature of this function.

I want to pass the data managed by the unique_ptr to the function expecting a vector<double>& without breaking the connection to my complex class. I want the function to work directly on my m_data, rather than copying the data into a std::vector<double> and then copying it back into my complex class, because I have to do this many times.

Is there any way to do this?


Here is some code that shows the semantics I want to have. The line I am concerned with is

vector<double> access_vec = /* give access to my_data via vector interface */;


#include <iostream>
#include <memory>
#include <vector>

using namespace std;

//--------------------------------------------------------------------------//
//--- This function is given, I cannot change its signature.
bool
func_in_lib(std::vector<double>& data, double& res) {
  //--- check some properties of the vector
  if (data.size() < 10)
    return false;
  //--- do something magical with the data
  for (auto& d : data)
    d *= 2.0;
  res = 42.0;
  return true;
}

//--------------------------------------------------------------------------//
struct DataType {
  double a = 1.0;
  double b = 2.0;
  double c = 3.0;
};

//--------------------------------------------------------------------------//
ostream&
operator<<(ostream& out, const DataType& d) {
  out << d.a << " " << d.b << " " << d.c << endl;
  return out;
}

//--------------------------------------------------------------------------//
int
main(int argc, char const* argv[]) {
  int count = 20;
  //--- init and print my data
  unique_ptr<DataType[]> my_data = make_unique<DataType[]>(count);
  for (int i = 0; i < count; ++i)
    cout << my_data.get()[i];
  //---
  double         result     = 0.0;
  vector<double> access_vec = /* give access to my_data via vector interface */;
  func_in_lib(access_vec, result);

  return 0;
}

Thomas Wilde
  • The purpose of unique_ptr will be dead if some other variable or pointer changes its value. why not use the raw pointer or shared for this particular case. – Umar Farooq Jun 19 '20 at 09:36
  • "*I want the function to work directly on my m_data and not copy the data into a std::vector*" - `std::vector` is a class and there isn't any way to convert a raw `unique_ptr` array to a `std::vector` other than copying it. (It doesn't take values by references either) Maybe you could use `std::vector>` but that would require you to change the signature but `std::vector`, as far as I know, will not be able to achieve what you want. – Ruks Jun 19 '20 at 09:44
  • @Umar: I do not want to pass the unique_ptr, I only want to pass the data. I can access the raw pointer with `my_data.get()`. The next step is my problem. I cannot pass this raw pointer to the function, because it expects a reference to a vector. I do not know how to "convert" or cast this pointer into a `std::vector` without copying it. – Thomas Wilde Jun 19 '20 at 09:46
  • @Ruks: I am afraid you are right. – Thomas Wilde Jun 19 '20 at 09:48
  • Why is a vector passed to `func_in_lib` passed by _non-const_ reference? Is that function supposed to fill/update the vector? – Daniel Langr Jun 19 '20 at 10:43
  • @Daniel: Yes it is. The function in the `func_in_lib` runs some optimization algorithm and manipulates the content of the vector. – Thomas Wilde Jun 19 '20 at 10:54
  • @ThomasWilde Then, there is no way how to "connect" it to a unique pointer. I see the only option in copying. Your data structure is simply not compatible with that function. – Daniel Langr Jun 19 '20 at 11:03
  • Short answer: no, there is no way to do what you want. C++ does not work this way. `std::vector` always manages its own memory, and will allocate its own memory for its contents. There is no way to "borrow" someone else's memory for a `std::vector`. One of vector's methods is called `resize()`, I'm sure you know what it does. What do you expect this `resize()` to do, with a fixed size array of two doubles, that cannot be changed? Just this simple thought experiment should make you realize that you will need to find some other solution for what you want to do, this cannot be done with a vector. – Sam Varshavchik Jun 19 '20 at 11:10
  • If you can't change `func_in_lib` then what can you change? `DataType` and/or the class that manages it? Could the implementation of `DataType` be changed such that it's really just a proxy/view over 3 contiguous doubles in a `std::vector`? – G.M. Jun 19 '20 at 11:16
  • @G.M. I think this is the best solution. I *can* change the class that manages `DataType`. So I have to rewrite this class to use a `std::vector` instead of a `std::unique_ptr`. – Thomas Wilde Jun 19 '20 at 11:38
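
For reference, the copy-based fallback that the comments above converge on can be sketched as follows. This is only a minimal illustration: the helper name call_with_copy and the reusable scratch vector are assumptions made here, and it relies on DataType being exactly three packed doubles (checked by the static_assert). Reusing one scratch vector across calls at least avoids a fresh allocation each time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

struct DataType {
  double a = 1.0;
  double b = 2.0;
  double c = 3.0;
};

// The byte-wise copy below is only valid if there is no padding.
static_assert(sizeof(DataType) == 3 * sizeof(double),
              "DataType must be three packed doubles");

// Stand-in for the library function from the question.
bool func_in_lib(std::vector<double>& data, double& res) {
  if (data.size() < 10)
    return false;
  for (auto& d : data)
    d *= 2.0;
  res = 42.0;
  return true;
}

// Hypothetical helper: copy in, call the library, copy back.
// 'scratch' is kept by the caller so its allocation is reused between calls.
bool call_with_copy(DataType* items, std::size_t count,
                    std::vector<double>& scratch, double& res) {
  scratch.resize(count * 3);
  std::memcpy(scratch.data(), items, count * sizeof(DataType));
  const bool ok = func_in_lib(scratch, res);
  // Only copy back if the library left the element count untouched.
  if (scratch.size() == count * 3)
    std::memcpy(items, scratch.data(), count * sizeof(DataType));
  return ok;
}
```

A caller would hold one std::vector<double> next to the unique_ptr and pass my_data.get() together with the element count on every call.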

3 Answers

2

tl;dr: Not possible in a standard-compliant way.

It's actually almost possible, but std::allocator limitations block your way. Let me explain.

  • An std::vector "owns" the memory it uses for element storage: A vector has the right to release the memory (e.g. on destruction, or destruction-after-move, or a .resize(), or a push_back etc.) and reallocate elsewhere. If you want to maintain ownership by your unique_ptr, you can't allow that to happen. And while it's true your mock-implementation of func_in_lib() doesn't do any of that - your code can't make these assumptions because it must cater to the function's declaration, not its body.

But let's say you're willing to bend the rules a little, and assume that the vector won't replace its allocated memory while running. This is legitimate, in the sense that - if you were able to pass the memory for the vector to use somehow, and it replaced the memory region, you could detect that when func_in_lib() returns, and then either fix things in the unique_ptr or throw an exception (depending on whether other places in your code hold a pointer to the discarded memory). Or - let's suppose that func_in_lib() took a const std::vector<double>& instead of a non-const reference. Our path would still be blocked. Why?

  • std::vector manages memory through an allocator object. The allocator is a template parameter, so in theory you could use a vector where the allocator does whatever you want - for example, starting with pre-allocated memory (which you give it - from unique_ptr::get()), and refusing to ever reallocate any memory, e.g. by throwing an exception. And since one of the std::vector constructors takes an allocator of the appropriate type - you could construct your desired allocator, create a vector with it, and pass a reference to that vector.

But alas - your library is cruel. func_in_lib isn't templated and can only take the default template parameter for its allocator: std::allocator.

  • The default allocator used for std::vector and other standard library containers is std::allocator. Now, allocators are a crooked idea generally, in my opinion; but std::allocator is particularly annoying. Specifically, it can't be constructed using a pre-existing memory region for it to use; it only ever holds memory it has allocated itself - never memory you gave it.

So, you'll never get an std::vector to use the memory you want to.
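
To make the allocator argument concrete, here is a minimal sketch of such a "borrowing" allocator. The names BorrowingAllocator, BorrowedVector and view_as_vector are made up for this illustration, and the code is not hardened (for example, it assumes the standard library only asks the allocator for the element storage itself). The static_assert at the end shows the catch: the resulting type is not std::vector<double>, so it still cannot be passed to func_in_lib.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical allocator that hands out one caller-owned buffer.
template <class T>
struct BorrowingAllocator {
  using value_type = T;

  void*       buffer = nullptr;  // caller-owned storage, never freed here
  std::size_t bytes  = 0;        // size of that storage in bytes

  BorrowingAllocator(void* buf, std::size_t n_bytes) noexcept
      : buffer(buf), bytes(n_bytes) {}

  template <class U>
  BorrowingAllocator(const BorrowingAllocator<U>& other) noexcept
      : buffer(other.buffer), bytes(other.bytes) {}

  // Return the borrowed buffer; refuse anything that does not fit.
  T* allocate(std::size_t n) {
    if (n * sizeof(T) > bytes)
      throw std::bad_alloc();
    return static_cast<T*>(buffer);
  }

  // The buffer belongs to someone else, so freeing is a no-op.
  void deallocate(T*, std::size_t) noexcept {}

  // Default-insertion leaves the existing value in the buffer untouched.
  template <class U>
  void construct(U*) noexcept {}

  // Any other construction behaves as usual.
  template <class U, class... Args>
  void construct(U* p, Args&&... args) {
    ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
  }
};

template <class T, class U>
bool operator==(const BorrowingAllocator<T>& a,
                const BorrowingAllocator<U>& b) noexcept {
  return a.buffer == b.buffer;
}

template <class T, class U>
bool operator!=(const BorrowingAllocator<T>& a,
                const BorrowingAllocator<U>& b) noexcept {
  return !(a == b);
}

using BorrowedVector = std::vector<double, BorrowingAllocator<double>>;

// Wrap n existing doubles in a vector without copying them.
BorrowedVector view_as_vector(double* data, std::size_t n) {
  return BorrowedVector(n, BorrowingAllocator<double>(data, n * sizeof(double)));
}

// The allocator is part of the type, so this is NOT a std::vector<double>.
static_assert(!std::is_same<BorrowedVector, std::vector<double>>::value,
              "a vector with a custom allocator is a different type");
```

Any operation that needs more capacity than the borrowed buffer provides makes allocate() throw std::bad_alloc, which is the "refuse to reallocate" behaviour described above.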

So what to do?

  1. Option 1: Your hack:

    • Figure out the concrete layout of std::vector on your system
    • Manually set field values to something useful
    • Use reinterpret_cast<std::vector>() on your raw data.
  2. Option 2: malloc() and free() hooks (if you're on a Unix-like system and/or using a compiler which uses libc)

    • See: Using Malloc Hooks

      The idea is to detect the allocation request from the std::vector you create, and give it your own unique_ptr-controlled memory instead of actually allocating anything. And when the vector asks to free the memory (e.g. on destruction), you do nothing.

  3. Option 3: Switch libraries. The library exposing func_in_lib is poorly written. Unless it is a very niche library, I'm sure there are better alternatives. In fact, perhaps you could do a better job writing it yourself.

  4. Option 4: Don't use that particular function in the library; stick to lower-level, simple primitives in the library and implement func_in_lib() using those. Not always feasible, but may be worth a shot.

einpoklum
  • Thank you for the detailed answer. Here are some thoughts of mine: 2) that could be a solution because I work on a Linux system. 3) FYI it is the ["NonLinear Optimization"](https://nlopt.readthedocs.io/en/latest/NLopt_Deprecated_API_Reference/) library. The guys do a great job implementing the optimization algorithms. I currently use the C++ interface. There is also a C interface, that gives a pointer/memory based interface. 4) I thought about that. Unfortunately most of the code is placed into one giant object type, with a lot of private members. It's easier to reimplement *my* classes. – Thomas Wilde Jun 19 '20 at 16:01
  • @ThomasWilde: So, it's possible that the library itself is great, but the C++ wrappers are poor. A known phenomenon. However... it seems [the library is FOSS](https://github.com/stevengj/nlopt/), so you could just use a custom version of it. Or you could ask them to change that interface to taking a pair of iterators, or an `std::span`, or even a pointer+size if they want no templates but compatibility with older C++ versions. – einpoklum Jun 19 '20 at 16:24
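
As an illustration of the interface change suggested in the last comment, a function taking a pair of iterators could look roughly like this. func_on_range is a hypothetical name and not part of NLopt; the body just mirrors the mock func_in_lib from the question. Such a signature accepts a std::vector<double> and a plain block of doubles alike, without any copying.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical range-based variant of func_in_lib (not a real library API).
template <class Iter>
bool func_on_range(Iter first, Iter last, double& res) {
  //--- check some properties of the range
  if (std::distance(first, last) < 10)
    return false;
  //--- do something magical with the data, in place
  for (; first != last; ++first)
    *first *= 2.0;
  res = 42.0;
  return true;
}
```

A caller with a vector passes v.begin() and v.end(); a caller with a raw block of n doubles passes p and p + n.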
1

Together with a colleague of mine I found two solutions to my problem.

Solution 1 - The hacky one

The idea is to exploit the layout of the underlying std::vector<double> implementation, which in my case consists of three pointer members:

  1. start address of the data section
  2. end address of the data section
  3. address of the current maximum capacity of the data section

So I build a struct containing these three addresses and use a reinterpret_cast to a std::vector. This works with the current implementation of std::vector on my machine, but the layout is not guaranteed and can vary with the standard library implementation and version.

The nice thing here is that I can use the interface of std::vector without actually creating one, and I do not have to copy the data into a std::vector. I can also expose just a part of the data stored in my complex class: the pointers I put into the struct control which part gets manipulated.


This solves my problem, but it is a hack. I can use it, because the code is only relevant for myself. I still post it, because it could be of interest for others.

#include <iostream>
#include <memory>
#include <vector>

using namespace std;

//--------------------------------------------------------------------------//
//--- This function is given, I cannot change its signature.
bool
func_in_lib(std::vector<double>& data, double& res) {
  //--- check some properties of the vector
  if (data.size() < 10)
    return false;
  //--- do something magical with the data
  for (auto& d : data)
    d *= 2.0;

  res = 42.0;
  return true;
}

//--------------------------------------------------------------------------//
struct DataType {
  double a = 1.0;
  double b = 2.0;
  double c = 3.0;
};

//--------------------------------------------------------------------------//
ostream&
operator<<(ostream& out, const DataType& d) {
  out << d.a << " " << d.b << " " << d.c << endl;
  return out;
}

//--------------------------------------------------------------------------//
int
main(int argc, char const* argv[]) {
  int count = 20;
  //--- init and print my data
  unique_ptr<DataType[]> my_data = make_unique<DataType[]>(count);
  for (int i = 0; i < count; ++i)
    cout << my_data.get()[i];
  
  //--------------------------------------------------------------------------//
  // HERE STARTS THE UGLY HACK, THAT CAN BE ERROR-PRONE BECAUSE IT DEPENDS ON
  // THE UNDERLYING IMPLEMENTATION OF std::vector<T>
  //--------------------------------------------------------------------------//
  struct VecAccess {
    double* start = nullptr; // address to the start of the data
    double* stop0 = nullptr; // address to the end of the data
    double* stop1 = nullptr; // address to the capacity of the vector
  };

  //---
  DataType*       p_data = my_data.get();
  VecAccess       va{ &(p_data[0].a),                //points at the 'front' of the vector
                      &(p_data[count - 1].c) + 1,    //points at the 'end' of the vector
                      &(p_data[count - 1].c) + 1 };
  vector<double>* p_vec_access = reinterpret_cast<vector<double>*>(&va);
  //--------------------------------------------------------------------------//
  // HERE ENDS THE UGLY HACK.
  //--------------------------------------------------------------------------//

  //---
  double dummy = 0.0;   // this is only relevant for the code used as minimum example
  func_in_lib(*p_vec_access, dummy);

  //--- print the modified data
  for (int i = 0; i < count; ++i)
    cout << my_data.get()[i];

  return 0;
}
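
Since the hack above stands or falls with the layout of std::vector, a small guard can be run before relying on it. The following is only a sketch under the same assumptions as the hack itself (three pointer members in the order begin, end, end of capacity); the function name vector_looks_like_three_pointers is made up here, and a passing check does not make the reinterpret_cast well-defined, it only catches an obviously different layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct DataType {
  double a = 1.0;
  double b = 2.0;
  double c = 3.0;
};

// The hack treats DataType[] as a flat run of doubles, so no padding allowed.
static_assert(sizeof(DataType) == 3 * sizeof(double),
              "DataType must be three packed doubles");

// Returns true if a real std::vector<double> on this platform stores exactly
// {begin, end, end-of-capacity} as its first three pointer-sized fields.
bool vector_looks_like_three_pointers() {
  if (sizeof(std::vector<double>) != 3 * sizeof(double*))
    return false;

  std::vector<double> probe;
  probe.reserve(8);      // make capacity differ from size
  probe.assign(5, 0.0);  // size 5, capacity at least 8

  // Read the raw bytes of the vector object as three pointers.
  double* fields[3];
  std::memcpy(fields, static_cast<const void*>(&probe), sizeof(fields));

  return fields[0] == probe.data() &&
         fields[1] == probe.data() + probe.size() &&
         fields[2] == probe.data() + probe.capacity();
}
```

Calling this once at start-up and refusing to continue when it returns false would at least turn a silent memory corruption into an early, visible failure.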


Update: Analyzing the assembler code of the second solution shows, that a copy of the content is performed, even though the copy-constructor of the data objects is not called. The copy process happens at machine code level.

Solution 2 - The move semantic

For this solution I have to mark the Move-Constructor of DataType with noexcept. The key idea is not to treat the DataType[] array as a std::vector<double>. Instead we treat the std::vector<double> as a std::vector<DataType>. We can then move the data into this vector (without copying), send it to the function, and move it back afterwards.

The data is not copied but moved into the std::vector, which is faster. Also relevant for my case: I can again take just a part of the data stored in my complex class. The drawback of this solution is that I have to create additional storage of the correct size for the moved data.

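#include <algorithm>  // std::move(first, last, out)
#include <iostream>
#include <iterator>   // std::back_inserter, std::begin, std::end
#include <memory>
#include <utility>
#include <vector>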

using namespace std;

//--------------------------------------------------------------------------//
//--- This function is given, I cannot change its signature.
bool
func_in_lib(std::vector<double>& data, double& res) {
  //--- check some properties of the vector
  if (data.size() < 10)
    return false;
  //--- do something magical with the data
  for (auto& d : data)
    d *= 2.0;

  res = 42.0;
  return true;
}

//--------------------------------------------------------------------------//
class DataType {
public:
  double a = 1.0;
  double b = 2.0;
  double c = 3.0;

  // clang-format off
  DataType() = default;
  DataType(DataType const&) = default;
  DataType(DataType&&) noexcept = default;
  DataType& operator=(DataType const&) = default;
  DataType& operator=(DataType&&) noexcept  = default;
  ~DataType()  = default;
  // clang-format on
};

//--------------------------------------------------------------------------//
ostream&
operator<<(ostream& out, const DataType& d) {
  out << d.a << " " << d.b << " " << d.c << endl;
  return out;
}

//--------------------------------------------------------------------------//
int
main(int argc, char const* argv[]) {
  int count = 20;
  //--- init and print my data
  unique_ptr<DataType[]> my_data = make_unique<DataType[]>(count);
  for (int i = 0; i < count; ++i)
    cout << my_data.get()[i];
  //---
  vector<double> double_vec;
  double_vec.reserve(count * 3);
  //--- here starts the magic stuff
  auto& vec_as_datatype = *reinterpret_cast<vector<DataType>*>(&double_vec);
  auto* start_mv        = &(my_data.get()[0]);
  auto* stop_mv         = my_data.get() + count;  // one past the last element
  //--- move the content to the vec
  move(start_mv, stop_mv, back_inserter(vec_as_datatype));
  //--- call the external func in the lib
  double dummy = 0.0; // is only needed for the code of the example
  func_in_lib(double_vec, dummy);
  //--- move the content back
  move(begin(vec_as_datatype), end(vec_as_datatype), start_mv);
  //--- print the modified data
  for (int i = 0; i < count; ++i)
    cout << my_data.get()[i];
}
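
The observation from the update (that the "move" is really a copy) can also be checked without reading assembler. A minimal sketch, using a helper name source_survives_move that is made up for this illustration: DataType is trivially copyable, so moving its elements copies the doubles and leaves the source unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

struct DataType {
  double a = 1.0;
  double b = 2.0;
  double c = 3.0;
};

// For a trivially copyable type, a move is a plain copy of its bytes.
static_assert(std::is_trivially_copyable<DataType>::value,
              "DataType has nothing that could be moved instead of copied");

// Moves two elements into a vector and reports whether the source
// still holds its old values afterwards.
bool source_survives_move() {
  DataType src[2];
  src[0].a = 7.0;

  std::vector<DataType> dst;
  std::move(std::begin(src), std::end(src), std::back_inserter(dst));

  return src[0].a == 7.0 && dst[0].a == 7.0 && dst.size() == 2;
}
```

So for a type like this, Solution 2 costs the same two passes over the data as an explicit copy in and copy out.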
Thomas Wilde
0

This is not a real answer, because it does not directly solve your problem, but nobody has mentioned the C++17 polymorphic allocators and std::pmr::vector, which can easily do half of the work.

Unfortunately, there is no way to get from there back to a plain std::vector.

I also came across an article on Bartek's coding blog, from which I took the code snippet below:

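#include <algorithm>         // std::fill_n
#include <cctype>            // std::toupper
#include <iostream>
#include <iterator>          // std::begin, std::size, std::data
#include <memory_resource>   // pmr core types
#include <vector>            // pmr::vector

template <typename T> void MyToUpper(T& vec)    {
    for(auto & cr:vec)
        // std::toupper expects a value representable as unsigned char
        cr = static_cast<char>(std::toupper(static_cast<unsigned char>(cr)));
}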

//https://www.bfilipek.com/2020/06/pmr-hacking.html

int main() {
    char buffer[64] = {}; // a small buffer on the stack
    std::fill_n(std::begin(buffer), std::size(buffer) - 1, '_');
    std::cout << buffer << "\n\n";

    std::pmr::monotonic_buffer_resource pool{std::data(buffer), std::size(buffer)};

    std::pmr::vector<char> vec{ &pool };
    for (char ch = 'a'; ch <= 'z'; ++ch)
        vec.push_back(ch);
        
    std::cout << buffer << "\n\n";
    
    MyToUpper(vec);
    
    std::cout << buffer << '\n';
}

A possible result on Coliru (note: C++17):

_______________________________________________________________

aababcdabcdefghabcdefghijklmnopabcdefghijklmnopqrstuvwxyz______

aababcdabcdefghabcdefghijklmnopABCDEFGHIJKLMNOPQRSTUVWXYZ______

The article mentioned that the garbage part (aababcdabcdefghabcdefghijklmnop) is due to vector data reallocation while growing.

But what is interesting here is that the operation performed on the vector was indeed done on the original buffer ( abcdefghijklmnopqrstuvwxyz => ABCDEFGHIJKLMNOPQRSTUVWXYZ )

Unfortunately the std::pmr::vector would not fit your function func_in_lib(std::vector<double>& data, double& res)
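
The same idea applied to doubles, as a minimal sketch (requires C++17; the helper names double_all and pmr_vector_uses_buffer are made up for this illustration). Reserving the full capacity up front avoids the reallocation garbage seen above. The static_assert shows why this does not help with func_in_lib: std::pmr::vector<double> and std::vector<double> are different types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <type_traits>
#include <vector>

// A pmr vector uses a different allocator type, hence it is a different type.
static_assert(!std::is_same_v<std::pmr::vector<double>, std::vector<double>>,
              "std::pmr::vector<double> is not std::vector<double>");

// Generic stand-in for a library routine: doubles every element in place.
template <class Vec>
void double_all(Vec& v) {
  for (auto& d : v)
    d *= 2.0;
}

// Fills a pmr vector that lives inside 'buffer' with 1, 2, 3, ... and doubles
// the values. Returns true if the vector's storage is the buffer itself.
bool pmr_vector_uses_buffer(double* buffer, std::size_t capacity) {
  std::pmr::monotonic_buffer_resource pool{buffer, capacity * sizeof(double)};
  std::pmr::vector<double> vec{&pool};
  vec.reserve(capacity);  // one allocation, so no reallocation while growing
  for (std::size_t i = 0; i < capacity; ++i)
    vec.push_back(1.0 + static_cast<double>(i));
  double_all(vec);
  return vec.data() == buffer;
}
```

Note that double_all has to be a template to accept the pmr vector, which is exactly the flexibility the fixed func_in_lib signature lacks.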

I assume you bought the library, have no access to its source code, and cannot recompile it. Otherwise you could turn the function into a template, or maybe just ask your provider to add using std::pmr::vector; at the beginning of its code...

NGI