Writing a C++ interface to a dynamically allocated C struct

Question

Introduction: I am writing a C++11 application which makes extensive use of a legacy C code base. A very common pattern in the legacy code is the existence of some struct LegacyStruct which is constructed and destroyed by methods such as

build_struct(LegacyStruct *L, int arg1, int arg2)
free_struct(LegacyStruct *L)

which are basically the constructor/destructor. The model of ownership in the legacy codebase is very unique_ptr-esque, so I aim to wrap it in a memory safe, RAII-minded wrapper class as follows:

class Wrapper {
public:
    Wrapper::Wraper() : handle() {}
    Wrapper::Wrapper(int same_arg1, int same_arg2);
    Wrapper::Wrapper(const Wrapper &W) = delete;
    Wrapper::Wrapper(Wrapper &&W) : handle(std::move(W.handle)) {}
    //copy operator= and move operator= analogously
private:
    std::unique_ptr<LegacyStruct, custom_deleter> handle;

where custom_deleter calls free_struct along the lines of this question, or just a partial specialization of std::default_delete for LegacyStruct. Anyway so far so good, I think this is a common design pattern and it suits my needs well.

My question: I am having trouble adapting this pattern to the case where I am dealing with a linked list type structure of the form

typedef struct LegacyNode {
    int stack_allocated_data;
    OtherStruct *heap_allocated_data;
    LegacyNode *next;
} LegacyNode;

Again, the ownership model within the legacy code base is unique_ptr-esque: there is sole ownership of the linked list, i.e., of responsibility for freeing it appropriately. Similarly, there is a corresponding free_node(LegacyNode *N) function which frees heap_allocated_data, if necessary, and then frees the node itself.

The construction situation is quite different though. There will be a function which looks like

build_list(LegacyNode **L, int *count_p, int other_args){
    LegacyNode *newnode;

    //code allocating newnode and populating its fields

    //...and then:
    newcut->next = *L;
    *L = newcut;
    (*count_p)++;
}

And a call to build_list looks like

int list_count = 0;
LegacyNode *L = (LegacyNode *) NULL;

build_list(&L, &list_count, 99);

Edit/clarification: build_list is a static, non-exported function in the codebase, which I access by calling some other function which calls build_list possibly several times.

Thus, I would like to write a ListWrap class which stores a head node and a list length, and has copy/move operators identical to Wrapper above, i.e., there is sole ownership over the list itself, it can be moved but not copied, etc.

However, my understanding is that smart pointers are not an option in this case. With head_node as some smart pointer to a LegacyNode, I would have to pass &head_node.get() to build_list, which would corrupt the smart pointer invariants/ownership?

As it stands, my wrapper class contains a raw pointer to a head node, a method which returns the address of the head node for use by build_list, a destructor which iterates through the list calling free_node, and predicate-based erase-type method which only deletes certain elements.

Of course, modifying and clearing a linked list is CS-101 level stuff, but I still managed to waste a couple hours writing it and having memory leaks all over the place! Also, there are several other linked list structures within the legacy codebase with nearly identical usage, so I would love to be able to turn this into a class template which can be specialized with a type and a deleter, and inherited from to provide type-specific methods.

Thanks

@Yakk very true, it might have been better wording for me to say this is what I might naively want to do, but it would be a terrible idea and is disallowed as such! — Lia Stratopoulos, Dec 06 '16 at 19:52

score 2 · Accepted Answer · answered Dec 06 '16 at 19:48

However, my understanding is that smart pointers are not an option in this case. With head_node as some smart pointer to a LegacyNode, I would have to pass &head_node.get() to build_list, which would corrupt the smart pointer invariants/ownership?

Yes, that is correct, as build_list would overwrite the object at that memory location, corrupting that memory for the smart pointer. But there is another way, you can construct a std::unique_ptr with an existing pointer!

So, instead of ListWrap allocating its own objects, you have build_list allocate the objects, and then just take the pointers and wrap them up with RAII.

class ListWrap {
public:
    ListWrap(LegacyNode* head, int count);
    //...
private:
    std::unique_ptr<LegacyNode, &free_node> handle;
    int count;
};

ListWrap::ListWrap(LegacyNode* head, int count) : handle{ head }, count{ count } {}

Thank you! For some reason it did not occur to me to take ownership of the list _after_ it has been constructed, but this fits my use case(s) much better. — Lia Stratopoulos, Dec 06 '16 at 20:03

Yakk - Adam Nevraumont · Answer 2 · 2016-12-06T20:38:54.030

Here is a bunch of nodes:

struct Nodes {
  struct DeleteAllNodes {
    void operator()(LegacyNode* node)const {
      while (auto cur = node) {
        node = cur->next;
        free_node(node);
      }
    }
  };
  std::unique_ptr<LegacyNode, DeleteAllNodes> m_nodes;
};

and here are some operations. Most of them keep things things managed, except for short windows I have commented:

void push_node( Nodes& nodes, int other_args ) {
  int unused = 0;
  auto* tmp = nodes.m_nodes.get();
  build_list( &tmp, &unused, other_args );
  nodes.m_nodes.release(); // unmanaged
  nodes.m_nodes.reset(tmp); // everything managed now
}
Nodes pop_node( Nodes& nodes ) {
  if (!nodes.m_nodes) return {};
  auto* tmp = nodes.m_nodes->next; // unmanaged
  nodes.m_nodes->next = nullptr;
  Nodes retval;
  retval.m_nodes.reset(tmp); // everything managed now
  std::swap( retval.m_nodes, nodes.m_nodes );
  return retval;
}
void move_single_node( Nodes& dest, Nodes& src ) {
  Assert(src.m_nodes);
  if (!src.m_nodes) return;
  Nodes to_push = pop_node(src);
  LegacyNode** next = &(to_push.m_nodes->next);
  Assert(!*next); // shouldn't be possible, pop_node returns a single node
  *next = dest.m_nodes.release(); // unmanaged for a short period
  dest = std::move(to_push);
}
Nodes splice( Nodes backwards, Nodes forwards ) {
  while(backwards.m_nodes) {
    move_single_node( forwards, backwards );
  }
  return forwards;
}
template<class F>
void erase_if( Nodes& nodes, F&& f, Nodes prefix={} ) {
  if (!nodes.m_nodes) {
    return splice( std::move(prefix), std::move(nodes) );
  }
  Nodes tmp = pop_node( nodes );
  if ( !f(*tmp.m_nodes) ) {
    prefix = splice( std::move(tmp), prefix );
  }
  erase_if( nodes, std::forward<F>(f), std::move(prefix) );
}

The ones that take Nodes& as a first argument can be methods of Nodes.

Thank you for this answer. I have edited my question to clarify that I do not have access to `build_list` from outside the code base, rather it is a helper function that is called several times by the function that I call, in the manner which I described in the question. That being said, there are a lot of helpful ideas here which I _will_ employ in my code, in particular the splicing operation and `pop`/`remove_if`, and possibly giving the head node an iterative deleter. — Lia Stratopoulos, Dec 06 '16 at 20:48

Writing a C++ interface to a dynamically allocated C struct

2 Answers2