
I've been struggling for a few days now to find a way to wrap, with SWIG, a C struct that holds multiple variable-sized int arrays (stored as pointers). Consider the following minimal example:

typedef struct {
  size_t length;
  int    *a;
  int    *b;
} mystruct;

where both a and b are pointers to int arrays allocated somewhere in C. The size of both arrays is stored in the length member. Now, what I would really like to have is two-fold:

  1. Access to the a and b members of objects of type mystruct should be safe, i.e. an exception should be thrown if an index is out of bounds.
  2. The data in a and b must not be copied into a Python list or tuple; I want to provide __getitem__ methods instead. The actual struct consists of many such arrays, they get really huge, and I don't want to waste memory by duplicating them.
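
As a minimal sketch of what these two requirements amount to on the C/C++ side, the helper below reads one element in place and throws when the index is invalid. The name checked_at and the choice of exception types are assumptions made for illustration; they are not part of the original code or of SWIG.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not from the original post): reads element i of a
// C array in place, so nothing is copied into a list or tuple.
inline int checked_at(const int *data, std::size_t length, std::size_t i) {
  // Requirement 1: refuse unsafe access instead of reading stray memory.
  if (data == nullptr)
    throw std::invalid_argument("array pointer is NULL");
  if (i >= length)
    throw std::out_of_range("Index out of bounds");

  // Requirement 2: the element is read directly from the existing buffer.
  return data[i];
}
```

A __getitem__ implementation could delegate to such a helper, provided it has access to both the pointer and the length.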

I've seen examples of how to accomplish this with fixed-sized arrays by writing wrapper classes and templates for each member that internally store the size/length of each array individually, e.g. "SWIG interfacing C library to Python (Creating 'iterable' Python data type from C 'sequence' struct)" and "SWIG/python array inside structure". However, I assume that once I wrap a and b in a class so that they can be extended with __getitem__ methods, I will no longer have access to the length member of mystruct, i.e. the 'container' of a and b.

One thing I tried without success was to write explicit _get and _set methods:

typedef struct {
  size_t length;
} mystruct;

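%extend mystruct {
  int *a;
  int *b;
};

%{
  /* getter SWIG calls for the extended member a */
  int *mystruct_a_get(mystruct *s) {
    return s->a;
  }
  /* getter SWIG calls for the extended member b */
  int *mystruct_b_get(mystruct *s) {
    return s->b;
  }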
  ...
%}

But here, a and b would be returned as plain pointers to the entire arrays, without any control over the maximum index...

My target languages are Python and Perl 5, so I guess one could start writing complicated typemaps for each language. I've done that before for other wrappers and hope there is a more generic solution to my situation that involves only C++ wrapper classes and such.

Any help or idea is appreciated!

Edit for possible solution

So, I couldn't let it go and came up with the following (simplified) solution that more or less combines the solutions I already saw elsewhere. The idea was to redundantly store the array lengths for each of the wrapped arrays:

%{
/* wrapper for variable sized arrays */
typedef struct {
  size_t length;
  int    *data;
} var_array_int;

/* convenience constructor for variable sized array wrapper */
var_array_int *
var_array_int_new(size_t length,
                  int    *data)
{
  var_array_int *a = (var_array_int *)malloc(sizeof(var_array_int));
  a->length        = length;
  a->data          = data;

  return a;
}

/* actual structure I want to wrap */
typedef struct {
  size_t length;
  int    *a;
  int    *b;
} mystruct;
%}

/* hide all struct members in scripting language */
typedef struct {} var_array_int;
typedef struct {} mystruct;

/* extend variable sized arrays with __len__ and __getitem__ */
%extend var_array_int {
  size_t __len__() const {
    return $self->length;
  }

  const int __getitem__(int i) const throw(std::out_of_range) {
    if ((i < 0) ||
        ((size_t)i >= $self->length))
      throw std::out_of_range("Index out of bounds");

    return $self->data[i];
  }
};

/* add read-only variable sized array members to container struct */
%extend mystruct {
  var_array_int *const a;
  var_array_int *const b;
};

/* implement explicit _get() methods for the variable sized array members */
%{
  var_array_int *
  mystruct_a_get(mystruct *s)
  {
    return var_array_int_new(s->length, s->a);
  }

  var_array_int *
  mystruct_b_get(mystruct *s)
  {
    return var_array_int_new(s->length, s->b);
  }
%}

The above solution only provides read access to the variable-sized arrays and does not include any NULL checks for the wrapped int * pointers. My actual solution does include them and also uses templates to wrap variable-sized arrays of different types, but I left that out here for the sake of clarity.
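
For readers who want to see roughly what a templated variant with NULL checks could look like, here is a sketch. It is not the actual code referred to above: the class name var_array, its member names, the set method, and the decision to treat a NULL pointer as an empty array are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical non-owning view over a C array of any element type.
// It stores the pointer and length only; the data itself is never copied.
template <typename T>
class var_array {
public:
  var_array(std::size_t length, T *data) : length_(length), data_(data) {}

  // Assumption: a NULL data pointer is reported as an empty array.
  std::size_t size() const { return data_ ? length_ : 0; }

  // Read one element in place (candidate body for __getitem__).
  T get(long i) const { return data_[checked(i)]; }

  // Write one element in place (candidate body for __setitem__).
  void set(long i, const T &value) { data_[checked(i)] = value; }

private:
  // Validates i against size(); throws before data_ is ever dereferenced.
  std::size_t checked(long i) const {
    if (i < 0 || static_cast<std::size_t>(i) >= size())
      throw std::out_of_range("Index out of bounds");
    return static_cast<std::size_t>(i);
  }

  std::size_t length_;
  T *data_;
};
```

Because size() returns 0 for a NULL pointer, every index fails the bounds check in that case, so one check covers both conditions.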

I wonder if there is an easier way to do the above. Also the solution only seems to work in Python so far. Implementing something similar for Perl 5 already gives me a headache.
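
One possibly simpler shape, sketched here under assumptions and not tested with SWIG or Perl 5, is to skip the wrapper objects and expose plain accessor functions on the container itself. The names mystruct_a_len and mystruct_a_at are hypothetical, and the trade-off is that the target language no longer gets obj.a[i] syntax.

```cpp
#include <stddef.h>
#include <stdexcept>

// Same layout as the struct in the question.
typedef struct {
  size_t length;
  int    *a;
  int    *b;
} mystruct;

// Hypothetical accessor: number of readable elements in a.
// A NULL struct or NULL array pointer counts as empty.
inline size_t mystruct_a_len(const mystruct *s) {
  return (s && s->a) ? s->length : 0;
}

// Hypothetical accessor: element i of a, read in place.
// No wrapper object is created, so there is nothing to allocate or free.
inline int mystruct_a_at(const mystruct *s, long i) {
  if (i < 0 || static_cast<size_t>(i) >= mystruct_a_len(s))
    throw std::out_of_range("Index out of bounds");
  return s->a[i];
}
```

The same pair of functions would be repeated for b, or generated with a macro; since they are ordinary functions rather than language-specific special methods, they should not depend on how a given target language implements indexing.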

RaumZeit
  • On the C side of things: Your struct does not "contain the arrays"; and its use cannot be safe, since the pointers may be invalid. Only through the regimentation of the creation and use of these structs can you achieve safety. – einpoklum Mar 29 '22 at 08:36
  • That's true, I certainly used the wrong vocabulary here. `mystruct` contains the pointers `a` and `b` to the memory of my arrays, not the arrays themselves. My `C` implementation that uses the struct usually makes sure that all members are properly initialized, so `a` and `b` are either `NULL` or point to actual array data. But thanks for the remark! – RaumZeit Mar 29 '22 at 08:48
