I have a C++ class that provides an interface to data for a number of "particles" (the context is a physics simulation). The data for each particle are stored in a struct, and the class has an array of pointers to the structs. I really don't want to mess with this storage scheme because:
- The data are stored on disk in a binary format that is not of my own devising, and writing a new function to read the files into some other storage structure would not be straightforward.
- I have a wealth of other C/C++ code designed around the same data storage scheme that will be unusable or require a major overhaul if the storage structure is changed.
Now, I want to use Python to do some visualization. The ideal scenario is having access to my data as NumPy arrays so I can use a variety of NumPy functions (histograms, sorts, binning, statistics, etc.). I have a working solution using SWIG to wrap my class into Python. The drawback is that I need to make partial copies of the data (from the structs buried in the C++ class into NumPy arrays). As my work with these simulations progresses, I'm pushing against the limits imposed by my hardware: I want to raise the number of particles to the point where the data occupy a large fraction of the available memory. So making copies is to be avoided at all costs.
Is there a way to map a NumPy array onto this mess of data? Some poking around seems to point to a "no" answer, but what if I relax my "no copy" requirement a bit and allow myself an extra array of pointers? I'll sketch out what I'm thinking:
struct particle_data
{
    double x[3];
    double vx[3];
    //more data
};
class Snap
{
    struct particle_data *P; //this gets allocated, so data is accessed as P[i].x[j] and so on
    //a bunch of other functions, flags, etc.
};
What I'm thinking is that I can create an array of pointers, e.g.
double **x0;
//of course allocate some memory for the array here...
for(int i=0; i<max; i++)
{
    x0[i] = &P[i].x[0];
}
And hopefully somehow get this to play nicely in Python as a NumPy array of doubles. If I'm especially lucky it will be possible to avoid making similar arrays x1 and x2, since x0[i]+1 = x1[i] and x0[i]+2 = x2[i].
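To make the pointer idea concrete, here is a minimal sketch of what I have in mind. It assumes a cut-down particle_data (the "more data" fields are left out) and uses a std::vector in place of the allocated P. The helper names build_x0 and x_stride_bytes are made up for illustration and are not part of my actual class. The second helper checks something I suspect matters for any strided view: the x[0] entries of neighbouring particles sit a constant number of bytes apart.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for the struct above; the "more data" fields are
// omitted, which changes sizeof(particle_data) but not the reasoning.
struct particle_data
{
    double x[3];
    double vx[3];
};

// Hypothetical helper: one pointer per particle, each aimed at x[0].
// Only the pointers are stored; no particle data is copied.
std::vector<double*> build_x0(std::vector<particle_data>& P)
{
    std::vector<double*> x0(P.size());
    for (std::size_t i = 0; i < P.size(); i++)
    {
        x0[i] = &P[i].x[0];
    }
    return x0;
}

// Hypothetical helper: distance in bytes between x[0] of particle 0 and
// x[0] of particle 1. Needs at least two particles.
std::ptrdiff_t x_stride_bytes(const std::vector<particle_data>& P)
{
    assert(P.size() >= 2);
    const char* a = reinterpret_cast<const char*>(&P[0].x[0]);
    const char* b = reinterpret_cast<const char*>(&P[1].x[0]);
    return b - a;
}
```

With this layout, x0[i] + 1 and x0[i] + 2 point at P[i].x[1] and P[i].x[2], so separate x1 and x2 arrays would not be needed, and the stride between particles equals sizeof(particle_data).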
I have no idea if this is possible or how to set it up, though. In a perfect world I can stick with SWIG, but I have a hunch that this will involve writing some wrappers myself, if it's possible.