
We are developing a project that creates a lot of typed memoryviews backed by NumPy arrays. The arrays are often small, so creating each memoryview from an `np.ndarray` has a large impact on performance because of how much interaction with Python it involves. I found that there are some “tricks” to speed up the initialization of 1D numeric arrays using CPython arrays and the `clone` function (found in What is the recommended way of allocating memory for a typed memory view?), but we are stuck on generalizing this to 2D numeric arrays. Is that possible? If not, what is another way to create 2D arrays faster than with `np.ndarray`? We tried Cython arrays, but they were even slower to initialize than NumPy ones. Is managing the memory manually the only solution? If so, can you give me some tips on doing that in the 2D case? I find the documentation on this a bit limited, since I know very little C (but I'm trying to improve on this!). Thanks in advance for the help.

Tortar
  • Posting as a comment because I don't really have an interest in doing the profiling work that a good answer would require. But a few options to consider: 1) With `malloc`, you allocate a 1D array of suitable size and then just cast the pointer to a 2D memoryview (``). 2) To handle deallocation automatically, [writing your own buffer protocol class](http://docs.cython.org/en/latest/src/userguide/buffer.html) could be a good idea. 3) Otherwise you could go `array` -> *Python* memoryview -> call `cast` to change the shape -> Cython memoryview. This might be too slow, but it could work. – DavidW Jan 28 '23 at 22:04
  • Given that you seem to be relatively new to this kind of thing, have you looked for existing solutions that have the features you're after? Depending on what you require, interfacing with some other service from Python, or using a data structure like NetCDF through existing, optimised libraries, may be a far better choice than trying to roll your own. – Grismar Jan 28 '23 at 23:51
  • Thanks @DavidW for the help. I tried the first solution, but it seems like casting the pointer costs a lot (I did something like `cdef arrptr = malloc(sizeof(double) * L * 2); cdef memview[:, :] arrptr`, hope it's what you intended). Also, how should I cast the shape? – Tortar Jan 29 '23 at 02:17
  • @Grismar I'm actually trying to optimize an existing pure Python library using Cython, so I don't think I can use something different (at least not without diving into something even more complicated :D) – Tortar Jan 29 '23 at 02:29
  • Digging a bit deeper, option 1) (and probably also option 3)) can't work, since it's the memview_malloc approach described here: https://stackoverflow.com/questions/18462785/what-is-the-recommended-way-of-allocating-memory-for-a-typed-memory-view, which has proven to be slow, just applied to the 2D case – Tortar Jan 29 '23 at 11:33
  • For "3" I was proposing using https://docs.python.org/3/library/stdtypes.html#memoryview.cast – DavidW Jan 29 '23 at 12:46
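For reference, option 3 from the comments (`array` -> Python memoryview -> `cast`) could look roughly like the sketch below. It is plain Python and has not been profiled against `np.ndarray`, so it makes no claim about speed. The helper name `make_2d_view` is made up for illustration. One detail worth knowing: `memoryview.cast` requires one side of each cast to be a byte format, so the reshape goes through `'B'` first.

```python
from array import array


def make_2d_view(rows, cols):
    """Hypothetical helper: zero-filled 2D view of doubles over a flat array."""
    # Flat, C-contiguous storage for rows * cols doubles, all 0.0.
    flat = array('d', [0.0]) * (rows * cols)
    # cast() cannot go directly from 'd' to 'd', so pass through bytes
    # ('B') and then back to 'd' with the desired 2D shape.
    view = memoryview(flat).cast('B').cast('d', shape=[rows, cols])
    return flat, view


flat, view = make_2d_view(3, 2)
view[1, 0] = 3.5            # writes through to the underlying array
print(view.shape, flat[2])
```

The resulting object exposes a 2D buffer of doubles, so on the Cython side it should be assignable to a typed memoryview such as `double[:, ::1]`. Whether that beats the NumPy route for small arrays would have to be measured.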

0 Answers