Cython: return double * heap array from cython function to python as a 1D np.ndarray

Question

I am trying to generate time-series noise. My noise arrays hold about 350,000 elements, so they must be heap-allocated. How can I return a heap-allocated array to the Python code that called the function? I have tried converting it to an np.ndarray, but I get errors during compilation.

I am new to Cython and would love a real explanation of how this works. Everything I find online is super convoluted and not beginner-friendly.

Also, if anyone knows how to replace the np.random.normal call with a Cython version, that would be great. I can't find anything online that isn't overly complex for a simple Gaussian random number generator.

Here is my pyx file:

import numpy as np
cimport numpy as np

from libc.stdlib cimport malloc, free



# TODO turn this into a cython script. Generate function is slow for 350,000 samples
cdef class NoiseGenerator():

    cdef double mean, std, tau, start_value, previous_value
    cdef double* noise


    def __cinit__(self, double mean=0.0, double std=1.0, float tau=0.2, float start_y=0):
        self.mean = mean
        self.std = std
        self.tau = tau
        self.start_value = start_y


    cdef double sample_next(NoiseGenerator self, double dx):

        cdef double red_noise, wnoise 

        if self.previous_value is None:
            red_noise = self.start_value

        else:
            wnoise = np.random.normal(loc=self.mean, scale=self.std, size=1)
            red_noise = ((self.tau/(self.tau + dx)) * (dx*wnoise + self.previous_value))

        self.previous_value = red_noise
        return red_noise


    cdef np.array generate(NoiseGenerator self, double dx, int samples):

        self.noise = <double *>malloc(samples * sizeof(double))

        for idx in range(samples):

            noise[idx] = self.sample_next(dx)

        return nd.array(self.noise)


    def __dealloc__(NoiseGenerator self):

        free(self.noise)

This is how I want to use it from a Python script:

import numpy as np
import pyximport; pyximport.install(setup_args={'include_dirs': np.get_include()})
from CNoiseGenerator import NoiseGenerator


mean = -2.5914938426209425e-06
std=0.00024610271604726564
dx=0.0018104225094430712
samples=352036

noise = NoiseGenerator(mean, std).generate(dx, samples)

Any advice on how to refactor my class would be greatly appreciated. Again, I'm still learning here!
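As a point of reference for the refactor, here is a minimal sketch in plain Python (not Cython, so it runs without a compiler) of the recurrence that sample_next implements, writing into a buffer that is allocated once up front. The function name generate_red_noise and the seed parameter are made up for illustration; random.gauss stands in for np.random.normal, and array.array("d") stands in for the double* buffer. Because a C double attribute cannot hold None, the sketch writes the start value into slot 0 explicitly instead of testing previous_value.

```python
import random
from array import array


def generate_red_noise(dx, samples, mean=0.0, std=1.0, tau=0.2,
                       start_value=0.0, seed=None):
    """Fill a preallocated buffer of C doubles with red noise.

    Hypothetical helper: it applies the update rule from sample_next,
    red = tau / (tau + dx) * (dx * white + previous).
    """
    rng = random.Random(seed)

    # Allocate all `samples` doubles in one go (zero-initialised).
    noise = array("d", [0.0]) * samples
    if samples == 0:
        return noise

    # The first sample is the start value; no random draw is needed.
    noise[0] = start_value

    # The decay factor does not change between samples, so compute it once.
    decay = tau / (tau + dx)
    for idx in range(1, samples):
        wnoise = rng.gauss(mean, std)
        noise[idx] = decay * (dx * wnoise + noise[idx - 1])
    return noise
```

In Cython, the analogous pattern would presumably be to create the result with np.empty(samples) and fill it through a typed memoryview (double[::1]); the array is then owned by NumPy, so no malloc/free pair or __dealloc__ is needed.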

In your case https://stackoverflow.com/a/60856020/5769463 is probably the best answer from the duplicate. However, you could ask yourself why you don't create a NumPy array directly in the first place... — ead, Sep 07 '20 at 06:07
Also, concerning the other questions in your post: please don't ask multiple questions in one; ask them separately. For example, if you need an explanation of a "super complex" random number generation example, post a separate question that shows the example, so everybody knows what you are talking about, and highlight the parts you find hard to understand. — ead, Sep 07 '20 at 06:11
I was making the Numpy Array in the first place but I was not getting a significant speed increase in comparison to the non-python version. I thought maybe creating the array as a heap allocated array would give me a speedup. — Danny Diaz, Sep 07 '20 at 14:44
Yeah. I will post the gaussian random number question as a separate question later. I want to do some more reading online first. Sorry about that. — Danny Diaz, Sep 07 '20 at 14:44
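For the separate Gaussian question, one textbook option is the Box-Muller transform, which turns two uniform draws into one normal draw using only log, sqrt, and cos. The sketch below is plain Python, and the name box_muller and its uniform parameter are made up for illustration. It is not claimed to match what np.random.normal does internally; a Cython port would presumably swap in the libc.math functions and a uniform generator of your choice.

```python
import math
import random


def box_muller(mean=0.0, std=1.0, uniform=random.random):
    """Return one normally distributed sample via the Box-Muller transform.

    `uniform` is any callable returning floats in [0, 1); it is a
    parameter only so the sketch can be tested deterministically.
    """
    # random() returns values in [0, 1); flip to (0, 1] so log(u1) is finite.
    u1 = 1.0 - uniform()
    u2 = uniform()

    # Standard normal draw from the two uniform draws.
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

    # Shift and scale to the requested mean and standard deviation.
    return mean + std * z
```

Each call consumes two uniform draws and discards the second normal value that the transform could produce (the sine branch); caching it would halve the number of uniform draws at the cost of extra state.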

0 Answers