I'm working on a real-time data processing program in Python, and I'm using Cython for the data-crunching section.

Unfortunately, initializing a NumPy array with np.zeros is rather slow. I found that creating the array once at startup and then looping over it to set every value to zero is faster than creating a new array each time.

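cimport cython

@cython.wraparound(False)
@cython.boundscheck(False)
cpdef zeros_int32(unsigned long long[::1] arr):
    # Note: despite the name, the elements are unsigned long long, not int32.
    cdef Py_ssize_t x  # index type that matches arr.shape
    with nogil:
        # Overwrite every element of the contiguous buffer with zero.
        for x in range(arr.shape[0]):
            arr[x] = 0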

However, I find this quite hacky. Is there a better way to initialize an array with all values set to 0, perhaps using malloc and free? I know that in C++ you can do something like:

uint32_t emptyArray[number_of_elements];

which creates the array with all values set to 0, but I can't seem to get it to work in Cython. Once again, I only care about the speed of the code, so it doesn't matter if it looks messy or isn't very readable, as long as it executes as fast as possible.

Any help would be much appreciated.

OM222O
  • Your C++ code creates an uninitialized array, I believe – DavidW Jan 12 '21 at 09:04
  • 1) `uint32_t emptyArray[number_of_elements];` isn't valid C/C++ code if `number_of_elements` isn't a constant. 2) It is only zero-initialized if `emptyArray` is a global variable. – ead Jan 12 '21 at 09:19
  • It is also unclear which array you would like to initialize: a C array, a Cython memory view, or a NumPy array? What is "fastest" depends on many things, like the size of the memory and the usage pattern. For example, calloc (https://stackoverflow.com/q/1538420/5769463) is lightning fast for large buffers, but you pay that cost back when you write to the returned memory. – ead Jan 12 '21 at 09:31
  • https://stackoverflow.com/questions/18462785/what-is-the-recommended-way-of-allocating-memory-for-a-typed-memory-view https://stackoverflow.com/questions/26843535/initialise-cython-memoryview-efficiently – DavidW Jan 12 '21 at 10:15
  • @ead I did not go into details about the C code since that was not the main point and I was aware of the small details, but thank you for bringing them up. Regarding the main point: A numpy array is a "memory view" in cython (you can pass them directly and interchangeably). Same thing with a C array using any method. The initialization is the only important part as I have to start with a "blank frame" and do some processing and store the results, but it happens every frame, so it really adds up. – OM222O Jan 12 '21 at 12:13
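
The two options raised in the comments can be sketched in plain C. This is a minimal sketch, not a benchmark: it assumes 64-bit unsigned elements, and the function names `zero_frame` and `new_zeroed_frame` are hypothetical, chosen only for illustration. The first re-zeroes a buffer that was allocated once at startup; the second requests a fresh zeroed buffer with `calloc`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: re-zero an existing buffer in place.
 * This is the bulk equivalent of looping over the elements and
 * assigning 0 to each one. */
static void zero_frame(uint64_t *frame, size_t n)
{
    memset(frame, 0, n * sizeof *frame);
}

/* Hypothetical helper: allocate a new buffer whose elements are all zero.
 * Returns NULL on allocation failure; the caller must free() the result. */
static uint64_t *new_zeroed_frame(size_t n)
{
    return calloc(n, sizeof(uint64_t));
}
```

In Cython, `memset` can be cimported from `libc.string`, and `calloc` and `free` from `libc.stdlib`. For a non-empty C-contiguous memoryview `arr`, `&arr[0]` gives the pointer to pass to `memset`. Which variant is faster for a given frame size is something to measure rather than assume.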

0 Answers