Going through some source code in scikit-learn, I noticed the following type declarations in tree.pxd:

import numpy as np
cimport numpy as np

ctypedef np.npy_float32 DTYPE_t          # Type of X
ctypedef np.npy_float64 DOUBLE_t         # Type of y, sample_weight
ctypedef np.npy_intp SIZE_t              # Type for indices and counters
ctypedef np.npy_int32 INT32_t            # Signed 32 bit integer
ctypedef np.npy_uint32 UINT32_t          # Unsigned 32 bit integer

I know there is some discussion in the Cython docs here about the difference between C types and Cython types, but these seem to be types from numpy, and they aren't mentioned in the documentation.

I'm confused about what types I should be using. For indexes, should I be using SIZE_t as defined above, or unsigned int? Is it really necessary for these ctypedefs to exist?

hlin117
  • FYI: http://stackoverflow.com/questions/20987390/cython-why-when-is-it-preferable-to-use-py-ssize-t-for-indexing – Warren Weckesser Nov 17 '15 at 05:03
  • Thanks @WarrenWeckesser. Your post led me to this `__init__.pxd` file: https://github.com/cython/cython/blob/master/Cython/Includes/numpy/__init__.pxd#L325 – hlin117 Nov 17 '15 at 05:08

1 Answer

According to this `__init__.pxd` file for Cython's numpy, npy_uint32 is declared as unsigned int, so the two appear to be exactly the same type.
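One way to sanity-check that equivalence on a given machine is to compare the widths of the two C types. The sketch below uses Python's standard `ctypes` module rather than Cython, and it only inspects sizes; it says nothing about how numpy's headers define npy_uint32 on a particular build.

```python
import ctypes

# Width in bytes of the platform's C `unsigned int`.
unsigned_int_bytes = ctypes.sizeof(ctypes.c_uint)

# Width in bytes of a fixed-width 32-bit unsigned integer (always 4).
uint32_bytes = ctypes.sizeof(ctypes.c_uint32)

# True on platforms where `unsigned int` happens to be 32 bits wide;
# the C standard does not guarantee this.
same_width = unsigned_int_bytes == uint32_bytes

print(unsigned_int_bytes, uint32_bytes, same_width)
```

If `same_width` is ever False on a target platform, the fixed-width name is the safer choice when exactly 32 bits are required.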

On the other hand, npy_intp is the same thing as Py_intptr_t, according to this line of the file. I'm pretty sure that is an integer the size of a pointer, which makes it the natural width for array indices and for strides (the spacing between items in an array).

Based on this discussion, I think Py_intptr_t is preferred (although it is not named explicitly there) because it adapts to differences between architectures. So to be safe across architectures, npy_intp should be used for indices.
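As a rough illustration of why a pointer-sized index type tracks the architecture, the following sketch uses Python's standard `ctypes` module to compare a fixed 32-bit unsigned range with a pointer-sized signed one. `ctypes` has no `intptr_t`, so `c_ssize_t` is used as a stand-in here; that substitution is an assumption that holds on mainstream platforms, not a guarantee.

```python
import ctypes
import sys

# Size in bytes of a C pointer on this build (4 on 32-bit, 8 on 64-bit).
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)

# Stand-in for a pointer-sized signed integer
# (assumption: ssize_t has the same width as intptr_t on this platform).
index_bytes = ctypes.sizeof(ctypes.c_ssize_t)

# Largest value each candidate index type can represent.
max_uint32_index = 2**32 - 1
max_pointer_sized_index = 2**(8 * index_bytes - 1) - 1

print(pointer_bytes, index_bytes)
print(max_uint32_index, max_pointer_sized_index)

# sys.maxsize is the largest value a Py_ssize_t can hold in this interpreter.
print(max_pointer_sized_index == sys.maxsize)
```

On a 64-bit build the pointer-sized range is far larger than the 32-bit one, while on a 32-bit build it shrinks to match what the machine can address; a fixed 32-bit type does neither.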

hlin117