1

Hello I'm really beginner to Cython or C-based language. I have a problem to get a square of a vector.

I have a vector(each value is double type):

x = [1, 4, 9]

and I want to get:

y = [1, 2, 3]

How can I get this vector? A solution I thought is:

cdef floating[::1] y = x
for i in range(length):
    y[i] = x[i] ** 0.5

But in this way it's too slow. I want to acclerate this. Can I use sqrt or square function from libc.math in this case?


Edit:

If I want to get a vector like 1/3 root (like [1, 8, 27] -> [1,2,3]) what function should I use instead of sqrt?

steamhyeon
  • 11
  • 2
  • 1
    Note that `x[i]**0.5` has python interaction, so it's slow inside a loop. Instead, use the C function `sqrt` from libc.math, as you already mentioned. It's also highly recommended to type the index variable, i.e. `cdef int i`. – joni Apr 13 '22 at 11:07
  • @joni I believe `x[i]**0.5` should end up calling libc `pow` so should avoid Python interactions (assuming `x` and `i` are correctly typed so that the indexing works quickly). libc `sqrt` would probably be slightly better though. It should probably be `y[i] = y[i]**0.5` though (notice `y[i]` on the right-hand side) – DavidW Apr 13 '22 at 13:20

1 Answers1

0

Quick win

First you should check if your function is already implemented in Numpy. If so, it will probably be a very fast (C/C++) implementation.

This is the case for your function:

import numpy as np
x = np.array([1, 4, 9])
y = np.sqrt(x)
#> array([ 1, 2, 3])

With Numpy arrays

Alternatively (following @joni's comment), you can use np arrays just for the input/output and compute the function element-wise using C/C++:

cimport numpy as cnp
import numpy as np
from libc.math cimport sqrt

cpdef cnp.ndarray[double, ndim=1] cy_sqrt_np(cnp.ndarray[double, ndim=1] x):
   cdef Py_ssize_t i, l=x.shape[0]
   cdef np.ndarray[double, 1] y = np.empty(l)
   for i in range(l):
      y[i] = sqrt(x[i])
   return y

With C++ vectors

Lastly, here is a possible implementation with C++ vectors and automatic conversion from/to python lists:

from libc.math cimport sqrt
from libcpp.vector cimport vector

cpdef vector[double] cy_sqrt_vec(vector[double] x):
    cdef Py_ssize_t i, l = x.size()
    cdef vector[double] y
    y.reserve(l)
    for i in range(l):
        y.push_back(sqrt(x[i]))
    return y

Some things to keep in mind in this case and the previous:

  • We initialize the y vector to be empty, and then allocate space for it with reserve(). According to SO this seems to be a good option.
  • We use a typed i in the for loop, and use push_back to assign new values.
  • We use sqrt from libc.math to avoid using Python code inside the loop.
  • We type the input of the function to be vector[double]. This automatically adds convenient type conversions from other python types (e.g., list of ints).

Time comparison

We define a random input x to avoid cached results polluting our measures:

%%timeit  -n 10000 -r 7 x = gen_x()
y = np.sqrt(x)
#> executed in 177ms, finished 16:16:57 2022-04-19
#> 2.3 µs ± 241 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit -n 10000 -r 7 x = gen_x()
y = x**.5
#> executed in 194ms, finished 16:16:51 2022-04-19
#> 2.46 µs ± 256 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit  -n 10000 -r 7 x = gen_x()
y = cy_sqrt(x)
executed in 359ms, finished 16:17:02 2022-04-19
4.9 µs ± 274 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit  -n 10000 -r 7 x = list(gen_x())
y = cy_sqrt_vec(x)
executed in 2.85s, finished 16:17:11 2022-04-19
40.4 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As expected, the np.sqrt version wins. Besides, the vector allocation looks comparatively slower.

ibarrond
  • 6,617
  • 4
  • 26
  • 45
  • If you preallocate the vector's space, it's safe to use direct assignment `y[i] = sqrt(x[i])`. This should be faster than .push_back(). – joni Apr 14 '22 at 15:03
  • @joni do you know how to do that in Cython? I checked several StackOverflow questions & [vector.pxd](https://github.com/cython/cython/blob/master/Cython/Includes/libcpp/vector.pxd) but didn't find any satisfying answer. – ibarrond Apr 14 '22 at 15:25
  • Still, thanks to the `reserve` operation, the `push_back` doesn't need to reallocate memory. – ibarrond Apr 14 '22 at 15:32
  • Yeah, but `push_back()` checks whether it needs to reallocate memory whenever it's called, so it's definitely slower than a direct assignment. It's just `y[i] = sqrt(x[i])` since Cython wraps the vectors `operator[]`-method. That being said, I don't think it's needed to use a C++ vector here. A simple memoryview of a numpy ndarray should be enough. In addition, returning a np.ndarray instead of a python list is more convenient, IMO. – joni Apr 14 '22 at 16:10
  • The function isn't acclerated. Is there no function to calculate line by line? (not value by value) – steamhyeon Apr 15 '22 at 06:33
  • The trouble with the vector version is that you have a type conversion to/from a Numpy list for both the argument and the return value. – DavidW Apr 19 '22 at 16:32
  • @DavidW I know (and that's why it is so slow), yet it is the only way I can think of using only Python lists (I guess you meant *Python* and not *Numpy* lists), in case the user wanted to use only the Python standard library,. – ibarrond Apr 19 '22 at 18:16