How to apply sqrt to vector in Cython?

Question

Hello I'm really beginner to Cython or C-based language. I have a problem to get a square of a vector.

I have a vector(each value is double type):

x = [1, 4, 9]

and I want to get:

y = [1, 2, 3]

How can I get this vector? A solution I thought is:

cdef floating[::1] y = x
for i in range(length):
    y[i] = x[i] ** 0.5

But in this way it's too slow. I want to acclerate this. Can I use sqrt or square function from libc.math in this case?

Edit:

If I want to get a vector like 1/3 root (like [1, 8, 27] -> [1,2,3]) what function should I use instead of sqrt?

Note that `x[i]**0.5` has python interaction, so it's slow inside a loop. Instead, use the C function `sqrt` from libc.math, as you already mentioned. It's also highly recommended to type the index variable, i.e. `cdef int i`. — joni, Apr 13 '22 at 11:07
@joni I believe `x[i]**0.5` should end up calling libc `pow` so should avoid Python interactions (assuming `x` and `i` are correctly typed so that the indexing works quickly). libc `sqrt` would probably be slightly better though. It should probably be `y[i] = y[i]**0.5` though (notice `y[i]` on the right-hand side) — DavidW, Apr 13 '22 at 13:20

ibarrond · Answer 1 · 2022-04-19T14:31:54.853

Quick win

First you should check if your function is already implemented in Numpy. If so, it will probably be a very fast (C/C++) implementation.

This is the case for your function:

import numpy as np
x = np.array([1, 4, 9])
y = np.sqrt(x)
#> array([ 1, 2, 3])

With Numpy arrays

Alternatively (following @joni's comment), you can use np arrays just for the input/output and compute the function element-wise using C/C++:

cimport numpy as cnp
import numpy as np
from libc.math cimport sqrt

cpdef cnp.ndarray[double, ndim=1] cy_sqrt_np(cnp.ndarray[double, ndim=1] x):
   cdef Py_ssize_t i, l=x.shape[0]
   cdef np.ndarray[double, 1] y = np.empty(l)
   for i in range(l):
      y[i] = sqrt(x[i])
   return y

With C++ vectors

Lastly, here is a possible implementation with C++ vectors and automatic conversion from/to python lists:

from libc.math cimport sqrt
from libcpp.vector cimport vector

cpdef vector[double] cy_sqrt_vec(vector[double] x):
    cdef Py_ssize_t i, l = x.size()
    cdef vector[double] y
    y.reserve(l)
    for i in range(l):
        y.push_back(sqrt(x[i]))
    return y

Some things to keep in mind in this case and the previous:

We initialize the y vector to be empty, and then allocate space for it with reserve(). According to SO this seems to be a good option.
We use a typed i in the for loop, and use push_back to assign new values.
We use sqrt from libc.math to avoid using Python code inside the loop.
We type the input of the function to be vector[double]. This automatically adds convenient type conversions from other python types (e.g., list of ints).

Time comparison

We define a random input x to avoid cached results polluting our measures:

%%timeit  -n 10000 -r 7 x = gen_x()
y = np.sqrt(x)
#> executed in 177ms, finished 16:16:57 2022-04-19
#> 2.3 µs ± 241 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit -n 10000 -r 7 x = gen_x()
y = x**.5
#> executed in 194ms, finished 16:16:51 2022-04-19
#> 2.46 µs ± 256 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit  -n 10000 -r 7 x = gen_x()
y = cy_sqrt(x)
executed in 359ms, finished 16:17:02 2022-04-19
4.9 µs ± 274 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit  -n 10000 -r 7 x = list(gen_x())
y = cy_sqrt_vec(x)
executed in 2.85s, finished 16:17:11 2022-04-19
40.4 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As expected, the np.sqrt version wins. Besides, the vector allocation looks comparatively slower.

If you preallocate the vector's space, it's safe to use direct assignment `y[i] = sqrt(x[i])`. This should be faster than .push_back(). — joni, Apr 14 '22 at 15:03
@joni do you know how to do that in Cython? I checked several StackOverflow questions & [vector.pxd](https://github.com/cython/cython/blob/master/Cython/Includes/libcpp/vector.pxd) but didn't find any satisfying answer. — ibarrond, Apr 14 '22 at 15:25
Still, thanks to the `reserve` operation, the `push_back` doesn't need to reallocate memory. — ibarrond, Apr 14 '22 at 15:32
Yeah, but `push_back()` checks whether it needs to reallocate memory whenever it's called, so it's definitely slower than a direct assignment. It's just `y[i] = sqrt(x[i])` since Cython wraps the vectors `operator[]`-method. That being said, I don't think it's needed to use a C++ vector here. A simple memoryview of a numpy ndarray should be enough. In addition, returning a np.ndarray instead of a python list is more convenient, IMO. — joni, Apr 14 '22 at 16:10
The function isn't acclerated. Is there no function to calculate line by line? (not value by value) — steamhyeon, Apr 15 '22 at 06:33
The trouble with the vector version is that you have a type conversion to/from a Numpy list for both the argument and the return value. — DavidW, Apr 19 '22 at 16:32
@DavidW I know (and that's why it is so slow), yet it is the only way I can think of using only Python lists (I guess you meant *Python* and not *Numpy* lists), in case the user wanted to use only the Python standard library,. — ibarrond, Apr 19 '22 at 18:16

How to apply sqrt to vector in Cython?

1 Answers1

Quick win

With Numpy arrays

With C++ vectors

Time comparison