18

I need to return the sin and cos values of every element in a large array. At the moment I am doing:

a,b=np.sin(x),np.cos(x)

where x is some large array. I need to keep the sign information for each result, so:

a=np.sin(x)
b=(1-a**2)**0.5

is not an option. Is there any faster way to return both sin and cos at once?

rylirk
  • 179
  • 2
  • 6
  • 1
    *sin* 90-*x* = *cos x* – tripleee Sep 04 '15 at 11:52
  • Did I understand your question correctly? Basically you are asking: If I already calculated `np.sin(x)`, can I use this information to get `cos(x)` faster than evaluating `np.cos(x)`? – cel Sep 04 '15 at 11:57
  • 12
    The OP is obliquely referring to the fact that some math libraries (and math hardware) have a [sincos](http://linux.die.net/man/3/sincos) function that simultaneously returns both the sin & cos of a given argument. So it's not unreasonable to wonder if numpy can do that, IMO. – PM 2Ring Sep 04 '15 at 11:59
  • 1
    you could use the tan(x) and retrieve cos(x) ans sin(x) using the common transformation function. But I don't know if it is faster, you should try it.... – vathek Sep 04 '15 at 12:00
  • 5
    It appears that numpy doesn't currently have a sincos function. See [Implement sincos()](https://github.com/numpy/numpy/issues/2626) – PM 2Ring Sep 04 '15 at 12:03
  • I don't know numpy. But if it does complex exponentiation you could use exp(it) = cos(t) + i.sin(t) – PM 2Ring Sep 04 '15 at 12:30
  • @cel yes, that is correct. – rylirk Sep 04 '15 at 12:32
  • @vathek I will try that, thanks! – rylirk Sep 04 '15 at 12:32
  • @vathek your method is indeed (slightly) faster! – rylirk Sep 04 '15 at 12:37
  • @PM2Ring complex exponentiation is unfortunately slower than just using sin and cos – rylirk Sep 04 '15 at 12:38
  • @rlink so add it as answer – vathek Sep 04 '15 at 12:40
  • @rlink another way could involve the precision you need and the data you have. If your data are (for ex.) just an array of N elements between 0 and 360, you could avoid to calculate sin and cos by mapping known values (sorry for my english) – vathek Sep 04 '15 at 14:51
  • I implemented a sincos() function in numpy using the function with the same name from the glibc library. So with sincos(x) I obtain two dimensional array as result - sin(x) in the first dimension and cos(x) in the second. Unfortunately this is not faster than calling np.sin(x) and np.cos(x) separately. I have no explanation for now. May be I should look in the glibc implementation of sincos() function... – Andrei Boyanov Sep 05 '15 at 18:32

7 Answers7

8

I compared the suggested solution with perfplot and found that nothing beats calling sin and cos explicitly.

enter image description here

Code to reproduce the plot:

import perfplot
import numpy as np


def sin_cos(x):
    return np.sin(x), np.cos(x)


def exp_ix(x):
    eix = np.exp(1j * x)
    return eix.imag, eix.real


def cos_from_sin(x):
    sin = np.sin(x)
    abs_cos = np.sqrt(1 - sin**2)
    sgn_cos = np.sign(((x - np.pi / 2) % (2 * np.pi)) - np.pi)
    cos = abs_cos * sgn_cos
    return sin, cos


b = perfplot.bench(
    setup=lambda n: np.linspace(0.0, 2 * np.pi, n),
    kernels=[sin_cos, exp_ix, cos_from_sin],
    n_range=[2**k for k in range(20)],
    xlabel="n",
)
b.save("out.png")
b.show()
Nico Schlömer
  • 53,797
  • 27
  • 201
  • 249
  • There are SIMD implementations of `sincos` that are way faster than calling sin and cos separately, however: http://gruntthepeon.free.fr/ssemath/. I note that this source says "The sincos_ps is nice because you get magically a free sine for each cosine you compute, so it is almost as fast as the sin_ps and the cos_ps." That means it would be nearly twice as fast. I hope numpy will implement something like this soon... – Daniel Aug 06 '22 at 00:43
4

You can use complex numbers and the fact that e i · φ = cos(φ) + i · sin(φ).

import numpy as np
from cmath import rect
nprect = np.vectorize(rect)

x = np.arange(2 * np.pi, step=0.01)

c = nprect(1, x)
a, b = c.imag, c.real

I'm using here the trick from https://stackoverflow.com/a/27788291/674064 to make a version of cmath.rect() that'll accept and return NumPy arrays.

This doesn't gain any speedup on my machine, though:

c = nprect(1, x)
a, b = c.imag, c.real

takes about three times the time (160μs) that

a, b = np.sin(x), np.cos(x)

took in my measurement (50.4μs).

Community
  • 1
  • 1
das-g
  • 9,718
  • 4
  • 38
  • 80
1

A pure numpy version via complex numbers, e = cosφ + i sinφ, inspired by the answer from das-g.

x = np.arange(2 * np.pi, step=0.01)

eix = np.exp(1j*x)
cosx, sinx = eix.real, eix.imag

This is faster than the nprect, but still slower than sin and cos calls:

In [6]: timeit c = nprect(1, x); cosx, sinx = cos(x), sin(x)
1000 loops, best of 3: 242 us per loop

In [7]: timeit eix = np.exp(1j*x); cosx, sinx = eix.real, eix.imag
10000 loops, best of 3: 49.1 us per loop

In [8]: timeit cosx, sinx = cos(x), sin(x)
10000 loops, best of 3: 32.7 us per loop
Community
  • 1
  • 1
Friedrich
  • 21
  • 1
1

For completeness, another way to combine this down to a single cos() call is to prepare an angle array where the second half has a phase shift of pi/2.

Borrowing the profiling code from Nico Schlömer, we get:

import perfplot
import numpy as np


def sin_cos(x):
    return np.sin(x), np.cos(x)


def exp_ix(x):
    eix = np.exp(1j * x)
    return eix.imag, eix.real


def cos_shift(x):
    angles = x[np.newaxis, :] + np.array(((-np.pi/2,), (0,)))
    return tuple(np.cos(angles))


perfplot.save(
    "out.png",
    setup=lambda n: np.linspace(0.0, 2 * np.pi, n),
    kernels=[sin_cos, exp_ix, cos_shift],
    n_range=[2 ** k for k in range(1, 16)],
    xlabel="n",
)

perfplot

So it's slower than the separate sin/cos calls, but in some (narrow) contexts might be more convenient because - from the cos() onward - it only needs to deal with a single array.

Reinderien
  • 11,755
  • 5
  • 49
  • 77
0

Using Numba you could improve speed by ~20%, if that matters to you.

import numba
import numpy as np

x = np.random.uniform((1000,))

@numba.njit
def sincos_simple(x):
    return np.sin(x), np.cos(x)

@numba.njit
def sincos_prealloc(x):
    r = np.empty(x.shape + (2,))
    r[..., 0] = np.sin(x)
    r[..., 1] = np.cos(x)
    return r

# compile numba function (run once before timing)
sincos_simple(x)
sincos_prealloc(x)

%timeit np.sin(x), np.cos(x)
%timeit sincos_simple(x)
%timeit sincos_prealloc(x)

Results:

1.02 µs ± 16.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
927 ns ± 13.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
819 ns ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
Yuval
  • 3,207
  • 32
  • 45
-1

You could take advantage by the fact that tan(x) contains both sin(x) and cos(x) function. So you could use the tan(x) and retrieve cos(x) ans sin(x) using the common transformation function.

vathek
  • 531
  • 2
  • 9
  • 1
    How do you get the correct sign information? Do you just check which quadrant x is in? – PM 2Ring Sep 04 '15 at 13:09
  • Very good point, I'd overlooked that this also loses sign information... if I am having to retrieve the quadrant of the angle, I can just use the alternative method in my original post. The problem remains; how to quickly determine the quadrant of all elements in an array? – rylirk Sep 04 '15 at 14:28
  • retrieving the sign do lost the advantage of this tecnique, I suppose – vathek Sep 04 '15 at 14:46
-1
def cosfromsin(x,sinx):
   cosx=absolute((1-sinx**2)**0.5)
   signx=sign(((x-pi/2)%(2*pi))-pi)
   return cosx*signx

a=sin(x)
b=cosfromsin(x,a)

I've just timed this and it is about 25% faster than using sin and cos.

rylirk
  • 179
  • 2
  • 6
  • What did you actually time? And how big was the array `x` when you timed it? When I precompute `sinx` and compare the timing of `cos(x)` and `cosfromsin(x, sinx)`, `cosfromsinx` is slower. – Warren Weckesser Sep 04 '15 at 16:34
  • That is what I did as well. The object I passed to cos and cosfrom sin was a 2-dimensional numpy array with dimensions of roughly 2000 * 1000 – rylirk Sep 07 '15 at 12:25
  • This `cosfromsin(x)`, with `sin(x)` given, is slower ~4x than `cos(x)` here too for such array sizes (and even worse for small). Note: the `absolute` can be dropped. `signcos = (np.int_((x - pi_2) // pi) & 1) * 2 - 1` speeds up somewhat but still won't beat. – kxr Aug 02 '16 at 15:31
  • This is actually rather slow. – Nico Schlömer Jun 02 '20 at 20:28