3

I have cython code I'm using to speed up a bottleneck in an otherwise pure python calculation. I have a pair of points in a periodic box of length Lbox (1d case is fine for this question). I need to compute the sign of y-x, and I need to flip this sign when the periodic boundary conditions are operative.

In the absence of PBCs, the following question provides the solution: Is there a standard sign function (signum, sgn) in C/C++?. That solution is what I use to compute the sgn function.

def sgn(x, y):
    """ If x > y, returns 1. 
    If x < y, returns -1. 
    If x == y, returns 0. 
    """
    return (y < x) - (x < y)

The sgn_pbc function as written below returns the correct result, but is written inefficiently: the control flow within the sgn_pbc is the culprit for slowing down the PBC version. How can I write sgn_pbc in an analogous way to the sgn function, so that I avoid the clumsy control flow?

def sgn_pbc(x, y, Lbox):
    d = abs(x-y)
    if d <= Lbox/2.:
        return sgn(x, y)
    else:
        return -1*sgn(x, y)
Community
  • 1
  • 1
aph
  • 1,765
  • 2
  • 19
  • 34
  • In `sgn`, why aren't you using `np.sign(y-x)`? – tom10 Jun 26 '15 at 15:39
  • For a pair of floats, the sgn function above is about 400% faster than np.sign – aph Jun 26 '15 at 16:01
  • 1
    You tagged this with `numpy`. Often the best way to speed things up is to vectorize the code, which is what numpy is all about. Are you interested in vectorized solutions, where `x` and `y` are float *arrays*, or do you want to stick with "pairs of floats"? – tom10 Jun 26 '15 at 16:17
  • Are these `python` or `cython` functions? I see `def` but not `cdef`? – hpaulj Jun 26 '15 at 23:38

1 Answers1

1

First,

-1*sgn(x, y) == sgn(y, x)

then,

def sgn_pbc(x, y, Lbox):
    d = abs(x-y)
    if d <= Lbox/2.:
        return sgn(x, y)
    else:
        return sgn(y, x)

Also in Python, function calls are the most expensive operations. You can inline your sgn.

def sgn_pbc(x, y, Lbox):
    d = abs(x-y)
    if d <= Lbox/2.:
        return (y < x) - (x < y)
    else:
        return (x < y) - (y < x)

But then the if can be (mostly) rewritten as:

def sgn_pbc(x, y, Lbox):
    d = abs(x-y)
    w = sgn(Lbox/2., d)
    return  w * sgn(x, y)

And again, inlining the sgn,

def sgn_pbc(x, y, Lbox):
    d = abs(x-y)
    w = sgn(Lbox/2., d)
    return  w * (y < x) - (x < y)

I say mostly because the case where d == Lbox/2. this returns a wrong value.

Haven't timed it, though.

vz0
  • 32,345
  • 7
  • 44
  • 77
  • This was precisely the point where I got stuck: this approach would still require a control flow to correctly handle cases where d==Lbox/2, so there is no speedup at all. – aph Jun 26 '15 at 16:56
  • Are those floating point values? Because you can add +eps to have a very low probability (1/2**53) of that equality being true. – vz0 Jun 26 '15 at 17:05
  • Unless I misunderstand something, no matter what epsilon is added, the returned result will be incorrect when the distance exactly equals the Lbox/2+epsilon. – aph Jun 26 '15 at 17:13
  • Yes. How probable is that? – vz0 Jun 26 '15 at 17:19