
I am playing around with using numpy to simulate N-dimensional space. Note that I'm not seriously trying to compete with existing software on efficiency; I'm mostly just looking to learn something here.

That said, I'm still curious about the best way to design this algorithm.

Spatial simulation tends to require quite a lot of normalization calculations.

So, let's suppose that, to process 1 second of simulation, the computer needs to do 100 normalization calculations.

Numpy is capable of normalizing a large number of vectors at once, and I am guessing that it would be much faster to run one calculation over 100 norms than to run 100 calculations of 1 norm each.

Would it make sense to keep a global list of "vectors to normalize", and then process them all at once at the end of each second of simulation? Or are the benefits of that approach not really significant?

I am guessing that this depends on exactly how much overhead is associated with running the calculations. Am I on the right track here?
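For concreteness, here is a rough sketch of the kind of deferred-normalization queue I have in mind (the class and method names are just made up for illustration, not from any existing library):

import numpy as np

class NormalizeQueue:
    """Collect vectors during a simulation step, normalize them in one batch."""

    def __init__(self):
        self._pending = []          # vectors queued during the current step

    def enqueue(self, vec):
        self._pending.append(np.asarray(vec, dtype=np.float64))

    def flush(self):
        """Normalize every queued vector with a single vectorized call."""
        if not self._pending:
            return np.empty((0, 0))                           # nothing queued
        batch = np.stack(self._pending)                       # shape (n, dim)
        norms = np.linalg.norm(batch, axis=1, keepdims=True)  # shape (n, 1)
        self._pending.clear()
        return batch / norms                                  # broadcasts row-wise

# Usage: queue vectors throughout one second of simulation, then flush once.
q = NormalizeQueue()
for v in np.random.random((100, 3)):
    q.enqueue(v)
unit_vectors = q.flush()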

trevorKirkby

1 Answer


Before running any timing tests (which you should definitely do yourself), my expectation is that it would be faster to accumulate all vectors into a larger array and then process them with a single call to numpy's norm function. I suspect that the time spent assigning these vectors to the corresponding rows of the accumulator array is smaller than the per-call overhead of numpy's norm. That is only a "gut feeling", though, so let's check it with a timing test.

Timing Tests:

In [1]: import numpy as np

In [2]: def f1(vectors):
   ...:     vectors = np.asarray(vectors, dtype=np.float64)
   ...:     acc = np.empty_like(vectors)
   ...:     for k in range(len(vectors)): # just force standard for loop
   ...:         acc[k, :] = vectors[k]
   ...:     return np.linalg.norm(acc, axis=1)
   ...: 

In [3]: def f2(vectors):
   ...:     vectors = np.asarray(vectors, dtype=np.float64)
   ...:     norm = np.empty(len(vectors), dtype=np.float64)
   ...:     for k in range(len(vectors)): # just force standard for loop
   ...:         norm[k] = np.linalg.norm(vectors[k])
   ...:     return norm
   ...: 

In [4]: v = np.random.random((1000, 3))

In [5]: %timeit f1(v)
953 µs ± 16.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [6]: %timeit f2(v)
4.02 ms ± 89.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So this particular test confirms my expectation: accumulating all vectors into a larger matrix and running numpy.linalg.norm() on all of them at once is more efficient (by a factor of about 4x for this particular example and number of vectors).
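As a side note, if your end goal is the normalized vectors rather than just their norms, the same batched call extends naturally: passing keepdims=True makes the division broadcast row-wise. A minimal sketch (variable names are just illustrative):

import numpy as np

vectors = np.random.random((1000, 3))

# One vectorized call: keepdims=True gives norms of shape (1000, 1),
# so the division broadcasts over the (1000, 3) array row by row.
norms = np.linalg.norm(vectors, axis=1, keepdims=True)
unit_vectors = vectors / norms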

AGN Gazer
  • Hmmm I will do some further tests, but this seems to be what I am looking for, thank you. I wasn't really sure whether the approach I was considering was reasonable or not. For other people reading this though, I will point out that linalg.norm probably shouldn't be used if speed is the objective. This question shows a number of faster alternatives: https://stackoverflow.com/questions/2850743/numpy-how-to-quickly-normalize-many-vectors – trevorKirkby Oct 29 '17 at 18:12
  • @someone-or-other Thank you for accepting my answer. Definitely one can normalize vectors in many ways - I just picked one for illustration. I highly recommend that you run some benchmarks for your particular python/numpy versions as performance of various methods of computing norm may change with version. And so the conclusion of my little experiment may change if you use a different method of computing norm. For example if a different method has a very small call overhead, you will not see a factor of 4 performance improvement but something smaller. – AGN Gazer Oct 30 '17 at 04:35
  • I certainly need to run more tests, yes. I guess I was just asking if the approach I was proposing was sensible, or if there was something basic that I was missing. – trevorKirkby Oct 30 '17 at 18:51