10

If b is a 2x2 np.ndarray and the following assignment is performed, what does numpy do in the background? Does it first convert the list [100, 100] to a numpy array, or does it use the list [100, 100] directly to fill in the values of the row b[1, :]?

 b[1,:] = [100,100]

Where in the documentation can I find more information about this?

methane
  • `numpy` is open source, if you are interested in how something is *implemented* just look at the sources. – Bakuriu May 13 '14 at 18:26
  • The answer is "it depends". Look in `core/src/multiarray/sequence.c` in the Numpy distribution (`array_assign_slice`: https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/sequence.c#L91). In your example, I think the code will not convert the list to an array, but depending on other cases, it may. – Alok Singhal May 13 '14 at 18:30
  • @goncalopp Yes, I am concerned about memory usage. – methane May 13 '14 at 20:03
  • @Bakuriu I am well aware of that, but I am asking on Stack Overflow on the off chance that someone has already looked into this. Otherwise, any Python-related question could be answered by just looking at the source, so perhaps you think there is no need for a python tag here? – methane May 13 '14 at 20:18
  • @AlokSinghal Thanks! Do you know of any cases/seen examples where that could be the case? – methane May 13 '14 at 20:19
  • @methane I don't know -- you can look into the code but from a quick reading, it seems that a numpy array is created whenever destination dimensions are greater than the source dimensions: `a = numpy.empty((2,3)); a[:] = range(3)` for example. I could be wrong. – Alok Singhal May 13 '14 at 22:56
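The case mentioned in the last comment can be tried directly. The following is a minimal sketch that only shows the observable result of assigning a 1-D sequence to a 2-D destination. It does not reveal whether numpy builds a temporary array internally, and the explicit `np.asarray` call is added here purely for comparison.

```python
import numpy as np

# Destination has 2 dimensions, source has only 1
a = np.empty((2, 3))
a[:] = range(3)          # the 3 values are broadcast to every row
print(a)

# For comparison: convert the source explicitly before assigning
b = np.empty((2, 3))
b[:] = np.asarray(range(3), dtype=b.dtype)

# Both routes leave the same values in the destination
print(np.array_equal(a, b))
```

Both assignments fill each row with 0, 1, 2, so from the caller's side the two forms are interchangeable.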

2 Answers

3

To evaluate the speed of execution, we will use the timeit library.

import timeit
import numpy as np

# Shared setup: a 1x100 float64 array and a list of 100 Python integers
setup = """
import numpy as np
tmp = np.empty(shape=(1, 100))
values = list(range(100))
"""

# Slice assignment: numpy fills the row in compiled code
stmt1 = """tmp[0, :] = values"""
# Element-by-element assignment in an interpreted Python loop
stmt2 = """
for i, val in enumerate(values):
    tmp[0, i] = val
"""

time1 = timeit.Timer(setup=setup, stmt=stmt1)
time2 = timeit.Timer(setup=setup, stmt=stmt2)

print("numpy way :", time1.timeit(number=100000))
print("Python way:", time2.timeit(number=100000))

If you run this, you will see that the numpy assignment is about twice as fast:

- numpy way : 0.97758197784423828
- Python way: 2.1633858680725098

The gain is only about a factor of two because the numpy assignment includes a phase in which the integers in values (arbitrary-precision Python integers) are converted into 64-bit floats. To compare only the speed of the loops, the type conversion can be done beforehand in the setup:

values = np.array(range(100), dtype=np.float64)

Here is what I obtained:

numpy way : 0.131125926971
Python way: 2.64055013657

With the conversion done beforehand, the numpy assignment is about 20 times faster than the Python loop.

You will find more information if you search for vectorized computations in Python.
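The type conversion discussed in this answer can be observed without any timing. The sketch below assumes the same `tmp` and `values` as in the benchmark setup. The name `converted` is introduced here only for illustration.

```python
import numpy as np

tmp = np.empty(shape=(1, 100))   # float64 is the default dtype
values = list(range(100))        # plain Python integers

tmp[0, :] = values               # values are converted on assignment

print(tmp.dtype)                 # float64
print(type(values[0]))           # the list itself is left unchanged

# Converting once up front yields the same row contents
converted = np.asarray(values, dtype=np.float64)
print(np.array_equal(tmp[0], converted))
```

The destination keeps its dtype and the source list is not modified, which is consistent with a conversion taking place somewhere during the assignment.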

Taha
-2

b[1,:] = [100,100] gives the same result as

b[1,0] = 100
b[1,1] = 100

It is, however, faster to execute because it uses compiled loops. (The second form has to convert each value to the ndarray dtype separately before assigning it.)
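The claimed equivalence of the two forms can be checked with a short sketch. It compares only the resulting array contents and says nothing about how numpy handles the list internally. The second array `c` is a name added here for the comparison.

```python
import numpy as np

# Slice assignment from a list
b = np.zeros((2, 2))
b[1, :] = [100, 100]

# Element-by-element assignment
c = np.zeros((2, 2))
c[1, 0] = 100
c[1, 1] = 100

print(np.array_equal(b, c))  # True
print(b)
```

In both cases only the row with index 1 is changed, and the row with index 0 keeps its zeros.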

Taha
  • Is this in the documentation somewhere? If so, could you please provide a link to where this is documented? Thanks – DJG May 21 '14 at 14:53
  • Yes, please let me know if it's somewhere in the documentation. What is a compiled loop? – methane May 23 '14 at 01:37
  • Hello, I am talking about the difference between interpreted language and compiled language. It is known that Python can be executed command by command, unlike C or Java that are compiled. It is also known that numpy is a sort of interface that works with Python but runs as a compiled program. An example about loops performance is given in the answer that follows. – Taha May 23 '14 at 15:53