6

I came across this problem by Jake VanderPlas and I am not sure if my understanding of why the result differs after importing the numpy module is entirely correct.

>>> print(sum(range(5), -1))
9
>>> from numpy import *
>>> print(sum(range(5), -1))
10

It seems like in the first scenario the sum function calculates the sum over the iterable and then subtracts the second args value from the sum.

In the second scenario, after importing numpy, the behavior of the function seems to have modified as the second arg is used to specify the axis along which the sum should be performed.

Exercise number (24) Source - http://www.labri.fr/perso/nrougier/teaching/numpy.100/index.html

aamir23
  • 1
    What exactly is your question? – Michael Lorton Sep 17 '16 at 22:54
  • 14
    ...and that's why you never do import * – YXD Sep 17 '16 at 23:00
  • 1
    The point of the exercise is that `np.sum` is different from the builtin `sum` - so a `*` can be dangerous if you aren't careful. – hpaulj Sep 18 '16 at 01:32
  • 1
    @Malvolio I was seeking clarity about this behavior. I inferred my reasons from the documentation for the two functions and their parameters, but wanted to know if I was correct or not. – aamir23 Sep 18 '16 at 22:29

2 Answers

11

"the behavior of the function seems to have modified as the second arg is used to specify the axis along which the sum should be performed."

You have basically answered your own question!

It is not technically correct to say that the behavior of the function has been modified. `from numpy import *` "shadows" the builtin `sum` function with the numpy `sum` function, so when you use the name `sum`, Python finds the numpy version instead of the builtin version (see @godaygo's answer for more details). These are different functions with different arguments. It is generally a bad idea to use `from somelib import *`, for exactly this reason. Instead, use `import numpy as np`, then use `np.sum` when you want the numpy function and plain `sum` when you want the Python builtin.
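To make the difference concrete, here is a minimal sketch, assuming NumPy is installed and imported under the conventional `np` alias. It calls both functions with the same arguments as in the question; the variable names are only for illustration.

```python
import numpy as np

values = range(5)  # 0, 1, 2, 3, 4

# Built-in sum: the second argument is the start value.
# -1 + 0 + 1 + 2 + 3 + 4 = 9
builtin_total = sum(values, -1)

# NumPy sum: the second positional argument is the axis.
# axis=-1 is the last axis, which is the only axis of a 1-D input,
# so this is simply 0 + 1 + 2 + 3 + 4 = 10
numpy_total = np.sum(values, -1)

print(builtin_total, numpy_total)
```

With the explicit `np.` prefix there is never any doubt about which of the two functions is being called.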

Warren Weckesser
  • Ya that was what I inferred from the documentation of the two functions, and I thought it was some kind of function overloading, but wanted to be sure. Thank you for the detailed explanation. – aamir23 Sep 18 '16 at 22:33
10

Just to add my five pedantic cents to @Warren Weckesser's answer: `from numpy import *` does not overwrite the builtin `sum` function, it only shadows `__builtins__.sum`. A `from ... import *` statement binds all public names of the imported module (those listed in its `__all__`, or, if `__all__` is not defined, every name that does not begin with an underscore) in your current global namespace. According to Python's name resolution rule (unofficially, the LEGB rule), the global namespace is searched before the `__builtins__` namespace. So once Python finds the desired name, in your case `sum`, it returns the bound object and does not look further.
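The same shadowing can be reproduced without NumPy at all. In this rough sketch the locally defined `sum` is a hypothetical stand-in for a name pulled in by a star import; the point is only that the built-in is hidden, not replaced.

```python
import builtins

def sum(iterable, axis=None):
    """Hypothetical stand-in for a `sum` brought in by a star import."""
    return "shadowing sum called"

# The nearer binding of the name `sum` is found before the builtins namespace.
shadowed_result = sum([1, 2, 3])

# The built-in was never overwritten; it is still reachable explicitly.
builtin_result = builtins.sum([1, 2, 3])

print(shadowed_result)  # shadowing sum called
print(builtin_result)   # 6
```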

EDIT: To show you what is going on:

 In[1]: print(sum, ' from ', sum.__module__)    # here you see the standard `sum` function
Out[1]: <built-in function sum>  from  builtins

 In[2]: from numpy import *                     # from here it is shadowed
        print(sum, ' from ', sum.__module__)
Out[2]: <function sum at 0x00000229B30E2730>  from  numpy.core.fromnumeric

 In[3]: del sum                                 # here you restore things back
        print(sum, ' from ', sum.__module__)
Out[3]: <built-in function sum>  from  builtins

First note: `del` does not delete objects; that is the garbage collector's job. It only removes the name binding, i.e. it deletes the name from the current namespace. That is why, after `del sum`, the lookup falls through to the built-in `sum` again.
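A small sketch of that first note, using hypothetical names `data` and `alias`: deleting one name leaves the object alive as long as another name still refers to it.

```python
data = [1, 2, 3]
alias = data   # a second name bound to the same list object

del data       # removes only the name `data`, not the list itself

# The list is still reachable through the other name.
print(alias)   # [1, 2, 3]

# The deleted name can no longer be resolved.
try:
    data
    name_gone = False
except NameError:
    name_gone = True

print(name_gone)  # True
```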

Second note: the signature of the built-in `sum` function is `sum(iterable[, start])`:

Sums start and the items of an iterable from left to right and returns the total. start defaults to 0. The iterable's items are normally numbers, and the start value is not allowed to be a string.

In your case, `print(sum(range(5), -1))`, the built-in `sum` starts the summation at -1. So technically, your phrase "the sum over the iterable and then subtracts the second args value from the sum" isn't correct: the start value is where the summation begins, and it is added, not subtracted. For numbers it does not matter whether you start with the value or add it afterwards. But for lists it does (a silly example, only to show the idea):

 In[1]: sum([[1], [2], [3]], [4])
Out[1]: [4, 1, 2, 3]               # not [1, 2, 3, 4]
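One way to picture what the built-in does is a left fold that begins at `start`. The helper `sum_like` below is a hypothetical illustration, not how CPython implements `sum` (for instance, it does not reject a string `start`), but it produces the same results for the cases discussed here.

```python
from functools import reduce
import operator

def sum_like(iterable, start=0):
    """Hypothetical sketch: fold `+` over the items, beginning with `start`."""
    return reduce(operator.add, iterable, start)

# Numbers: ((((-1 + 0) + 1) + 2) + 3) + 4 = 9
print(sum_like(range(5), -1))            # 9

# Lists: [4] + [1] + [2] + [3], so the start value ends up first.
print(sum_like([[1], [2], [3]], [4]))    # [4, 1, 2, 3]
```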

Hope this will clarify your thoughts :)

Community
godaygo