
Which of the following should I use and why?

import numpy as np
a = np.zeros([2, 3])
b = np.zeros((2, 3))

There are many cases where you can pass arguments either way. I wonder whether one is more Pythonic, or whether there are other reasons to prefer one over the other.

I looked at this question, where people tried to explain the difference between a tuple and a list. That's not what I'm interested in, unless there are reasons to care that I'm not aware of, of course!

UPDATE:

Although numpy was used as an example, this pertains to Python in general. A non-numpy example is as follows:

a = max([1, 2, 3, 5, 4])
b = max((1, 2, 3, 5, 4))

I'm not editing the above because some answers use numpy in their explanation.

evan54
  • I think this is a great question for Stack Overflow. Thanks for asking it, plus one. – Russia Must Remove Putin Oct 24 '14 at 21:34
  • Unless this question is specific to `numpy` (which it doesn't appear to be), you would be better off using a generic example. – Air Oct 24 '14 at 21:35
  • Other than the generally insignificant slight overhead for preallocating for a list to possibly be extended, this seems like an excellent topic for bike-shedding. – ely Oct 24 '14 at 21:39
  • I see someone voted to close this as being opinion based; however, I have helped the asker to restate the question for objectiveness. I believe this is a great question, and I don't think it should be closed. – Russia Must Remove Putin Oct 24 '14 at 22:29
  • Sorry guys if the question was stated as opinion based. I tried to edit as best I could; thanks @AaronHall for the edits. Also, you are right, it's not numpy specific. – evan54 Oct 24 '14 at 23:25

2 Answers


I'm answering this in the context of passing a literal iterable to a constructor or function, where the type does not matter beyond that point. If you need to pass a hashable argument, you need a tuple. If the argument will be mutated, pass a list, so that you don't keep concatenating tuples to tuples and creating a new object each time.
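To illustrate the hashability point, here is a minimal sketch. The names `grid` and `point` are made up for this example; the only thing it relies on is that a tuple of hashable items can be a dictionary key while a list cannot.

```python
# Hypothetical example: coordinates used as dictionary keys.
grid = {}

point = (2, 3)
grid[point] = "occupied"       # works: a tuple of ints is hashable

try:
    grid[[2, 3]] = "occupied"  # a list is mutable, so it is unhashable
    list_key_ok = True
except TypeError:
    list_key_ok = False        # this branch is the one that runs
```

The same restriction applies to set members, so a function that stores its argument in a set or uses it as a key leaves you no choice.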

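The answer to your question is that the better option depends on the situation. Here are the tradeoffs.

Start with the list type, which is mutable. To support future extension, a list carries more overhead than a tuple (in CPython, its items live in a separately allocated array that can be over-allocated as the list grows):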

a = np.zeros([2, 3])

Pro: It's easily readable.

Con: It uses slightly more memory, and it's slightly slower to create.

Next, the tuple type, which is immutable. It doesn't need to preallocate memory for future extension, because it can't be extended.

b = np.zeros((2, 3))

Pro: It uses minimal memory, and it's more performant.

Con: It's a little less readable.
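If you want to check the memory claim on your own interpreter, a rough sketch like the following should do. It assumes CPython; `sys.getsizeof` reports only the container's own size in bytes, and the exact numbers vary by version and platform, so only the comparison is meaningful.

```python
import sys

as_list = [2, 3]
as_tuple = (2, 3)

# Byte counts differ between Python versions and platforms,
# so compare the two values rather than relying on either one.
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)

print("list: ", list_size)
print("tuple:", tuple_size)
```

For a two-element shape argument the difference is a handful of bytes, which is why this only matters at scale.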

My preference is to pass tuple literals where memory is a consideration, for example, in long-running scripts that will be used by lots of people. On the other hand, when I'm using an interactive interpreter, I prefer to pass lists because they're a bit more readable: the contrast between the square brackets and the parentheses makes for easy visual parsing.

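The performance difference only really matters inside a function that is called repeatedly. There, in CPython, returning a tuple literal is measurably cheaper than returning the equivalent list literal: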

>>> import timeit
>>> min(timeit.repeat('foo()', 'def foo(): return (0, 1)'))
0.080030765042010898
>>> min(timeit.repeat('foo()', 'def foo(): return [0, 1]'))
0.17389221549683498
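One likely reason for the gap, at least in CPython, is that a tuple written entirely from constants is stored once in the function's code object, whereas a list literal has to build a fresh object on every call. The sketch below is a way to observe that; `make_tuple` and `make_list` are names invented for the example, and the behaviour is a CPython implementation detail rather than a language guarantee.

```python
def make_tuple():
    return (0, 1)   # CPython keeps this tuple as a constant of the function


def make_list():
    return [0, 1]   # a new list object is built each time this runs


# The folded tuple shows up among the function's compiled constants.
tuple_is_constant = (0, 1) in make_tuple.__code__.co_consts

# Every call hands back that same tuple object, but a different list object.
same_tuple_each_call = make_tuple() is make_tuple()
same_list_each_call = make_list() is make_list()

print(tuple_is_constant, same_tuple_each_call, same_list_each_call)
```

Running `dis.dis` on both functions shows the same thing at the bytecode level, if you want more detail.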

Finally, note that this performance consideration will be dwarfed by others. You use Python for speed of development, not for raw execution speed, and a bad algorithm will hurt your performance far more than the choice between a list and a tuple. Python is also quite performant in many respects. I consider this important only insofar as it scales: it can keep heavily used processes from dying a death of a thousand cuts.

Russia Must Remove Putin
  • It probably comes down more to whether list primitives or tuple primitives are already the dominant usage pattern in some given context. I guess if you're designing NumPy itself, to work in a huge range of use cases, then go with tuple. But if you're building a small extension to the API that your team within some company will use, and they already have a pretty big code base that tends to assume lists in a lot of places, it's probably better to use list so people don't have to construct tuples from their lists to use your API. I think that point dwarfs the slight memory overhead of lists. – ely Oct 24 '14 at 21:41
  • @MarkRansom thanks, but if that's a consideration, it should probably be the final consideration. – Russia Must Remove Putin Oct 24 '14 at 21:43
  • @AaronHall actually in the context of the question my comment is out of place, so I take it back. The question is if you're calling a function that takes a sequence parameter, should you use a list or a tuple? If the called function was going to put the parameter in a dictionary or a set it would insist on getting a tuple and you wouldn't have a choice. – Mark Ransom Oct 24 '14 at 21:46
  • @MarkRansom Precisely so. Well said. – Russia Must Remove Putin Oct 24 '14 at 21:46

If the number of items is known at design time (e.g. coordinates, colour systems), then I would go with tuples; otherwise, go with lists.

If I am writing an interface, my code will tend to just check whether the argument is iterable or is a sequence, rather than checking for a specific type (unless the interface needs a specific type). I use the abstract base classes from the collections module (`collections.abc` in Python 3) to do my checks; it feels cleaner than checking for particular attributes.
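As a rough sketch of that kind of check, assuming Python 3.3 or later where the abstract base classes live in `collections.abc`: the function `normalise_shape` below is hypothetical, not part of numpy or any other library, and simply accepts a list or a tuple interchangeably.

```python
from collections.abc import Sequence


def normalise_shape(shape):
    """Hypothetical interface: accept any sequence, return a tuple."""
    # str and bytes are sequences too, but they make no sense as a shape.
    if isinstance(shape, (str, bytes)) or not isinstance(shape, Sequence):
        raise TypeError("shape must be a sequence such as a list or tuple")
    return tuple(shape)  # convert once so the rest of the code sees one type


from_list = normalise_shape([2, 3])
from_tuple = normalise_shape((2, 3))

try:
    normalise_shape(5)
    rejected_int = False
except TypeError:
    rejected_int = True
```

With a check like this, callers can pass whichever literal they find more readable, and the list-versus-tuple choice stays on their side of the interface.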

Tony Suffolk 66