What's the difference between "()" and "[]" when generating in Python?

Question

There is a list: nodes = [20, 21, 22, 23, 24, 25].

I used two ways to generate new 2-dimensional objects:

tour1 = (((a,b) for a in nodes )for b in nodes)
tour2 = [[(a,b) for a in nodes ]for b in nodes]

The type of tour1 is a generator while tour2 is a list:

In [34]: type(tour1)
Out[34]: <type 'generator'>

In [35]: type(tour2)
Out[35]: <type 'list'>

Why is tour1 not a tuple? Thanks.

Because it's a generator. What's your question? – Daniel Roseman, Nov 21 '12 at 10:51
@Daniel The part before the question mark. Why not a tuple? – RickyA, Nov 21 '12 at 10:52

score 10 · Answer 1 · answered Nov 21 '12 at 10:51

A tuple is created by the comma, not by the parentheses (the empty tuple () is the only exception). You can create a tuple without parentheses:

x = 1, 2, 3

If you want to create a tuple from a comprehension, just use the tuple constructor:

tuple(tuple((a,b) for a in nodes )for b in nodes)
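The distinction can be checked directly. The following is a minimal sketch, not part of the original answer; the variable names are made up for illustration, and b is fixed to 20 so the snippet stands alone:

```python
# Parentheses alone only group an expression; the comma builds the tuple.
grouped = (1)        # just the integer 1
single = (1,)        # one-element tuple, because of the trailing comma
bare = 1, 2, 3       # tuple without any parentheses
empty = ()           # the one case where parentheses alone make a tuple

nodes = [20, 21, 22, 23, 24, 25]
b = 20  # fixed value, chosen only for this illustration

gen = ((a, b) for a in nodes)             # parentheses + for: generator expression
as_tuple = tuple((a, b) for a in nodes)   # materialized by the tuple constructor

print(type(grouped).__name__)   # int
print(type(single).__name__)    # tuple
print(type(bare).__name__)      # tuple
print(type(gen).__name__)       # generator
print(as_tuple[0])              # (20, 20)
```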

ecatmur


Thanks. I misunderstood the usage of tuple. Then generate like this: tour1 = tuple(((a,b) for a in nodes )for b in nodes). And tour1 is a tuple! – zfz Nov 21 '12 at 10:55

score 10 · Accepted Answer · edited May 23 '17 at 11:50

The fundamental difference is that the first is a generator expression, and the second is a list comprehension. The former only yields elements as they are required, whereas the latter always produces the entire list when the comprehension is run.

For more info, see Generator Expressions vs. List Comprehension
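As a rough illustration of "only yields elements as they are required", here is a small sketch. It uses a flattened pair generator rather than the nested form from the question, purely to keep the output short; the names lazy and eager are illustrative:

```python
nodes = [20, 21, 22, 23, 24, 25]

lazy = ((a, b) for a in nodes for b in nodes)    # nothing is computed yet
eager = [(a, b) for a in nodes for b in nodes]   # all 36 pairs are built now

first = next(lazy)    # computes only the first pair
rest = list(lazy)     # consumes the remaining 35 pairs
again = list(lazy)    # the generator is exhausted, so this is empty

print(first)                   # (20, 20)
print(len(rest), len(again))   # 35 0
print(len(eager))              # 36
```

A list can be traversed as often as you like, while a generator can be consumed only once.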

There is no such thing as a "tuple comprehension" in Python, which is what you seem to be expecting from the first syntax.

If you wish to turn tour1 into a tuple of tuples, you could use the following:

In [89]: tour1 = tuple(tuple((a,b) for a in nodes )for b in nodes)

In [90]: tour1
Out[90]: 
(((20, 20), (21, 20), (22, 20), (23, 20), (24, 20), (25, 20)),
 ((20, 21), (21, 21), (22, 21), (23, 21), (24, 21), (25, 21)),
 ((20, 22), (21, 22), (22, 22), (23, 22), (24, 22), (25, 22)),
 ((20, 23), (21, 23), (22, 23), (23, 23), (24, 23), (25, 23)),
 ((20, 24), (21, 24), (22, 24), (23, 24), (24, 24), (25, 24)),
 ((20, 25), (21, 25), (22, 25), (23, 25), (24, 25), (25, 25)))

I've read the question about Generator Expressions vs. List Comprehension. Is the expression 'list(tuple((a,b) for a in nodes )for b in nodes)' more efficient than '[[(a,b) for a in nodes ]for b in nodes]'? – zfz, Nov 21 '12 at 11:17

score 5 · Answer 3 · answered Nov 21 '12 at 10:53

Because the syntax (x for x in l) is a so-called "generator expression": see http://docs.python.org/2/reference/expressions.html#generator-expressions

Jonathan Ballet


score 2 · Answer 4 · answered Nov 21 '12 at 10:52

It's a generator, but you can simply convert it to a tuple:

>>> (i for i in xrange(4))
<generator object <genexpr> at 0x23ea9b0>
>>> tuple(i for i in xrange(4))
(0, 1, 2, 3)
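Note that the tuple(...) call above passes the generator expression without an extra pair of parentheses. A short sketch of when the extra pair is and is not needed, using sum only as a convenient example function:

```python
nodes = [20, 21, 22, 23, 24, 25]

# Sole argument: the call's own parentheses are enough.
total = sum(n for n in nodes)

# With a second argument, the generator expression needs its own parentheses.
total_from_100 = sum((n for n in nodes), 100)

# In an assignment there is no enclosing call, so parentheses are required.
gen = (n for n in nodes)
as_tuple = tuple(gen)

print(total)            # 135
print(total_from_100)   # 235
print(as_tuple)         # (20, 21, 22, 23, 24, 25)
```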

applicative_functor


pepr · Answer 5 · 2012-11-21T14:24:02.810

To add: a generator expression does not always need its own parentheses. You need them only where omitting them would be a syntax error, as in the assignment here. When the generator expression is the sole argument of a function call, the call's parentheses are enough. Try the following:

tour3 = list(list((a,b) for a in nodes) for b in nodes)

It produces exactly the same result as your tour2. In that sense, you can think of [ as syntactic sugar for list( and of ] as syntactic sugar for the matching ). However, the two forms are compiled differently. You can check this by disassembling them (you need to pass a function):

>>> import dis
>>> def fn1():
...   return list(list((a,b) for a in nodes) for b in nodes)
...
>>> def fn2():
...   return [[(a,b) for a in nodes ]for b in nodes]
...
>>> dis.dis(fn1)
  2           0 LOAD_GLOBAL              0 (list)
              3 LOAD_CONST               1 (<code object <genexpr> at 000000000229A9B0, file "<stdin>", line 2>)
              6 MAKE_FUNCTION            0
              9 LOAD_GLOBAL              1 (nodes)
             12 GET_ITER
             13 CALL_FUNCTION            1
             16 CALL_FUNCTION            1
             19 RETURN_VALUE
>>> dis.dis(fn2)
  2           0 BUILD_LIST               0
              3 LOAD_GLOBAL              0 (nodes)
              6 GET_ITER
        >>    7 FOR_ITER                37 (to 47)
             10 STORE_FAST               0 (b)
             13 BUILD_LIST               0
             16 LOAD_GLOBAL              0 (nodes)
             19 GET_ITER
        >>   20 FOR_ITER                18 (to 41)
             23 STORE_FAST               1 (a)
             26 LOAD_FAST                1 (a)
             29 LOAD_FAST                0 (b)
             32 BUILD_TUPLE              2
             35 LIST_APPEND              2
             38 JUMP_ABSOLUTE           20
        >>   41 LIST_APPEND              2
             44 JUMP_ABSOLUTE            7
        >>   47 RETURN_VALUE

So you can see that the two are different (it looks like syntactic sugar, but it is not). Unfortunately, dis in Python 2.7 does not know how to disassemble a generator object:

>>> g = (list((a,b) for a in nodes) for b in nodes)
>>> dis.dis(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\dis.py", line 49, in dis
    type(x).__name__
TypeError: don't know how to disassemble generator objects
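One possible workaround, sketched here as a suggestion rather than as part of the original session, is to disassemble the code object that the generator runs. Generator objects expose it as the gi_code attribute. Newer Python 3 releases appear to accept the generator object directly as well, but going through gi_code avoids depending on that:

```python
import dis

nodes = [20, 21, 22, 23, 24, 25]
g = (list((a, b) for a in nodes) for b in nodes)

# gi_code is the code object executed by the generator;
# dis.dis knows how to disassemble code objects.
dis.dis(g.gi_code)
```

The exact bytecode printed depends on the interpreter version, so no output is shown here.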

Update:

Looking at the disassembled code above, one may be tempted to conclude that fn1 is faster because its code is shorter. However, in any language a function call looks shorter than the unfolded code it replaces. That says nothing about the internals of the called code. Some points from The Zen of Python:

>>> import this
The Zen of Python, by Tim Peters
...
Readability counts.
...
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
...
>>>

The standard timeit module measures execution time. Let's use it for the two cases:

>>> import timeit
>>> t = timeit.Timer('list(list((a,b) for a in nodes) for b in nodes)',
...                  'nodes = [20, 21, 22, 23, 24, 25]')
>>> print("%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000))
17.74 usec/pass

and now with the square brackets:

>>> t = timeit.Timer('[[(a,b) for a in nodes ]for b in nodes]',
...                  'nodes = [20, 21, 22, 23, 24, 25]')
>>> print("%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000))
7.14 usec/pass
>>>

This clearly shows that building the list of lists with [ ] is faster. The reason is that there are fewer function calls, so the Python compiler can produce more straightforward code.
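To repeat the measurement on your own machine, something along these lines should work. It is only a sketch: the labels, the run count, and the extra list(tuple(...)) variant (taken from the comment under the accepted answer) are choices made for this illustration, and the numbers will differ by interpreter and hardware:

```python
import timeit

setup = "nodes = [20, 21, 22, 23, 24, 25]"
candidates = [
    ("list(list(genexpr))", "list(list((a, b) for a in nodes) for b in nodes)"),
    ("list(tuple(genexpr))", "list(tuple((a, b) for a in nodes) for b in nodes)"),
    ("[[...]]", "[[(a, b) for a in nodes] for b in nodes]"),
]

number = 20000  # passes per run; an arbitrary value for this sketch
results = {}
for label, stmt in candidates:
    # repeat() returns one total time per run; the minimum is the least disturbed run.
    best = min(timeit.repeat(stmt, setup, repeat=3, number=number))
    results[label] = 1e6 * best / number
    print("%-22s %.2f usec/pass" % (label, results[label]))
```

Taking the minimum of several runs reduces the influence of other processes on the result.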

Thanks. I'm not familiar with the module 'dis'. So I guess that 'fn1' is more efficient than 'fn2'? — zfz, Nov 21 '12 at 13:06
@zfz: No. It only shows that the code is different. Actually, the `fn1` code is only a wrapper that calls the generator machinery and collects the result from it. The `fn2` code is more inline, more direct. I am going to update my answer. – pepr, Nov 21 '12 at 14:13
