>>> rows = [['']*5]*5
>>> rows
[['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', '']]
>>> rows[0][0] = 'x'

Naturally, I expect rows to become:

[['x', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', '']]

Instead, I get:

[['x', '', '', '', ''], ['x', '', '', '', ''], ['x', '', '', '', ''], ['x', '', '', '', ''], ['x', '', '', '', '']]

It seems that the elements of the rows list are all pointers to the same ['']*5 list. Why does it work this way, and is this a Python feature?

xyzman
  • As a side note, if I create the list with list comprehension syntax, I get the "properly working" one: `rows = [['' for x in range(5)] for y in range(5)]` – xyzman Jan 11 '12 at 16:19
  • This also "works": `rows = [['']*5 for y in range(5)]` – xyzman Jan 11 '12 at 16:21

3 Answers


The behaviour is not specific to the repetition operator (*). For example, if you concatenate two lists using +, the behaviour is the same:

In [1]: a = [[1]]

In [2]: b = a + a

In [3]: b
Out[3]: [[1], [1]]

In [4]: b[0][0] = 10

In [5]: b
Out[5]: [[10], [10]]

This has to do with the fact that lists are objects, and a list stores references to its elements rather than copies of them. When you use *, + and similar operators, it is the references that get repeated, not the objects they point to, hence the behaviour that you're seeing.
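A small sketch of the same point, using made-up names `inner` and `outer`: repetition copies the reference, so mutating the shared object shows up in every slot, whereas assigning to a slot only rebinds that one slot. That second part is also why the inner `['']*5` causes no trouble: strings are immutable, and `rows[0][0] = 'x'` replaces a reference rather than changing a string.

```python
inner = ['']
outer = [inner] * 3            # three references to the one list named inner

# Every slot of outer is the very same object as inner.
print(all(item is inner for item in outer))   # True

# Mutating the shared list is visible through every slot.
inner.append('y')
print(outer)                   # [['', 'y'], ['', 'y'], ['', 'y']]

# Assigning to a slot rebinds only that slot; the shared list is untouched.
outer[0] = ['new']
print(outer)                   # [['new'], ['', 'y'], ['', 'y']]
print(inner)                   # ['', 'y']
```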

The following demonstrates that all elements of rows have the same identity (i.e. memory address in CPython):

In [6]: rows = [['']*5]*5

In [7]: for row in rows:
   ...:     print(id(row))
   ...:
15975992
15975992
15975992
15975992
15975992

The following is equivalent to your example except it creates five distinct lists for the rows:

rows = [['']*5 for i in range(5)]
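As a quick check, a sketch along these lines compares the two constructions side by side. The helper name `count_distinct_rows` is made up for illustration; it simply counts how many different list objects the outer list refers to.

```python
def count_distinct_rows(grid):
    """Return how many distinct list objects the outer list refers to."""
    return len({id(row) for row in grid})


shared = [[''] * 5] * 5                    # five references to one inner list
distinct = [[''] * 5 for i in range(5)]    # inner list rebuilt on each iteration

shared[0][0] = 'x'
distinct[0][0] = 'x'

print(count_distinct_rows(shared))      # 1
print(count_distinct_rows(distinct))    # 5
print([row[0] for row in shared])       # ['x', 'x', 'x', 'x', 'x']
print([row[0] for row in distinct])     # ['x', '', '', '', '']
```

The comprehension evaluates `[''] * 5` once per iteration, so each row is a fresh list and the assignment affects only the first one.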
Sven Marnach
NPE

The fact that names, function parameters, and containers have reference semantics is a very basic design decision in Python. It affects the way Python works in many aspects, and you picked just one of these aspects. In many cases, reference semantics are more convenient, while in other cases copies would be more convenient. In Python, you can always explicitly create a copy if needed, or, in this case, use a list comprehension instead:

rows = [[''] * 5 for i in range(5)]
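For the "explicitly create a copy" route, one possible sketch is to copy a template row once per iteration. The name `template` is an assumption made for this example; `list.copy()` makes a shallow copy, which is enough here because the elements are immutable strings.

```python
template = [''] * 5

# Each iteration produces a new list holding the same elements as the template.
rows = [template.copy() for i in range(5)]

rows[0][0] = 'x'

print(rows[0])     # ['x', '', '', '', '']
print(rows[1])     # ['', '', '', '', '']
print(template)    # ['', '', '', '', '']
```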

You could design a programming language with different semantics, and there are many languages that do have different semantics, as well as languages with similar semantics. Why this decision was made is a bit hard to answer -- a language just has to have some semantics, and you can always ask why. You could just as well ask why Python is dynamically typed, and in the end the answer is that this is just what Guido decided way back in 1989.

Sven Marnach
  • I know this is very old but.. how do i do this without getting a warning that i is unused? – T_01 Apr 13 '18 at 21:57
  • @T_01 Python doesn't give you such warning – it must come from your IDE or linter or whatever. And in general, a warning that a variable is unused disappears if you, well, use it for something. – Sven Marnach Apr 14 '18 at 16:25
  • i found a way, just use for _ in range(5) – T_01 Apr 20 '18 at 00:58
  • @T_01 Yeah, many linters accept `_` as an unused variable. I really dislike that pattern, since it tends to make people think that `_` is special in some way, which it isn't. I recommend using `unused` or `dummy` or just `i` instead, and cajole your linter into accepting it. You shouldn't write worse code just because your linter tells you so. – Sven Marnach Apr 20 '18 at 10:15
  • what is a linter – T_01 Apr 26 '18 at 20:01
  • @T_01 https://stackoverflow.com/q/8503559/279627 – Sven Marnach Apr 27 '18 at 09:09

You are correct that Python is using pointers "under the hood", and yes, this is a feature. I don't know for sure why it was designed this way; I assume it was for speed and to reduce memory usage.

This issue is, by the way, why it is critical to understand the distinction between shallow copies and deep copies.
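To make the distinction concrete, here is a minimal sketch using the standard library `copy` module. The variable names `grid`, `shallow` and `deep` are just illustrative.

```python
import copy

grid = [['a', 'b'], ['c', 'd']]

shallow = copy.copy(grid)       # new outer list, but the same inner lists
deep = copy.deepcopy(grid)      # new outer list and new inner lists

grid[0][0] = 'x'

print(shallow[0][0])            # x  (the inner list is shared with grid)
print(deep[0][0])               # a  (the inner list was copied)
print(shallow[0] is grid[0])    # True
print(deep[0] is grid[0])       # False
```

A shallow copy behaves like `[['']*5]*5` in the sense that the inner lists are shared; a deep copy recursively duplicates them.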

Jim Clay