4

I was building a two-dimensional list in Python, and since I wanted it to be all zeros at first and didn't want to use numpy, I tried this:

columns = 8
rows = 5
m = [[0]* (columns)] * (rows)
m[3][2] = 1
print(m)

And I got an unexpected behaviour:

>> [[0, 0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0, 0]]

It looks like building the two-dimensional array this way makes each row a reference to a single row, so writing to any one of them writes to all of them.

Maybe this sounds evident to some of you, but I was a little shocked. Of course I can fix it using a different approach, but I am curious about why this is happening.

Can anyone explain? Why is this not happening if you build a simple array with [0] * size_of_array?

Roman Rdgz

3 Answers

5

This is a common Python gotcha. You are not creating rows separate inner lists; you are creating rows references to the same list.

Your code is equivalent to the following:

inner_list = [0] * columns
m = [inner_list] * rows

I would recommend building the rows without using the * operator. (You don't run into the issue with columns, since 0 is an int and ints are immutable objects.)

matrix = []
# rows is an int, so iterate over range(rows)
for row in range(rows):
    matrix.append([0] * columns)
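
A list comprehension is a more compact way to get the same result as the loop above. This is only a sketch and not part of the original answer; the helper name make_matrix and its fill parameter are made up for illustration. The inner expression is evaluated again on every iteration, so each row ends up as a distinct list.

```python
# make_matrix is a hypothetical helper name, not from the original answer.
def make_matrix(rows, columns, fill=0):
    # [fill] * columns is evaluated once per iteration,
    # so every row is a separate list object.
    return [[fill] * columns for _ in range(rows)]

m = make_matrix(5, 8)
m[3][2] = 1
print(m[2])  # untouched row
print(m[3])  # only this row contains the 1
```

Using * for the inner list is still safe here because the elements are immutable ints; only the outer repetition needs to be replaced.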
HardlyKnowEm
2

[0] * size_of_array creates a list with multiple references to the same 0 object. If you assign another value to one position of this list, the other positions are not affected.

As you noticed, [[]] * num creates a list which contains a reference to the same list over and over again. If you change this inner list, the change is visible via all references.

>>> a = [0] * 10
>>> [id(i) for i in a]
[31351584L, 31351584L, 31351584L, 31351584L, 31351584L, 31351584L, 31351584L, 31351584L, 31351584L, 31351584L]
>>> 
>>> all(i is a[0] for i in a)
True

vs.

>>> a = [[]] * 10
>>> a
[[], [], [], [], [], [], [], [], [], []]
>>> [id(i) for i in a]
[44072200L, 44072200L, 44072200L, 44072200L, 44072200L, 44072200L, 44072200L, 44072200L, 44072200L, 44072200L]
>>> all(i is a[0] for i in a)
True

Same situation, but one thing is different:

If you do a[0].append(10), the effect is visible in all lists.

But if you do a.append([]), you add a clean, new list which isn't related to the others:

>>> a = [[]] * 10
>>> a
[[], [], [], [], [], [], [], [], [], []]
>>> a.append([])
>>> a[0].append(8)
>>> a
[[8], [8], [8], [8], [8], [8], [8], [8], [8], [8], []]
>>> a[-1].append(5)
>>> a
[[8], [8], [8], [8], [8], [8], [8], [8], [8], [8], [5]]
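
The difference between the two cases comes down to rebinding versus mutating. As a minimal sketch (the variable names ints and lists are only illustrative): assigning to an index replaces one reference in the outer list, while calling a mutating method such as append changes the single shared object.

```python
ints = [0] * 3
ints[0] = 7            # rebinds slot 0 to a different int object
print(ints)            # [7, 0, 0]

lists = [[]] * 3
lists[0].append(7)     # mutates the one shared inner list
print(lists)           # [[7], [7], [7]]

lists[0] = [99]        # rebinding a slot affects only that slot
print(lists)           # [[99], [7], [7]]
```

Since ints cannot be mutated in place, the only thing you can do with a slot of [0] * n is rebind it, which is why the sharing never becomes visible there.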
glglgl
1

When you do [[0] * 8] * 5, it doesn't create a list containing 5 references to new objects. It creates the [0] * 8 object (list) first, then assigns a reference to that single list to each element created by * 5.

It's equivalent to:

a = [ 0 ] * 8
b = [ a ] * 5
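
One way to confirm this, sketched here with the same names a and b, is to check object identity with the is operator and then write through one of the rows:

```python
a = [0] * 8
b = [a] * 5

# Every element of b is the very same object as a.
print(all(row is a for row in b))   # True

# Writing through any row of b therefore changes a itself.
b[3][2] = 1
print(a[2])                         # 1
```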
Craig