7

I have created a function that takes a list as a parameter. It shuffles the list, replaces the first element and returns the new list.

import random
firstList=["a","b","c","d","e","f","g","h","i"]

def substitution(importedList):
    random.shuffle(importedList)
    importedList[0]="WORD"
    return importedList

The shuffle has no impact on my question. However, I was surprised to see that the returned importedList overwrites the original firstList.

>>> firstList
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

>>> substitution(firstList)
['WORD', 'a', 'b', 'd', 'i', 'c', 'g', 'e', 'h']

>>> firstList
['WORD', 'a', 'b', 'd', 'i', 'c', 'g', 'e', 'h']

I have found a workaround by copying the list within the function, but it seems inefficient.

import random
firstList=["a","b","c","d","e","f","g","h","i"]
string="a"

def substitutionandcopy(importedList):
    copiedList=importedList[:]
    random.shuffle(copiedList)
    copiedList[0]="WORD"
    return copiedList

My question is why does the function replace the firstList? This would not happen if it were a string for example.

string="a"

def substituteString(foo):
    foo='b'
    return foo

>>> string
'a'

>>> substituteString(string)
'b'

>>> string
'a'
Bhargav Rao
DaveOB
    1. `shuffle` changes the original mutable list. 2. Pass a `copy` of the list instead of the original. See [How to clone or copy a list in Python?](http://stackoverflow.com/q/2612802) – Bhargav Rao Feb 16 '16 at 11:52
  • 1
    Strings are immutable, lists are mutable. Pass a copy as @BhargavRao suggested. – Selcuk Feb 16 '16 at 11:52
  • 1
    It might be a taste thing, but I'd rather have the function copy the list and work on the copy rather than me have to pass in a copy. Might make the function a little more flexible with the types you could pass in. (For example, you could pass in a tuple instead of a list). – LexyStardust Feb 16 '16 at 12:01
  • @LexyStardust: That's why in my answer I suggested `random.sample()`. Since they were going to copy the list anyway, they gave you the choice of how much of the list to copy. – zondo Feb 16 '16 at 12:05
  • 1
    From Bhargav's link: "When you say `new_list = my_list` you're not making a copy, you're just adding another name that points at that original list in memory". The same goes for lists passed as function arguments. – PM 2Ring Feb 16 '16 at 12:06
  • 1
    (cont) You said "The shuffle has no impact on my question" but that's not quite correct. The `random.shuffle` method mutates its arg in place, otherwise you'd have to write `new_list = random.shuffle(old_list)`. Your string example is different because you're making an assignment, i.e., you're binding a new string object to the name `foo`. But the code in your 1st code block makes no assignments. – PM 2Ring Feb 16 '16 at 12:06
  • @LexyStardust: In general, I prefer a function to not make a copy of its args unless it _really_ needs to in order to run correctly. If the user wants the function to work on a copy they can explicitly pass it a copy. That way you have a choice, rather than being forced to have a copy created even when you don't need it. – PM 2Ring Feb 16 '16 at 12:09
  • 1
    You may find this article helpful: [Facts and myths about Python names and values](http://nedbatchelder.com/text/names.html), which was written by SO veteran Ned Batchelder. – PM 2Ring Feb 16 '16 at 12:12
  • @PM2Ring: you are very bold. I'm a coward, so I prefer the opposite code style: my functions should be pure (should not cause side effects) except if it really needs to in order to run correctly (for example, I/O). IMHO this way is safer, because it is harder to violate the principle of least astonishment. – Paulo Scardine Feb 16 '16 at 12:22
  • @PauloScardine: Fair point. OTOH, if I want pure code, I'll write Haskell, not Python. :) – PM 2Ring Feb 16 '16 at 12:26
  • 2
    If what you wanted was to make a copy, then making a copy explicitly clearly isn't _inefficient_ -- you can't do it without doing it, so to speak :-) – RemcoGerlich Feb 16 '16 at 12:29
  • My concern was stated by Paulo. I did want to return new list, a copy of the first, but with an element substituted. My concern was that couldn't understand why the list outside of the function was changed. I too prefer my functions to only return what I want and not have an external effect unless designed to do so. – DaveOB Feb 16 '16 at 15:21

4 Answers

6

Strings, ints, and tuples are immutable Python types, so an operation that appears to change one of them actually creates a new object in memory each time. (Or you get an error if you try to change one in place.)

Lists and dictionaries are mutable Python types, so when you perform an operation that changes one of them, the object stays the same but its parts (i.e., list elements) get changed.
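
A minimal sketch of that difference, using a second name for each object to see what happens to it (the variable names here are illustrative and not from the question):

```python
text = "abc"
text_alias = text          # two names, one string object
text = text.upper()        # builds a new string and rebinds only `text`

items = ["a", "b", "c"]
items_alias = items        # two names, one list object
items[0] = "WORD"          # changes the list object itself

print(text, text_alias)        # ABC abc
print(items_alias)             # ['WORD', 'b', 'c']
print(items_alias is items)    # True

try:
    text_alias[0] = "X"    # strings do not support item assignment
except TypeError as exc:
    print("TypeError:", exc)
```

The same thing happens when a list is passed to a function: the parameter is just another name for the caller's list.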

So when you want to change a list but leave the original intact, you have to copy it yourself. It is important to know that there are two types of copying: shallow copy and deep copy.

Shallow copy can be done like so:

list_b = list_a[:] #using slice syntax

#or

list_b = list(list_a) #instantiating a new list from iterating over the old one

#or

import copy
list_b = copy.copy(list_a) #using copy module

Deep copy is done in the following way:

import copy
list_b = copy.deepcopy(list_a)

The difference between deep copy and shallow copy is...

When doing a shallow copy, if a mutable object contains other mutable objects, only the top-level one is copied. For example, if a list contains another list and the outer list is copied, then changing the inner list through the copy also changes it in the original, because both outer lists reference the same inner object in memory. Basically, a shallow copy creates a new object holding the same references as the original object.

When doing a deep copy, if a mutable object contains other mutable objects, the inner mutable objects are copied too. In the previous example, changing the inner list in the copy changes it only in the copy, and the original is not affected. So a deep copy copies everything: it creates a new structure in memory for everything in the object being copied, not just new references.
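
The two behaviours described above can be sketched with a nested list (the data and names are made up for illustration):

```python
import copy

original = [["a", "b"], ["c", "d"]]
shallow = copy.copy(original)      # new outer list, same inner lists
deep = copy.deepcopy(original)     # new outer list and new inner lists

shallow[0][0] = "WORD"   # mutates an inner list shared with `original`
deep[1][0] = "OTHER"     # mutates an inner list that only `deep` owns

print(original)  # [['WORD', 'b'], ['c', 'd']]
print(shallow)   # [['WORD', 'b'], ['c', 'd']]
print(deep)      # [['a', 'b'], ['OTHER', 'd']]
```

For a flat list of strings like the one in the question, a shallow copy is enough, because the elements themselves cannot be mutated.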

Nikita
  • @DaveOB No problem, sometimes Python is not that straightforward. :) If you find that my answer fits your question, you might want to mark it, or other answer, that fits -see http://stackoverflow.com/help/someone-answers. – Nikita Feb 17 '16 at 10:24
1

It does not replace the first list. The first list is passed by reference, meaning that any mutation you perform on the list passed as the parameter is also visible on the list outside of the function, because it is the same list.

However, strings and other basic types are not passed by reference, and therefore any changes you make in your function scope are to the local copy of the variable only.

Thijs Riezebeek
  • 3
    "Passed by reference" can be misleading terminology when discussing Python's data model. Please see [Facts and myths about Python names and values](http://nedbatchelder.com/text/names.html), which was written by SO veteran Ned Batchelder. – PM 2Ring Feb 16 '16 at 12:12
  • I think it is more correct to say that all variables in Python are references. Only some objects can be mutated "in place", so all references to that same address will reflect the change. This is not the case with strings in Python, they are immutable. – Paulo Scardine Feb 16 '16 at 12:12
  • Also see [How do I pass a variable by reference?](http://stackoverflow.com/q/986006/4014959) – PM 2Ring Feb 16 '16 at 12:16
  • 2
    Strings are _also_ passed by reference. The thing is that if he had tried to do the same thing he did with the list with the string, namely `random.shuffle(foo)` or `foo[0] = "w"`, he would have got an error, because string is immutable. If string were mutable, it would have been possible to do the exact same thing with strings. – RemcoGerlich Feb 16 '16 at 12:26
1

As you found out, random.shuffle mutates the list in place:

random.shuffle(x[, random])

Shuffle the sequence x in place. The optional argument random is a 0-argument function returning a random float in [0.0, 1.0); by default, this is the function random().

Note that for even rather small len(x), the total number of permutations of x is larger than the period of most random number generators; this implies that most permutations of a long sequence can never be generated.

Strings are immutable in Python; all string operations return a new string instead. This is the "string" example from your question:

string="a"

def substitute_string(foo):
    foo = 'b'
    return foo

It is not really akin to the substitution function in the first code block of the question. The equivalent code using a list would be this:

alist = [1, 2, 3]

def substitute_list(foo):
    foo = [4, 5, 6]
    return foo

And it works identically:

>>> alist
[1, 2, 3]

>>> substitute_list(alist)
[4, 5, 6]

>>> alist
[1, 2, 3]

Back to your solution, it could be:

def substitution_and_copy(imported_list):
    imported_list = imported_list[:]
    random.shuffle(imported_list)
    imported_list[0] = "WORD"
    return imported_list

And no, assigning a new value to the argument will not mutate the original list, the same way you don't mutate the original string when you assign a new value to foo. (I also changed camelCase to snake_case; I'm a little strict about PEP 8.)

[update]

What you have now, however, is what he already tried. "I have found a workaround by copying the list within the function, but it seems inefficient"

A list copy is not as inefficient as you may think, but this is not the point: as someone else pointed out, either you mutate the list in place and return nothing or return a new list - you can't have your cake and eat it.
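
As a rough sketch of those two options side by side (the function names below are hypothetical, chosen only for this illustration):

```python
import random

def substitute_in_place(items):
    """Shuffle and modify the caller's list; return None, like random.shuffle."""
    random.shuffle(items)
    items[0] = "WORD"

def substituted_copy(items):
    """Leave the argument untouched and return a new, modified list."""
    result = list(items)       # shallow copy; also accepts tuples
    random.shuffle(result)
    result[0] = "WORD"
    return result

first_list = ["a", "b", "c", "d"]

new_list = substituted_copy(first_list)
print(first_list)   # still ['a', 'b', 'c', 'd']

returned = substitute_in_place(first_list)
print(returned)     # None; first_list itself now starts with 'WORD'
```

Returning None from the in-place version follows the convention used by random.shuffle and list.sort, which makes it harder to mistake a mutating function for one that returns a fresh list.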

Paulo Scardine
  • `random.shuffle()` returns None. He will find some unexpected behavior if he defines `some_list` that way. – zondo Feb 16 '16 at 12:03
  • indeed, just spotted it. – Paulo Scardine Feb 16 '16 at 12:06
  • What you have now, however, is what he already tried. _I have found a workaround by copying the list within the function, but it seems inefficient._ – zondo Feb 16 '16 at 12:10
  • 1
    It is not as inefficient as you may think, because in Python each list element is just a reference, it will not duplicate the space for all strings. It is more like an array of void pointers in C. Unless it is in a tight loop you can afford this most of the time even for large lists. – Paulo Scardine Feb 16 '16 at 12:27
  • *I* didn't say it was inefficient. The OP is the one who wanted something different. – zondo Feb 16 '16 at 13:33
0

From the docs on random.shuffle(): shuffle list x in place; return None. If you don't want that, you can use random.sample():

def substitutionandcopy(importedList):
    shuffledList = random.sample(importedList, len(importedList))
    shuffledList[0]="WORD"
    return shuffledList
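
A short usage sketch, assuming the same approach with snake_case names (the tuple input is an illustration, not from the question): because random.sample() builds a new list from any sequence, the argument is never modified and does not even have to be a list.

```python
import random

def substitution_and_copy(imported_seq):
    # random.sample returns a new list of len(imported_seq) elements
    # in random order; the argument is left untouched.
    shuffled = random.sample(imported_seq, len(imported_seq))
    shuffled[0] = "WORD"
    return shuffled

letters = ("a", "b", "c", "d")     # a tuple works as well as a list
result = substitution_and_copy(letters)

print(letters)      # ('a', 'b', 'c', 'd')
print(result[0])    # WORD
```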
zondo