0

In Python, I came across this strange phenomenon while working with the groupby function from itertools.

In Python, variable assignment means giving the new variable its own memory instead of a pointer to the original memory (that is my understanding; if it is incorrect, please let me know):

y = 7
x = y    
y = 9

x will still be 7

Yet I ran into a problem with groupby, which I was using to group items that share the same key. I wanted two copies of each group, because iterating over the original group a second time is useless: it has already been consumed. Example:

from itertools import groupby

for key, group in groupby(rows, lambda x: x[0]):
    data = [thing[1] for thing in group]   # element at index 1 of each row
    data2 = [thing[2] for thing in group]  # yields [] because group is already exhausted

So I tried this instead:

for key, group in groupby(rows, lambda x: x[0]):
    # attempt to create a copy of group to iterate over a second time
    toup = group

    print(toup)   # <itertools._grouper object at 0x1039a8850>
    print(group)  # <itertools._grouper object at 0x1039a8850>

    data = [thing[1] for thing in group]  # element at index 1 of each row
    data2 = [thing[2] for thing in toup]  # element at index 2, but yields []

data2 should collect the element at index 2 of each row, but it comes out as [] because toup and group share the same memory.

My question is: why does this happen? Shouldn't assigning group to toup mean that toup gets a copy of group's memory at a different hex address?

Also, what can I do to circumvent this problem so that I don't have to write two groupby iterations?

user2117728
  • 1
    It depends on the type of the variable. Primitives like integers and strings are copied if you assign them to another variable, instance objects are not - the variable will become a reference to the instance object instead. Try "a= []; b= a; print(a is b)" - this will print True. – Aran-Fey Sep 04 '14 at 16:20
  • 2
    *"variable assignment means assigning the new variable its own memory instead of a pointer to the original memory"* - incorrect, that is not how Python names work (see e.g. http://nedbatchelder.com/text/names.html). Use `toup = group[:]` to create a copy. – jonrsharpe Sep 04 '14 at 16:23
  • 1
    @Rawing that is not true (for a start, Python doesn't really have primitives, and e.g. integers *are* instances); the difference is that e.g. integers are immutable, whereas e.g. lists are mutable. `a = 1; b = a; a is b` will *still* be `True`, it's just that `b += 1` won't affect `a` (as integers are immutable). – jonrsharpe Sep 04 '14 at 16:24

2 Answers

3

You state:

In Python, variable assignment means giving the new variable its own memory instead of a pointer to the original memory (that is my understanding; if it is incorrect, please let me know):

That is incorrect. Python names can have aspects that (at times) are like C variables and can also have aspects that (at times) are like C pointers. To try and say they are like one or the other is just confusing. Don't. Consider them as unique and idiomatic to Python.

Python 'variables' are better thought of as names. More than one name may refer to the same object, even if you did not intend that.

Example:

>>> y=7
>>> x=7
>>> x is y
True
>>> id(x)
140316099265400
>>> id(y)
140316099265400

And, due to interning, the following may also be true (see PEP 237 regarding short ints; this is an implementation detail and should not be relied on):

>>> x=9
>>> y=5+4
>>> x is y
True
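A small sketch of why this result should not be relied on. The integers are built at run time with int() so that the compiler cannot merge equal constants; the variable names are illustrative only. On CPython the two small values usually turn out to be one shared object and the two large ones usually do not, but the language guarantees neither outcome:

```python
# Build the integers at run time so equal literals cannot be merged at compile time.
small_a, small_b = int("7"), int("7")
big_a, big_b = int("300"), int("300")

# Equality compares values and is always reliable.
print(small_a == small_b, big_a == big_b)

# Identity depends on how the implementation caches int objects,
# so treat these two results as informational only.
print(small_a is small_b)
print(big_a is big_b)
```

The practical rule is to compare numbers with == and to keep is for identity checks such as x is None.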

The Python `is` operator returns True if both names refer to the same object. The id function returns that object's identity, which in CPython is its memory address.

Consider as a final example:

>>> li1=[1,2,3]
>>> li2=[1,2,3]
>>> li1==li2
True
>>> li1 is li2
False

Even though li1 == li2, they are separate lists. If they were the same object, changing one would change the other, as in this example:

>>> li1=[1,2,3]
>>> li2=li1
>>> li1.append(4)
>>> li2
[1, 2, 3, 4]
>>> li1==li2
True
>>> li1 is li2
True

(Be sure to understand another classic mistake that all Python programmers make sooner or later. It is caused by holding multiple references to a single mutable object and then expecting each reference to behave like an independent object.)
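One common form of that mistake, sketched here on the assumption that list repetition is a fair illustration of it (the names row, grid and safe_grid are made up for this example):

```python
# Assumption: list repetition is used here as one instance of the mistake.
row = [0, 0]
grid = [row] * 3   # three references to the SAME inner list
grid[0][0] = 1     # mutate through one reference...
print(grid)        # [[1, 0], [1, 0], [1, 0]]

# Building a fresh inner list per row gives independent objects.
safe_grid = [[0, 0] for _ in range(3)]
safe_grid[0][0] = 1
print(safe_grid)   # [[1, 0], [0, 0], [0, 0]]
```

The repetition copies references, not lists, so all three "rows" in grid are one object seen three times.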

As jonrsharpe pointed out in the comments, read Ned Batchelder's excellent Facts and myths about Python Names and Values or How to Think Like a Pythonista for a more detailed overview.

Community
dawg
  • 1
    You should really mention interning on your second example - that wouldn't happen with `300 is 60*5` (or in all Python implementations - it's a CPython detail). – jonrsharpe Sep 04 '14 at 16:27
  • "Python names can have aspects that (at time) are like C variables and can also have aspects that (at times) are like C pointers" They are always like C pointers. There's no generic way to "copy" a variable by value. – Paul Draper Sep 11 '15 at 02:59
  • @PaulDraper: I am probably not understanding your comment, but `a=[1,2,3]` and `b=a` are logically similar in abstraction in Python but quite different under the hood. – dawg Sep 11 '15 at 03:13
  • @dawg, in C, `a = b` will copy the value of `b` into `a`. There is no way to do this in Python; the only things you can "copy" are the pointers. (And all variables are pointers.) – Paul Draper Sep 11 '15 at 06:05
  • question for you: so why do numbers and booleans assigned to different names have the same memory address (your first example) but tuples, which are also immutable will be assigned to different addresses e.g. x = 1, y = 1, id(x) != id(y) (resolves to true). also, when first starting the interpreter I noticed that trying it with lists actually yielded the same shared address... is that an optimization that depends on the current state of memory? – trad Nov 01 '17 at 16:51
  • Interning is completely implementation dependent. If you want to know 'why' on a lot of these things, refer to the relevant source. There can be substantial changes version to version and platform to platform. In general, it is probably a bigger return on effort to intern strings and integers and not worry about tuples. Note that the internal objects of tuples ARE often interned. Try `t=(1,2,3); t1=(1,2,3); id(t[0])==id(t1[0])` and that is likely `True` even if `id(t)==id(t1)` is not. – dawg Nov 01 '17 at 17:11
0

In Python, variable assignment means giving the new variable its own memory instead of a pointer to the original memory

Python has mutable (e.g. lists, iterators, just about everything) and immutable objects (e.g. integers and strings). Assignment does not copy the object in either case. With immutable objects, all operations on them result in a new instance, so you won't run into the problem of "modifying" an integer or a string like you do with mutable types.
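A minimal sketch of that difference, using made-up names. In both halves the assignment only binds a second name to the same object; what differs is whether the later operation can change that object in place:

```python
a = 1
b = a                 # both names refer to the same int object
b += 1                # ints are immutable: b is rebound to a new object
print(a, b)           # 1 2

nums = [1, 2]
alias = nums          # both names refer to the same list object
alias += [3]          # lists are mutable: the shared list is extended in place
print(nums)           # [1, 2, 3]
print(alias is nums)  # True
```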

My question is: why does this happen? Shouldn't assigning group to toup mean that toup gets a copy of group's memory at a different hex address?

Both variables will point to the same object. When you iterate over one and exhaust the iterator, iterating over the second variable will give you an empty sequence.
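One way around this, sketched under the assumption that each group fits comfortably in memory, is to materialize the group into a list once and iterate over the list as often as needed. The rows data below is hypothetical, since the question does not show it:

```python
from itertools import groupby

# Hypothetical sample data, already sorted by the grouping key.
rows = [("a", 1, 10), ("a", 2, 20), ("b", 3, 30)]

results = {}
for key, group in groupby(rows, lambda x: x[0]):
    items = list(group)  # consume the one-shot iterator into a real list
    data = [thing[1] for thing in items]
    data2 = [thing[2] for thing in items]  # a list can be iterated again
    results[key] = (data, data2)

print(results)  # {'a': ([1, 2], [10, 20]), 'b': ([3], [30])}
```

Note that the group is an iterator rather than a list, so slicing it with group[:] raises a TypeError. If building a list is undesirable, itertools.tee(group, 2) returns two independent iterators over the same group, at the cost of buffering items internally.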

Blender