
In Python, I often reuse variables in a manner analogous to this:

files = files[:batch_size]

I like this technique because it helps me cut down on the number of variables I need to track.

I've never had any problems with it, but I am wondering if I am missing potential downsides, e.g. performance.

jldupont
  • I don't see what's the question here. What is the alternative to compare against? Using a second variable like `files = XYZ; files_head = files[:batch_size]`? Why should there be any difference? – Niklas B. Jan 29 '12 at 16:08
  • alternative being something like: new_set_of_files=files[:batch_size] – jldupont Jan 29 '12 at 16:09
  • You'd notice the main one right away: *Hey! I still need that old value for `files`!*. – yurisich Jan 29 '12 at 16:14
  • 2
    the alternative is to use tons of extra variables. `unused1 = files[0]`, `unused2 = 'foobar'`, 'unused3 = -1`, `veryunused = None`. Indeed, that doesn't make the code very readable. But someone might like it. **seriously, what is your question?** – Has QUIT--Anony-Mousse Jan 29 '12 at 16:15

4 Answers


There is no technical downside to reusing variable names. However, if you reuse a variable and change its "purpose", that may confuse others reading your code (especially if they miss the reassignment).

In the example you've provided, though, be aware that you are actually creating an entirely new list when you slice. Until the garbage collector reclaims the old list, both copies sit in memory (minus the part you sliced off). One alternative is to iterate over the list and stop when you reach the `batch_size`-th element instead of walking the whole list; even more succinctly, `del files[batch_size:]` truncates the existing list in place.
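A quick sketch of that difference (the file names and `batch_size` value are illustrative): slicing produces a distinct list object, while `del` mutates the one already bound to the name.

```python
files = ["a.txt", "b.txt", "c.txt", "d.txt"]
batch_size = 2

sliced = files[:batch_size]   # a brand-new list; the original is untouched
print(sliced is files)        # False: two distinct objects

kept = files
del files[batch_size:]        # truncates the same list object in place
print(kept is files)          # True: no copy was made
print(files)                  # ['a.txt', 'b.txt']
```

Note that the in-place `del` also changes the list for every other name bound to it, which may or may not be what you want.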

cheeken
  • +1 Nice point about how you can avoid creating a new object (though perhaps it's less readable?) – RoundTower Jan 29 '12 at 16:28
  • 1
    @cheeken: Yet another alternative (probably the most Pythonic) is to create a generator with `itertools.islice`. – Niklas B. Jan 29 '12 at 16:35
  • Good Call, @NiklasBaumstark! To whomever downvoted: I would appreciate a comment explaining why so that I might correct it. – cheeken Jan 29 '12 at 17:43

Some info on that specific example: if you just want to iterate over, map, or filter the result, you can use a generator to avoid copying the list:

import itertools
files = itertools.islice(files, batch_size)
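One caveat worth noting (small sketch, list contents are illustrative): `islice` returns a lazy iterator rather than a list, so it can be consumed only once and does not support indexing or `len`.

```python
import itertools

files = ["a.txt", "b.txt", "c.txt", "d.txt"]
batch = itertools.islice(files, 2)  # lazy view: no list copy is made

print(list(batch))  # ['a.txt', 'b.txt']
print(list(batch))  # []  (the iterator is now exhausted)
```

If you need a real list again later, wrap it in `list(...)` once, at which point you are back to making a copy.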

As for the general case: Whether you assign the new value to an already existing name or to a new name should make absolutely no difference (at least from the point of view of the interpreter/VM). Both methods produce almost the exact same bytecode:

Python 2.7.2 (default, Nov 21 2011, 17:25:27) 
[GCC 4.6.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> def func1(files):
...   files = files[:100]
... 
>>> def func2(files):
...   new_files = files[:100]
... 
>>> dis.dis(func1)
  2           0 LOAD_FAST                0 (files)
              3 LOAD_CONST               1 (100)
              6 SLICE+2             
              7 STORE_FAST               0 (files)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        
>>> dis.dis(func2)
  2           0 LOAD_FAST                0 (files)
              3 LOAD_CONST               1 (100)
              6 SLICE+2             
              7 STORE_FAST               1 (new_files)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        

The same can be observed in Python 3.

In fact, func1 could even be a bit faster, because the name files has been seen before and could already be in some variable lookup cache.
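That speed claim is easy to test for yourself with `timeit`; here is a minimal sketch (the function bodies mirror the two disassembled above, the iteration count is arbitrary, and any difference you measure is likely to be noise):

```python
import timeit

def func1(files):
    files = files[:100]       # rebinds the existing name

def func2(files):
    new_files = files[:100]   # binds a fresh name

data = list(range(1000))
t1 = timeit.timeit(lambda: func1(data), number=50_000)
t2 = timeit.timeit(lambda: func2(data), number=50_000)
print(f"reuse: {t1:.4f}s  new name: {t2:.4f}s")
```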

Niklas B.
  • Can I ask you what difference that differing `1` is going to make? – Rik Poggi Jan 29 '12 at 16:18
  • @Rik: I think it's the index of the affected local variable (`0` in the first case, because this is the first accessed variable, `1` in the second case). I am not 100% sure, though. – Niklas B. Jan 29 '12 at 16:29

There really aren't many downsides to reusing variables, but you're not going to see many advantages either. The Python GC has to collect the old object anyway, so there is no immediate memory gain from overwriting the variable, unlike in statically compiled languages such as C, where reusing a variable can avoid allocating memory for the new object entirely.

Further, you can truly confuse any future readers of your code, who generally expect new objects to have new names (a byproduct of garbage-collected languages).

Joe C.

The downside is that, once the name has been rebound, you can no longer recover the rest of the list with:

file_rest = files[batch_size:]
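In other words (a small sketch with illustrative file names), if the tail is still needed, it has to be saved under a second name before the rebinding happens:

```python
files = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]
batch_size = 2

file_rest = files[batch_size:]  # save the tail first...
files = files[:batch_size]      # ...then rebind the name to the head

print(files)      # ['a.txt', 'b.txt']
print(file_rest)  # ['c.txt', 'd.txt', 'e.txt']
```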

Regarding performance there is no downside. On the contrary: you might even improve performance by avoiding hash collisions in the same namespace.

There was a SO post regarding this in another context.

Don Question