4

I have a short piece of code that needs to run for a very long time. I am wondering whether the length of the variable names I use can affect the speed at which the program executes. Here is a very simple example written in Python.

Program A

    x = 1
    while not x == 0:
        print('message')

Program B

    xyz = 1
    while not xyz == 0:
        print('message')

Will Program A print 'message' more times than Program B if I run both programs for 30 years on two identical machines?

XisUnknown

3 Answers

8

No, the names themselves have no effect on how quickly the resulting code runs. Variable names only distinguish variables in the Python source; in the compiled byte code, each name is represented by an integer index into a lookup table:

>>> import dis
>>> dis.dis('x=1')
  1           0 LOAD_CONST               0 (1)
              2 STORE_NAME               0 (x)
              4 LOAD_CONST               1 (None)
              6 RETURN_VALUE
>>> dis.dis('xyz=1')
  1           0 LOAD_CONST               0 (1)
              2 STORE_NAME               0 (xyz)
              4 LOAD_CONST               1 (None)
              6 RETURN_VALUE
>>> dis.dis('x=1;xyz=2;')
  1           0 LOAD_CONST               0 (1)
              2 STORE_NAME               0 (x)
              4 LOAD_CONST               1 (2)
              6 STORE_NAME               1 (xyz)
              8 LOAD_CONST               2 (None)
             10 RETURN_VALUE

In the first two, you'll notice that the resulting byte code makes no distinction based on the variable name. In the last, the byte code does differentiate between the two variables, but only by the order in which they are defined, not by the length of the name.
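As a rough check of the point above, the sketch below compiles the question's loop twice, once with a short name and once with a long one, and compares the raw byte code. The helper name `compile_loop` and the long variable name are made up for illustration; the comparison assumes CPython, where `co_code` holds the instruction bytes and `co_names` holds the name table.

```python
# Template for the loop from the question; {name} is filled in below.
TEMPLATE = "{name} = 1\nwhile not {name} == 0:\n    {name} = 0\n"


def compile_loop(name):
    """Compile the loop using the given variable name (hypothetical helper)."""
    return compile(TEMPLATE.format(name=name), "<example>", "exec")


short_code = compile_loop("x")
long_code = compile_loop("a_much_longer_variable_name_than_x")

# The instruction bytes refer to the variable by its index, not its spelling.
print("same byte code:", short_code.co_code == long_code.co_code)

# The spelling lives only in the name table attached to the code object.
print("short names:", short_code.co_names)
print("long names: ", long_code.co_names)
```

If the two `co_code` values compare equal, the interpreter executes the same instruction stream for both programs, and only the entry in the name table differs.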

chepner
  • How often does the code go through the process of mapping the name I give a variable to its integer index? Does this happen every time the variable is assigned a new value? Sorry if I am being redundant; I just really want to make sure I understand. My code revolves around 2 variables and a sequential function that generates a cyclic group of extremely high order, high enough that generating the entire group will take years, hence the importance of optimization. – XisUnknown Dec 09 '18 at 06:03
  • 1
    Just once, when the source is compiled to byte code. The process is done before you ever start executing it. – chepner Dec 09 '18 at 15:52
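To illustrate the "just once" point, here is a minimal sketch that compiles a small snippet a single time and then executes the resulting code object repeatedly. The snippet and the `namespace` dictionary are assumptions chosen for the example; `compile` and `exec` are the standard built-ins.

```python
source = "xyz = 1\nxyz = xyz + 1\nxyz = xyz * 2\n"

# Compilation happens here, once. The name is recorded a single time in
# the code object's name table, however often the source mentions it.
code = compile(source, "<example>", "exec")
print(code.co_names)  # ('xyz',)

# Executing the compiled code object does not recompile the source.
namespace = {}
for _ in range(3):
    exec(code, namespace)

print(namespace["xyz"])  # 4
```

Each pass through the loop reuses the same code object, so the translation from name to index is not repeated on every assignment.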
1

The results that @chepner mentioned are correct. Python can take longer to run the code in the console, but once the code is compiled the results are the same.

To check this, I wrote the following code, also inspired by the answer from @knifer:

from time import time

x                                              = 1
xyzabcdexyzabcdefghidjakeldkjlkfghidjakeldkjlk = 1

short_runs = 0
long_runs  = 0

for _ in range(int(2e7)):
    
    t0 = time()
    if x:
        pass
    short_runs += time() - t0
    
    t0 = time()
    if xyzabcdexyzabcdefghidjakeldkjlkfghidjakeldkjlk:
        pass
    long_runs  += time() - t0

print('Runtime results:')
print(f"Short variable runs : (sum = {short_runs:.3f})")
print(f"Long  variable runs : (sum = {long_runs :.3f})")

The code I propose is somewhat different in that the trial runs for the long and the short variable names are interleaved, so that any differences caused by underlying OS processes are minimized.

The results vary depending on whether you copy-paste the code into a Python console or run it as a program (python trial_runs.py). Runs using copy-paste tend to be slower with long variable names, whereas running the code as a program yields identical running times.

PS. The actual running times change all the time for me (in one direction or the other), so it's hard to report exact values. Even the long variable names can sometimes run faster, although this is very rare on the Python console. The biggest conclusion is that any differences are really small either way :)

C-3PO
-1

The difference is very small, and we can't conclude that it is caused by the name of the variable.

import timeit
x=1
xyz=1


start_time = timeit.default_timer()
for i in range(1,1000000):
    if x==1:
        print("message")
elapsed = timeit.default_timer() - start_time


start_time2 = timeit.default_timer()
for i in range(1,1000000):
    if xyz==1:
        print("message")

elapsed2 = timeit.default_timer() - start_time2

print("small variable printing = ",str(elapsed),"big variable printing = "+str(elapsed2))

And the result was:

small variable printing =  3.6490847053481588 big variable printing = 3.7199463989460435
knifer
  • I ran similar tests and my results varied. Two things I should probably have mentioned would make the test closer to my situation: in each iteration, the variable message should be assigned a new value (even if it is the same value every time), and the number of iterations should be on the order of 1,000,000,000, since my actual program runs about 7.97EE13. – XisUnknown Dec 09 '18 at 06:14
  • The running speed of a Python script can vary depending on the underlying OS processes being executed at that moment. It's best to do a statistical analysis, running the code multiple times for each case, preferably wrapped inside a function. If you only run a small for loop in the console, Python will re-interpret the code all the time, which can indeed increase the running times (if the variable names are long). But as @chepner mentioned before, these differences will not exist once Python has already compiled the program into byte code. – C-3PO May 18 '21 at 15:32
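One way to follow the suggestion in the last comment is sketched below: each case is wrapped in a function and timed several times with `timeit.repeat`, and both the best and the mean time are reported. The function names, the long variable name, and the loop sizes are arbitrary choices for illustration, not values from the thread.

```python
import statistics
import timeit


def short_name(n=100_000):
    # Loop that repeatedly reassigns a variable with a short name.
    x = 0
    for _ in range(n):
        x = x + 1
    return x


def long_name(n=100_000):
    # Same loop, but the variable has a deliberately long name.
    a_very_long_variable_name_used_only_for_this_comparison = 0
    for _ in range(n):
        a_very_long_variable_name_used_only_for_this_comparison = (
            a_very_long_variable_name_used_only_for_this_comparison + 1
        )
    return a_very_long_variable_name_used_only_for_this_comparison


def measure(func, repeats=5):
    """Time func several times; return (best, mean) in seconds."""
    times = timeit.repeat(func, number=1, repeat=repeats)
    return min(times), statistics.mean(times)


short_best, short_mean = measure(short_name)
long_best, long_mean = measure(long_name)

print(f"short name: best = {short_best:.4f} s, mean = {short_mean:.4f} s")
print(f"long name : best = {long_best:.4f} s, mean = {long_mean:.4f} s")
```

The exact numbers will differ from run to run and machine to machine; comparing the best times across several repeats is usually less sensitive to background OS activity than a single measurement.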