2

There are several well known python code style rules, which are considered default and to which I try to adhere:

Wrap lines so that they don’t exceed 79 characters.

Keep indents 4 spaces long.

Another common programming advice is

Avoid global variables

In other words, one should always use functions that accept all their variables as parameters and avoid Pascal-like procedures that read directly from higher scope.

However, in some cases, one should definitely break some of these rules. For example, if functions with long parameters lists are involved. There are two separate issues with them:

First, in heavily indented blocks there is too little room left.

def function(variable1, variable2, variable3, variable4, variable5,\
variable6, variable7, variable8, variable9):
    return variable1 + variable2 + variable3 + variable4 + variable5 +\
    variable6 + variable7 + variable8 + variable9
def...
    for variable1...
        if variable 2...
            while variable3...
                if variable4...
                    for variable5...
                        ...
                                                    variable10 =\
                                                    function(\
                                                    variable1,\
                                                    variable2,\
                                                    variable3,\
                                                    variable4,\
                                                    variable5,\
                                                    variable6,\
                                                    variable7,\
                                                    variable8,\
                                                    variable9)
                                                    ...

Here a procedure as a closure, though it is considered a bad practice, can become helpful:

def...
    def procedure():
        variable10 = variable1 + variable2 + variable3 + variable4 +\
        variable5 + variable6 + variable7 + variable8 + variable9
    for variable1...
        if variable 2...
            while variable3...
                if variable4...
                    for variable5...
                        ...
                                                    procedure()
                                                    ...

Another issue (which is actually python-specific) is performance. Copying of function parameters can become quite costly if there is a lot of them:

import time

var1 = 1
var2 = 2
var3 = 3
var4 = 4
var5 = 5
var6 = 6

def function(var1, var2, var3, var4, var5, var6):
    pass

def procedure():
    pass

starttime = time.time()
for i in range(10000000):
    function(var1, var2, var3, var4, var5, var6)
finishtime = time.time()
print('Classical function runtime: {:.3f} s'.format(finishtime - starttime))

starttime = time.time()
for i in range(10000000):
    procedure()
finishtime = time.time()
print('Procedure runtime: {:.3f} s'.format(finishtime - start time))

This outputs:

Classical function runtime: 2.447 s
Procedure runtime: 1.180 s

Therefore, my question, directed to experienced developers, is:

Is there something that can justify use of Pascal-like procedures over classical functions or they should be avoided at any cost, even if they lead to bulkier and slower code?

EDIT:

Using *args and **kwargs solves the issue only partially, as all the parameters still have to be listed during the function call. Also it does not address the performance issue, as parameters are still getting copied. itertools, as proposed in comments, is also not always applicable. In some cases, resolving the nesting is quite tricky (consider the code below) and would take a lot of developing time, probably leading to a very tangled code.

def my_function(**kwargs):
    with open(kwargs['file_a'], 'r') as input1:
        for line1 in input1:
            if kwargs['number'] % 3 == 0:
                if kwargs['number'] % 9 == 0:
                    with open(kwargs['file_0'], 'r') as input2:
                        for line2 in input2:
                            if line1.startswith('#'):
                                if line2.startswith('A'):
                                    with open('output.txt') as output1:
                                        for item in kwargs['items']:
                                            if item is in kwargs['good']:
                                                for sub in item:
                                                    if sub < 0:
                                                        result = process_vars(
                                                            kwargs['var1'],
                                                            kwargs['var2'],
                                                            kwargs['var3'],
                                                            kwargs['var4'],
                                                            kwargs['var5'],
                                                            kwargs['var6'],
                                                            kwargs['var7'],
                                                            kwargs['var8'],
                                                            kwargs['var9'],
                                                            kwargs['var10'])
                                                        output1.write(result)
                                                    elif sub >= 0 and sub < 1:
                                                        output1.write('hello')
                                                    else:
                                                        output1.write('byebye')
                                            elif len(item) > 20:
                                                item = item[: 20]
                                            else:
                                                output1.write(line2)
                                elif line2.startswith('B'):
                                    print('warning')
                                else:
                                    print('error')
                            elif line1.startswith('!'):
                                kwargs['wonders'].count += 1
                            else:
                                kwargs['threats'].append(line1)
                else:
                    kwargs['exceptions'].append(line1)
            elif kwargs['number'] % 3 == 1:
                with open(kwargs['file_1'], 'r') as input2:
                    ...
            elif kwargs['number'] % 3 == 2:
                with open(kwargs['file_2'], 'r') as input2:
                    ...
Community
  • 1
  • 1
Roman
  • 2,225
  • 5
  • 26
  • 55
  • 5
    You should **really** look into reducing the nesting levels here. Use `itertools.product()` for example to remove all those nested `for` loops. – Martijn Pieters Jul 08 '15 at 09:56
  • 2
    And within `(..)` parentheses you don't need backslashes. – Martijn Pieters Jul 08 '15 at 09:56
  • 3
    Rule of thumb: if you need to break line length rules because you are deeply nested: you are nesting too deeply, figure out a more efficient way to avoid nesting. – Martijn Pieters Jul 08 '15 at 09:57
  • 2
    Last but not least, use the `timeit()` module to time performance. And your `procedure()` vs. `function()` case is more likely caused by looking up so many globals than it is to use parameters in a function call. – Martijn Pieters Jul 08 '15 at 09:58
  • In general, your code needs broader refactoring. For example, rather than `(variable1, variable2, ..., variableN)` use `(*variables)`, or group related parameters into a container of some kind. Blaming this on styling rules is a red herring. – jonrsharpe Jul 08 '15 at 10:01
  • If you really have lots of parameters to pass, and that is a measurable overhead, consider passing a dictionary or a list. See also http://stackoverflow.com/questions/12590058/python-performance-with-global-variables-vs-local – cdarke Jul 08 '15 at 10:06
  • Also you can read variables from the global scope without having to use the `global` keyword, you just can't perform assigment to them. So you can avoid passing parameters if you use that intelligently. – SuperBiasedMan Jul 08 '15 at 10:10
  • See [Introduce Parameter Object](http://www.refactoring.com/catalog/introduceParameterObject.html) refactoring. – Peter Wood Jul 08 '15 at 10:20

1 Answers1

0
  1. Someone already mentioned this in comments: Try using itertools.product() if you have two or more nested for loops.

  2. If the above can't be done, atleast you can reduce the indent levels caused by if statements. Example:

    for abc in ..
        if (condition):
            for xyz in ...
    

    Can be changed to:

    for abc in ...
        if not (condition):
            continue
        for xyz in ...
    
  3. If there are many parameters to be passed to a method, try using *args (all parameters in a tuple) or **kwargs (all parameters in a dict).

EDIT: As per your specific requirement, I have edited the given code. This might not produce the exact same result as yours, but will give you a general idea how to implement the above 3 points.

def my_function(**kwargs):
    process_var_list = ['var1', 'var2', 'var3', 'var4', 'var5', 'var6',
                        'var7', 'var8', 'var9', 'var10']
    #Open all 3 files at one go. This will improve performance and reduce nested loops
    with open(kwargs['file_a'], 'r') as input1, open(kwargs['file_0'], 'r') as input2, open('output.txt') as output1:
        #following point 1
        for line1, line2, item in itertools.product(input1, input2, output1):
            if kwargs['number'] % 9 == 0:
                #following point 2
                if line1[0] not in ["#", "!"]:
                    kwargs['threats'].append(line1)
                    continue
                elif line1.startswith('!'):
                    kwargs['wonders'].count += 1
                    continue
                if line2[0] not in ['A', 'B']:
                    print('error')
                    continue
                elif line2.startswith('B'):
                    print('warning')
                    continue
                if item in kwargs['good']:
                    for sub in item:
                        if sub < 0:
                            var_args = {var: kwargs[var] for var in process_var_list}
                            #following point 3
                            result = process_vars(**var_args)
                            output1.write(result)
                        elif 1 < sub >= 0:
                            output1.write('hello')
                        else:
                            output1.write('byebye')
                elif len(item) > 20:
                    item = item[: 20]
                else:
                    output1.write(line2)
            elif kwargs['number'] % 3 == 0:
                kwargs['exceptions'].append(line1)
            elif kwargs['number'] % 3 == 1:
                with open(kwargs['file_1'], 'r') as input2:
                    pass
            elif kwargs['number'] % 3 == 2:
                with open(kwargs['file_2'], 'r') as input2:
                    pass
Sudipta
  • 4,773
  • 2
  • 27
  • 42
  • Could you expand your answer with taking into account my edit? – Roman Jul 08 '15 at 13:01
  • the code yields completely different results. The point is that I cannot apply `itertools.product()`, because I do not know in advance, through which object or file (if any) I will iterate in the loop #2, until I check all the conditions, which in turn is dependent on items generated in loop #1. Also I cannot use `var: kwargs[var]`, as there are variables in `kwargs`, which must not be passed into the function. Therefore, you just reduced the nesting by simplifying the task itself. – Roman Jul 08 '15 at 14:49
  • @Roman Yes that's why I had put the disclaimer that it might not produce same results. It was to give a general idea as to how you can apply the above 3 points. Probably you can try re-factoring the code and use separate functions instead of writing the whole thing in one method. – Sudipta Jul 09 '15 at 07:22
  • Rewriting here is really tricky and time-consuming, because the logic is very branched and after the changes everything must be debugged anew. That's why I was considering either diverting from the standard coding style (reducing indent and having longer lines) or making more separate functions. In later case I was thinking about 'procedures' that work directly with variables from higher scope, as this makes the script shorter, faster, and consuming less memory. However, I am in doubt, as many programmers call this a bad practice. That's what my question about. – Roman Jul 09 '15 at 11:25