25

When writing unit tests, I sometimes cut and paste a test and don't remember to change the method name. This results in overwriting the previous test, effectively hiding it and preventing it from running. For example;

class WidgetTestCase(unittest.TestCase):

  def test_foo_should_do_some_behavior(self):
    self.assertEquals(42, self.widget.foo())

  def test_foo_should_do_some_behavior(self):
    self.widget.bar()
    self.assertEquals(314, self.widget.foo())

In this case, only the latter test would get called. Is there a way of programmatically catching this sort of error, short of parsing the raw source code directly?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Scotty Allen
  • 12,897
  • 9
  • 38
  • 51
  • 1
    Copying and pasting is generally a sign of bad coding practice - most of the time it means you should be extracting functionality out into another function. It'll save yourself time, effort if you want to change it later, and problems like these. – Gareth Latty May 25 '12 at 22:21
  • 14
    Lattyware: In general I agree. However, this is unit test code, in which it's best practice and expected that you have a bunch of short methods, most of which look quite similar except for different setup conditions and assertions. So I would say that cut and paste is par for the course, and not representative of bad practice. – Scotty Allen May 25 '12 at 22:25
  • @Scotty: Why don't you just get into the habit of renaming a method as your first action after copying it? By the way, if setup code and assertions differ for each method, then the only thing you are really copying is the method's name, which you might have noticed. – Niklas B. May 25 '12 at 22:25
  • 3
    @ScottyAllen Why are tests any different to your other code? If they share a lot of functionality, then extract out common code. If they don't, then don't copy and paste other functions to make them. – Gareth Latty May 25 '12 at 22:27
  • Niklas: I am in that habit, but occasionally forget in the heat of battle:) This is one of those errors that isn't frequent, but is particularly evil when it does occur. – Scotty Allen May 25 '12 at 22:27
  • `grep -c` the test file for `'def .*(self.*):'` and compare to the number of tests being run :) You can even add it to the `test.py` itself, but there is a chance of falling into a diabolic loop. – Lev Levitsky May 25 '12 at 22:28
  • @Scotty: Then I'd suggest only ever copying the body, never the name and parameter list. As a second step, you can then realize that if you really have so much duplication that copying method bodies is actually helpful, that you should restructure your code. – Niklas B. May 25 '12 at 22:28
  • 1
    Lattyware: How would you refactor the above example to avoid cut and paste? In my experience, it's representative of unit test code I've written and encountered in practice. – Scotty Allen May 25 '12 at 22:30
  • 2
    @Scotty: I suspect Darthfett's answer about pylint (and similar tools) is the most practical answer. However on http://stackoverflow.com/questions/10762088/python-metaclasses-why-isnt-setattr-called-for-attributes-set-during-class (which links here) I posted a (giant) answer that includes how you can detect this error if you're using Python3; it's over there rather than here because you're probably not using Python3 and it's more relevant to the guts of why `__setattr__` doesn't help here. – Ben May 26 '12 at 03:03

5 Answers5

28

If you run Pylint over your code, it will inform you when you have overwritten another method:

For example, I ran this:

class A(object):
    def blah(self):
        print("Hello, World!")

    def blah(self):
        print("I give up!")

In this online Pylint checker. Besides all the missing docstrings and such, I get this:

E: 5:A.blah: method already defined line 2

Alternatively, via the command line:

python -m pyflakes .

Output:

.\blah.py:5:5 redefinition of unused 'blah' from line 2
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Casey Kuball
  • 7,717
  • 5
  • 38
  • 70
  • 4
    Or, if you want a linter that isn't quite as annoying, there's pyflakes: https://launchpad.net/pyflakes – habnabit May 26 '12 at 00:22
15

What follows is a horrible hack that uses undocumented, implementation-specific Python features. You should never ever ever do anything like this.

It's been tested on Python 2.6.1 and 2.7.2; doesn't seem to work with Python 3.2 as written, but then, you can do this right in Python 3.x anyway.

import sys

class NoDupNames(object):

    def __init__(self):
        self.namespaces = []

    def __call__(self, frame, event, arg):
        if event == "call":
            if frame.f_code.co_flags == 66:
                self.namespaces.append({})
        elif event in ("line", "return") and self.namespaces:
            for key in frame.f_locals.iterkeys():
                if key in self.namespaces[-1]:
                    raise NameError("attribute '%s' already declared" % key)
            self.namespaces[-1].update(frame.f_locals)
            frame.f_locals.clear()
            if event == "return":
                frame.f_locals.update(self.namespaces.pop())
        return self

    def __enter__(self):
        self.oldtrace = sys.gettrace()
        sys.settrace(self)

    def __exit__(self, type, value, traceback):
        sys.settrace(self.oldtrace)

Usage:

with NoDupNames():
    class Foo(object):
        num = None
        num = 42

Result:

NameError: attribute 'num' already declared

How it works: We hook up to the system trace hook. Each time Python is about to execute a line, we get called. This allows us to see what names were defined by the last statement executed. To make sure we can catch duplicates, we actually maintain our own local variable dictionary and clear out Python's after each line. At the end of the class definition, we copy our locals back into Python's. Some of the other tomfoolery is in there to handle nested class definitions and to handle multiple assignments in a single statement.

As a downside, our "clear ALL the locals!" approach means you can't do this:

with NoDupNames():
    class Foo(object):
        a = 6
        b = 7
        c = a * b

Because as far as Python knows, there are no names a and b when c = a * b is executed; we cleared those as soon as we saw 'em. Also, if you assign the same variable twice in a single line (e.g., a = 0; a = 1) it won't catch that. However, it works for more typical class definitions.

Also, you should not put anything besides class definitions inside a NoDupNames context. I don't know what will happen; maybe nothing bad. But I haven't tried it, so in theory the universe could be sucked into its own plughole.

This is quite possibly the most evil code I have ever written, but it sure was fun!

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
kindall
  • 178,883
  • 35
  • 278
  • 309
6

Here is one option for how to detect this at runtime using decorators without the need for any analysis tool:

def one_def_only():
  names = set()
  def assert_first_def(func):
    assert func.__name__ not in names, func.__name__ + ' defined twice'
    names.add(func.__name__)
    return func
  return assert_first_def

class WidgetTestCase(unittest.TestCase):
  assert_first_def = one_def_only()

  @assert_first_def
  def test_foo_should_do_some_behavior(self):
    self.assertEquals(42, self.widget.foo())

  @assert_first_def
  def test_foo_should_do_some_behavior(self):
    self.widget.bar()
    self.assertEquals(314, self.widget.foo())

Example of an attempt to import or run:

>>> import testcases
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "testcases.py", line 13, in <module>
    class WidgetTestCase(unittest.TestCase):
  File "testcases.py", line 20, in WidgetTestCase
    @assert_first_def
  File "testcases.py", line 7, in assert_first_def
    assert func.__name__ not in names, func.__name__ + ' defined twice'
AssertionError: test_foo_should_do_some_behavior defined twice
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Andrew Clark
  • 202,379
  • 35
  • 273
  • 306
4

You cannot easily/cleanly detect it during runtime since the old method is simply replaced and a decorator would have to be used on every function definition. Static analysis (Pylint, etc.) is the best way to do it.

I just tested it and __setattr__ of the metaclass is not called for stuff defined in the class block.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
ThiefMaster
  • 310,957
  • 84
  • 592
  • 636
  • 1
    Sad that `__setattr__` isn't a viable solution - it sounded like a clever approach:) – Scotty Allen May 25 '12 at 22:32
  • 2
    Just a quick clarification for future readers, it's not that the metaclass `__setattr__` isn't called for methods, but that it's only called when setting attributes on the class **object**, which doesn't exist yet when names defined in the class block are being bound. More detail in ThiefMaster's followup question. – Ben May 27 '12 at 02:00
1
Solution via CI/CD

If you've got a build (e.g., Jenkins CI/CD) that runs tests once a pull request is raised you could add something like pylint --fail-under=7 --fail-on=E0102 paths_of_files_changed

This means that if Function defined alreadyE0102 or the quality of the code is less than 7 return a non-zero exit code.

Which can then be used to fail the build.

Client-side solution via a pre-commit

Alternatively, you may be interested in integrating this with a git pre commit hook via which will allow you to execute specific commands when you commit and if they fail, then it won't allow you to commit.

In the root of your checkout create a file called .pre-commit-config.yaml with the following:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/PyCQA/pylint
    rev: v2.9.6
    hooks:
    -   id: pylint
        args: [--fail-under=7, --fail-on=E0102]
Install pre-commit

(The preferred way is to) use pipx or do this outside a virtualenv since it's a CLI application.

python3 -m pip install pre-commit
Install the hooks
pre-commit install # You only do this once per "git clone"

You cannot commit until all functions are defined only once. Both these solutions will stop any any method being defined multiple times get committed into the codebase.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Mark
  • 1,337
  • 23
  • 34
  • `--fail-under` is required. If it's not specified, pylint will have this set to `--fail-under=10` by default and therefore without specifying a minimum value it will always return a non-zero exit code and then you will never be able to commit until you have zero issues with your code flagged by pylint. – Mark Aug 28 '21 at 14:34
  • To only find this issue : `pylint --disable=all --enable=E0102 ` – Thomasleveil Sep 14 '22 at 16:04