I have a Python project in which I read external files, process them, and write the results to a new file. The input files can either be read directly, or extracted from a git repository using git show. The function to call git show and return stdout looks like this:

import subprocess


def git_show(fname, rev):
    '''Runs git show and returns stdout'''
    process = subprocess.Popen(['git', 'show', '{}:{}'.format(rev, fname)],
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # stdout and stderr are bytes on Python 3 and str on Python 2
    stdout, stderr = process.communicate()
    ret_code = process.wait()
    if ret_code:
        raise Exception(stderr)
    return stdout

I have unit tests which cover the whole processing part of the program, i.e., everything apart from reading and writing the files. However, I have stumbled upon (and fixed) issues with the encoding of the string returned from git_show(), which depends on the Python version, and quite possibly on the OS and the actual file being read.

I would like to set up a unit test for git_show() so I can make sure the whole application works, from input to output. However, as far as I know, this is not possible without having an actual git repository to test on. The whole package is version-controlled with git, and I expect that nesting a git repository inside another git repository might lead to problems of its own; a voice in my head also tells me that it might not be the best solution anyway.

How can one best unit test code which gets its input from git show (and, more generally, from the command line / Popen.communicate())?

cmeeren

2 Answers

The way I do this is with pytest.

Example: (contrived)

from subprocess import Popen, PIPE


def test():
    p = Popen(["echo", "Hello World!"], stdout=PIPE)
    stdout, _ = p.communicate()

    assert stdout == b"Hello World!\n"

Output:

$ py.test -x -s test_subprocess.py 
======================================= test session starts ========================================
platform linux2 -- Python 2.7.9 -- py-1.4.28 -- pytest-2.7.1
rootdir: /home/prologic/work/circuits, inifile: 
plugins: cov
collected 1 items 

test_subprocess.py .

===================================== 1 passed in 0.01 seconds =====================================

Or using the standard library unittest:

Example:

#!/usr/bin/env python


from unittest import main, TestCase


from subprocess import Popen, PIPE


class TestProcess(TestCase):

    def test(self):
        p = Popen(["echo", "Hello World!"], stdout=PIPE)
        stdout, _ = p.communicate()

        self.assertEqual(stdout, b"Hello World!\n")


if __name__ == "__main__":
    main()

Output:

$ python test_subprocess.py 
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
James Mills
  • 1. Does this require ``py.test`` or is ``unittest`` sufficient? 2. You're boomeranging a string originating from inside the python file. Are there important encoding implications arising from this, or will ``stdout.read()`` in your example definitely return the exact same string/encoding as if you're calling ``git show`` on a file with the same string, no matter how the file is saved? – cmeeren May 29 '15 at 11:20
  • No; pytest is not required; you *can* use unittest. You will get bytes in Python 3.x and str in Python 2.x (*equivalent to bytes*); ``stdout`` will contain the output of ``git show`` – James Mills May 29 '15 at 11:22
  • Thanks, will test it and see. Is this true for Win/Mac/Linux, and should ``echo`` work on all three? – cmeeren May 29 '15 at 11:25
  • I can't *vouch* for Windows; but Linux and Mac sure. – James Mills May 29 '15 at 11:26
  • When I try this (with ``shell=True`` which is the only way to access ``echo`` on Windows), ``stdout`` only contains the first line of the input string. Is there any way to make this work with multiline strings? – cmeeren May 29 '15 at 12:01
  • ``echo -e`` to interpret ``\n`` chars in the string? -- Sorry, I have no idea wrt Windows systems – James Mills May 29 '15 at 12:06

Perhaps you want one of, or a combination of, several different kinds of tests.

Unit tests

Test a small part of your code, within your code.

  1. mock out subprocess.Popen
  2. return static values in stdout, stderr
  3. check that processing is correct

The sample code is pretty small; you can only test that stdout is really returned and that an exception is raised when wait() returns a non-zero value.
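The unit test steps above can be sketched as follows. This is a minimal sketch, not a definitive implementation: git_show() is copied from the question, fake_popen() and the test names are hypothetical, and it assumes Python 3.3+ where unittest.mock is in the standard library (on Python 2 the separate mock package would be needed instead).

```python
import subprocess
import unittest
from unittest import mock


def git_show(fname, rev):
    '''Runs git show and returns stdout (copied from the question).'''
    process = subprocess.Popen(['git', 'show', '{}:{}'.format(rev, fname)],
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    ret_code = process.wait()
    if ret_code:
        raise Exception(stderr)
    return stdout


def fake_popen(stdout, stderr, ret_code):
    '''Hypothetical helper: builds a stand-in for the subprocess.Popen class.'''
    popen_cls = mock.Mock()
    process = popen_cls.return_value  # what Popen(...) will return
    process.communicate.return_value = (stdout, stderr)
    process.wait.return_value = ret_code
    return popen_cls


class TestGitShow(unittest.TestCase):

    def test_returns_stdout_unchanged(self):
        data = b'caf\xc3\xa9\n'  # static, UTF-8 encoded stdout
        with mock.patch('subprocess.Popen', fake_popen(data, b'', 0)) as popen:
            self.assertEqual(git_show('a.txt', 'HEAD'), data)
        # The first positional argument is the command line that was built.
        self.assertEqual(popen.call_args[0][0], ['git', 'show', 'HEAD:a.txt'])

    def test_raises_on_non_zero_exit(self):
        failing = fake_popen(b'', b'fatal: bad revision', 1)
        with mock.patch('subprocess.Popen', failing):
            with self.assertRaises(Exception):
                git_show('a.txt', 'no-such-rev')
```

No git binary or repository is touched here, so the tests only pin down how git_show() handles whatever Popen hands back. They can be run with python -m unittest.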

Something in between

Test vectors: that is, for a given set input, a set output should be produced.

  1. mock out git; instead, use cat vector1.txt, with the file encoded in a specific way
  2. test result
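A test-vector run along these lines might look like the sketch below. It makes a few assumptions: run_command() is a hypothetical generalisation of git_show() that accepts any command, the vectors are made-up samples, and a small Python child process stands in for cat so that the same test also works where cat or echo is not available (for example on Windows). It targets Python 3, where sys.stdout.buffer exists.

```python
import os
import subprocess
import sys
import tempfile


def run_command(cmd):
    '''Hypothetical generalisation of git_show(): runs cmd, returns stdout.'''
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    if process.returncode:
        raise Exception(stderr)
    return stdout


# Portable stand-in for `cat`: copies the named file to stdout byte for byte.
CAT = [sys.executable, '-c',
       'import sys; sys.stdout.buffer.write(open(sys.argv[1], "rb").read())']

# Made-up test vectors: (text, encoding the file is saved in).
VECTORS = [
    (u'first line\nsecond line\n', 'ascii'),
    (u'na\u00efve caf\u00e9\n', 'utf-8'),
    (u'na\u00efve caf\u00e9\n', 'latin-1'),
    (u'r\u00e9sum\u00e9\r\n', 'utf-16'),
]


def read_vector(text, encoding):
    '''Saves text in the given encoding and reads it back via a subprocess.'''
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(text.encode(encoding))
        return run_command(CAT + [path])
    finally:
        os.remove(path)


def test_vectors():
    for text, encoding in VECTORS:
        raw = read_vector(text, encoding)
        # The pipe must deliver exactly the bytes that were saved ...
        assert raw == text.encode(encoding)
        # ... and decoding with the known encoding must restore the text.
        assert raw.decode(encoding) == text
```

Because the vectors include multi-line and non-ASCII text in several encodings, this kind of test exercises the decoding step that follows git_show() without needing git itself.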

Integration tests

Test how your code connects to external entities, in this case git. Such tests protect you from accidentally changing what your code expects of the external system. That is, they "freeze" the API.

  1. create a tarball with a small git repository
  2. optionally pack git binary into same tarball
  3. unpack the tarball
  4. run git command
  5. compare output to expected
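As a variation on the steps above, the sketch below builds a throwaway repository on the fly in a temporary directory instead of unpacking a prepared tarball; it is a swapped-in technique, not the tarball approach itself. The git() helper, the file name and the committer identity are hypothetical, and a git binary is assumed to be on the PATH. Since the temporary directory lives outside the project, no repository ends up nested inside the project's own repository.

```python
import os
import shutil
import subprocess
import tempfile


def git(args, cwd):
    '''Hypothetical helper: runs git with args inside cwd, returns stdout.'''
    cmd = ['git',
           '-c', 'user.name=Test', '-c', 'user.email=test@example.com',
           '-c', 'commit.gpgsign=false', '-c', 'core.autocrlf=false'] + args
    process = subprocess.Popen(cmd, cwd=cwd, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    if process.returncode:
        raise Exception(stderr)
    return stdout


def show_from_scratch_repo(fname, content):
    '''Commits content as fname in a fresh repository, then runs git show.'''
    repo = tempfile.mkdtemp()
    try:
        git(['init', '--quiet'], cwd=repo)
        with open(os.path.join(repo, fname), 'wb') as f:
            f.write(content)
        git(['add', fname], cwd=repo)
        git(['commit', '--quiet', '-m', 'add test vector'], cwd=repo)
        return git(['show', 'HEAD:' + fname], cwd=repo)
    finally:
        # Clean up even if a git command failed.
        shutil.rmtree(repo, ignore_errors=True)


def test_git_show_round_trip():
    data = b'caf\xc3\xa9\nsecond line\n'
    assert show_from_scratch_repo('vector1.txt', data) == data
```

Building the repository in the test keeps binary fixtures out of version control, at the cost of depending on the locally installed git; a prepared tarball, as in the list above, gives a more tightly frozen input.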
Dima Tisnek