I'm just getting started with using Python's mock library to help write more concise and isolated unit tests. My situation is that I've got a class that reads in data from a pretty hairy format, and I want to test a method on this class which presents the data in a clean format.
class holds_data(object):
def __init__(self, path):
"""Pulls complicated data from a file, given by 'path'.
Stores it in a dictionary.
"""
self.data = {}
with open(path) as f:
self.data.update(_parse(f))
def _parse(self, file):
# Some hairy parsing code here
pass
def x_coords(self):
"""The x coordinates from one part of the data
"""
return [point[0] for point in self.data['points']]
The code above is a simplification of what I have. In reality _parse
is a fairly significant method which I have test coverage for at the functional level.
I'd like to be able, however, to test x_coords
at a unit test level. If I were to instantiate this class by giving it a path, it would violate the rules of unit tests because:
A test is not a unit test if:
- It touches the filesystem
So, I'd like to be able to patch the __init__
method for holds_data
and then just fill in the part of self.data
needed by x_coords
. Something like:
from mock import patch
with patch('__main__.holds_data.__init__') as init_mock:
init_mock.return_value = None
instance = holds_data()
instance.data = {'points':[(1,1),(2,2),(3,4)]}
assert(instance.x_coords == [1,2,3])
The code above works but it feels like it's going about this test in a rather roundabout way. Is there a more idiomatic way to patch out a constructor or is this the correct way to go about doing it? Also, is there some code smell, either in my class or test that I'm missing?
Edit: To be clear, my problem is that during initialization, my class does significant amounts of data processing to organize the data that will be presented by a method like x_coords
. I want to know what is the easiest way to patch all of those steps out, without having to provide a complete example of the input. I want to only test the behavior of x_coords
in a situation where I control the data it uses.
My question of whether or not there is code smell here boils down to this issue:
I'm sure this would be easier if I refactor to have x_coords
be a stand alone function that takes holds_data
as a parameter. If "easier to tests == better design" holds, this would be the way to go. However, it would require the x_coords
function to know more about the internals of holds_data
that I would normally be comfortable with. Where should I make the trade off? Cleaner code or cleaner tests?