75

I need to test a function that needs to query a page on an external server using urllib.urlopen (it also uses urllib.urlencode). The server could be down, the page could change; I can't rely on it for a test.

What is the best way to control what urllib.urlopen returns?

Dinoboff
  • 1
    Not quite the same, but I guess most people use `requests`: [How can I mock requests and the response?](https://stackoverflow.com/q/15753390/562769) – Martin Thoma Jul 05 '20 at 14:11

8 Answers

101

Another simple approach is to have your test override urllib's urlopen() function. For example, if your module has

import urllib

def some_function_that_uses_urllib():
    ...
    urllib.urlopen()
    ...

You could define your test like this:

import mymodule

def dummy_urlopen(url):
    ...

mymodule.urllib.urlopen = dummy_urlopen

Then, when your tests invoke functions in mymodule, dummy_urlopen() will be called instead of the real urlopen(). Dynamic languages like Python make it super easy to stub out methods and classes for testing.
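As a minimal sketch of the same idea on Python 3 (assuming `urlopen` is reached through `urllib.request`, its Python 3 location), the test below installs the stub and restores the original afterwards. `fetch_page` and `FakeResponse` are hypothetical names made up for illustration; they are not part of the original answer.

```python
import urllib.request


def fetch_page(url):
    # Hypothetical function under test: looks up urlopen at call time.
    return urllib.request.urlopen(url).read()


class FakeResponse:
    # Minimal stand-in exposing only the read() method the code uses.
    def __init__(self, body):
        self._body = body

    def read(self):
        return self._body


def dummy_urlopen(url, *args, **kwargs):
    # Returns canned bytes instead of touching the network.
    return FakeResponse(b"canned page for " + url.encode("ascii"))


def test_fetch_page():
    original = urllib.request.urlopen
    urllib.request.urlopen = dummy_urlopen  # install the stub
    try:
        expected = b"canned page for http://example.com/"
        assert fetch_page("http://example.com/") == expected
    finally:
        urllib.request.urlopen = original  # always restore the real function
```

The `try`/`finally` keeps the stub from leaking into other tests that run in the same process.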

See my blog posts at http://softwarecorner.wordpress.com/ for more information about stubbing out dependencies for tests.

user
Clint Miller
  • 11
    Monkeypatches for testing are a handy thing. Indeed, this is probably the canonical "good monkeypatch" example. – S.Lott Nov 17 '08 at 13:56
  • http://visionandexecution.org seems to be down. Is there another link, or is this gone now? – Mu Mind Jan 26 '11 at 19:20
  • 1
    I haven't posted to the blog in a really long time, but I did port it to http://softwarecorner.wordpress.com/ – Clint Miller Jan 26 '11 at 19:49
  • 15
    Beware! This would mock it out for all uses of urlopen in your test module and in other classes in your module unless you explicitly reset the mocked object back to its original value. Of course, in this case I am not sure why anyone would want to make network calls in unit tests. I would recommend using something like 'with patch ...' or @patch(), which gives you more explicit control over what you are mocking and up to what limits. – Keshi Nov 15 '13 at 04:16
  • Wow, that's the best way to mock anything. I do the same in JavaScript; good to see the same in Python. Thanks. – Triguna Aug 23 '21 at 17:42
73

I am using Mock's patch decorator:

from mock import patch

[...]

@patch('urllib.urlopen')
def test_foo(self, urlopen_mock):
    urlopen_mock.return_value = MyUrlOpenMock()
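For reference, a self-contained sketch of the same decorator on Python 3, where `patch` ships in the standard library as `unittest.mock.patch` and `urlopen` lives in `urllib.request`. The function `fetch_page` is a hypothetical stand-in for the code under test.

```python
import unittest
import urllib.request
from unittest.mock import patch


def fetch_page(url):
    # Hypothetical function under test.
    return urllib.request.urlopen(url).read()


class FetchPageTest(unittest.TestCase):
    @patch("urllib.request.urlopen")
    def test_fetch_page(self, urlopen_mock):
        # The mock's return value plays the role of the response object.
        urlopen_mock.return_value.read.return_value = b"fake body"
        self.assertEqual(fetch_page("http://example.com/"), b"fake body")
        urlopen_mock.assert_called_once_with("http://example.com/")
```

This works because `fetch_page` looks up `urllib.request.urlopen` at call time. If the module under test did `from urllib.request import urlopen`, the patch target would need to be the name inside that module instead.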
Dinoboff
  • 5
    too bad it does not work when patching module functions :/ (at least not 0.7.2) – Tommaso Barbugli Apr 24 '12 at 13:27
  • 3
    not 100% true, if you import the function before patching it works, otherwise the patching fails silently (no errors, just nothing gets patched :/) – Tommaso Barbugli Apr 24 '12 at 13:44
  • 2
    Good point there; patching should throw errors when it's failed to find the relevant module rather than just failing silently. – fatuhoku Sep 15 '13 at 09:57
  • 3
    It gives me an error: fixture 'urlopen_mock' not found – Pratik Khadloya May 09 '14 at 02:05
  • If you patch urllib.urlopen directly, any references to it that have already been imported by a module will remain unpatched. To avoid that, patch the imported reference instead. ex: patch('mymodule.urlopen') – lfagundes Oct 04 '21 at 09:18
27

Did you give Mox a look? It should do everything you need. Here is a simple interactive session illustrating the solution you need:

>>> import urllib
>>> # check that it works
>>> urllib.urlopen('http://www.google.com/')
<addinfourl at 3082723820L ...>
>>> # check what happens when it doesn't
>>> urllib.urlopen('http://hopefully.doesnotexist.com/')
#-- snip --
IOError: [Errno socket error] (-2, 'Name or service not known')

>>> # OK, let's mock it up
>>> import mox
>>> m = mox.Mox()
>>> m.StubOutWithMock(urllib, 'urlopen')
>>> # We can be verbose if we want to :)
>>> urllib.urlopen(mox.IgnoreArg()).AndRaise(
...   IOError('socket error', (-2, 'Name or service not known')))

>>> # Let's check if it works
>>> m.ReplayAll()
>>> urllib.urlopen('http://www.google.com/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/mox.py", line 568, in __call__
    raise expected_method._exception
IOError: [Errno socket error] (-2, 'Name or service not known')

>>> # yay! now unset everything
>>> m.UnsetStubs()
>>> m.VerifyAll()
>>> # and check that it still works
>>> urllib.urlopen('http://www.google.com/')
<addinfourl at 3076773548L ...>
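A rough equivalent of the session above using the standard library's `unittest.mock` instead of Mox, assuming Python 3 (where `urlopen` lives in `urllib.request` and `IOError` is an alias of `OSError`). This is a swapped-in technique, not the Mox API.

```python
import urllib.request
from unittest.mock import patch

# Inside the with block, urlopen is a mock that raises on every call.
with patch("urllib.request.urlopen") as urlopen_mock:
    urlopen_mock.side_effect = IOError("socket error")
    try:
        urllib.request.urlopen("http://www.google.com/")
    except IOError as exc:
        caught = exc
    else:
        caught = None

# Leaving the with block restores the real urlopen automatically,
# which takes the place of the explicit UnsetStubs() call.
```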
Romuald Brunet
Damir Zekić
  • quoting https://pypi.org/project/mox/: "New uses of this library are discouraged. People are encouraged to use https://pypi.python.org/pypi/mock instead which matches the unittest.mock library available in Python 3." – Florian Dec 10 '19 at 14:02
15

HTTPretty works exactly the way FakeWeb does. It operates at the socket layer, so it should intercept requests from any Python HTTP client library. It is battle-tested against urllib2, httplib2 and requests.

import urllib2
from httpretty import HTTPretty, httprettified


@httprettified
def test_one():
    HTTPretty.register_uri(HTTPretty.GET, "http://yipit.com/",
                           body="Find the best daily deals")

    fd = urllib2.urlopen('http://yipit.com')
    got = fd.read()
    fd.close()

    assert got == "Find the best daily deals"
edwardmp
Gabriel Falcão
  • In 2013, this is definitively the best answer. Let's vote Falcão's awesome library up, guys! – fatuhoku Sep 15 '13 at 10:00
  • Coming from an Obj-C angle, I was looking for something like [OHHTTPStubs](https://github.com/AliSoftware/OHHTTPStubs) for Python. I'm delighted to find HTTPretty. – fatuhoku Sep 15 '13 at 10:01
9

In case you don't want to even load the module:

import sys,types
class MockCallable():
  """ Mocks a function, can be enquired on how many calls it received """
  def __init__(self, result):
    self.result  = result
    self._calls  = []

  def __call__(self, *arguments):
    """Mock callable"""
    self._calls.append(arguments)
    return self.result

  def called(self):
    """Return the argument tuples recorded for each call"""
    return self._calls

class StubModule(types.ModuleType, object):
  """ Uses a stub instead of loading libraries """

  def __init__(self, moduleName):
    self.__name__ = moduleName
    sys.modules[moduleName] = self

  def __repr__(self):
    name  = self.__name__
    mocks = ', '.join(set(dir(self)) - set(['__name__']))
    return "<StubModule: %(name)s; mocks: %(mocks)s>" % locals()

class StubObject(object):
  pass

And then:

>>> urllib = StubModule("urllib")
>>> import urllib # won't actually load urllib

>>> urllib.urlopen = MockCallable(StubObject())

>>> example = urllib.urlopen('http://example.com')
>>> example.read = MockCallable('foo')

>>> print(example.read())
foo
ilpoldo
  • Close, but the import function won't actually import stuff. So a caller using from urllib import * ... won't get the functions they need – Erik Aronesty Dec 28 '16 at 21:16
8

Probably the best way to handle this is to split up the code, so that the logic that processes the page contents is separate from the code that fetches the page.

Then pass an instance of the fetcher into the processing logic. In the unit test you can easily replace it with a mock fetcher.

e.g.

import urllib

class Processor(object):
    def __init__(self, fetcher):
        self.m_fetcher = fetcher

    def doProcessing(self):
        ## use self.m_fetcher to get page contents
        pass

class RealFetcher(object):
    def fetchPage(self, url):
        ## get real contents
        return urllib.urlopen(url).read()

class FakeFetcher(object):
    def fetchPage(self, url):
        ## Return whatever fake contents are required for this test
        return "fake page contents"
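To show how the fake fetcher would be used in a test, here is a small self-contained sketch. The word-counting step, the method names `count_words` and `fetch_page`, and the `requested` list are all made up for illustration; only the injection pattern comes from the answer.

```python
class Processor:
    def __init__(self, fetcher):
        self.fetcher = fetcher

    def count_words(self, url):
        # Hypothetical processing step: count whitespace-separated words.
        return len(self.fetcher.fetch_page(url).split())


class FakeFetcher:
    def __init__(self, contents):
        self.contents = contents
        self.requested = []  # records which URLs were asked for

    def fetch_page(self, url):
        self.requested.append(url)
        return self.contents


def test_count_words():
    fetcher = FakeFetcher("three short words")
    assert Processor(fetcher).count_words("http://example.com/") == 3
    assert fetcher.requested == ["http://example.com/"]
```

Nothing is patched here, so there is no global state to restore after the test.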
Douglas Leeder
3

The simplest way is to change your function so that it doesn't necessarily use urllib.urlopen. Let's say this is your original function:

def my_grabber(arg1, arg2, arg3):
    # .. do some stuff ..
    url = make_url_somehow()
    data = urllib.urlopen(url)
    # .. do something with data ..
    return answer

Add an argument which is the function to use to open the URL. Then you can provide a mock function to do whatever you need:

def my_grabber(arg1, arg2, arg3, urlopen=urllib.urlopen):
    # .. do some stuff ..
    url = make_url_somehow()
    data = urlopen(url)
    # .. do something with data ..
    return answer

def test_my_grabber():
    my_grabber(arg1, arg2, arg3, urlopen=my_mock_open)
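A runnable sketch of the same parameterization, assuming Python 3 (`urllib.request.urlopen`). The upper-casing step and the single `url` argument are placeholders for whatever the real function does; `io.BytesIO` is used as the mock response simply because it already has a `read()` method.

```python
import io
import urllib.request


def my_grabber(url, urlopen=urllib.request.urlopen):
    # The opener is a parameter that defaults to the real urlopen.
    data = urlopen(url).read()
    return data.decode("utf-8").upper()


def my_mock_open(url):
    # Returns an in-memory file-like object instead of opening a socket.
    return io.BytesIO(b"hello from " + url.encode("utf-8"))


def test_my_grabber():
    result = my_grabber("http://example.com/", urlopen=my_mock_open)
    assert result == "HELLO FROM HTTP://EXAMPLE.COM/"
```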
Ned Batchelder
  • 4
    Not sure that I like having the fixture under test aware of configuration details... However, this does work. – S.Lott Nov 17 '08 at 13:35
  • 1
    I don't see anything wrong with parameterizing the function. There's no knowledge here of how urlopen might be faked or why, just that it might happen. – Ned Batchelder Nov 17 '08 at 15:50
0

Adding onto Clint Miller's answer, to do this I had to create a fake class that implements a read method like this:

class FakeURL:
    def read(self):
        return '{"some":"json_text"}'

Then to stub out urllib2.urlopen:

import urllib2

# Stub out urllib2.urlopen; the signature mirrors urlopen(url, data, timeout).
def dummy_urlopen(url, data=None, timeout=None):
    return FakeURL()
urllib2.urlopen = dummy_urlopen
Alex Harvey