55

I've written a specialized HTML parser that I want to unit test with a couple of sample webpages I've downloaded.

In Java, I've used class resources to load data into unit tests without having to rely on the files being at a particular path on the file system. Is there a way to do this in Python?

I found the doctest.testfile() function, but that appears to be specific to doctests. I'd like to just get a file handle to a particular HTML file, located relative to the current module.

Thanks in advance for any suggestions!

cberner

4 Answers

87

To load data from a file in a unit test when the test data is in the same directory as the tests, one solution is:

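import os
import unittest

# Build the path relative to this test module, not the current working directory.
TESTDATA_FILENAME = os.path.join(os.path.dirname(__file__), 'testdata.html')


class MyTest(unittest.TestCase):

    def setUp(self):
        # Read the whole sample page before each test.
        self.testdata = open(TESTDATA_FILENAME).read()

    def test_something(self):
        ...  # test body elided in the original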
Anto
Ferran
    Using the 'new' python [pathlib](https://docs.python.org/3.8/library/pathlib.html#module-pathlib) this command becomes: `TEST_FILE = pathlib.Path(__file__).parent.joinpath("testdata.html")` and then `TEST_FILE.open()`. Remember to `import pathlib` and please note the suggestion in the answer from @LeilaHC to properly close the testfile again, e.g. in the tearDown method. – Kim May 06 '20 at 15:32
    [`pkg_resources`](https://setuptools.readthedocs.io/en/latest/pkg_resources.html) is a tool that comes with `setuptools`. If a package is installed, even with just `pip install -e` or `./setup.py develop`, then `pkg_resources` can find resource files just by knowing `__name__`. – kojiro Nov 07 '20 at 01:46
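The pathlib variant mentioned in the comment above could be sketched roughly as follows. The helper names `sibling_path` and `read_testdata` are made up for illustration, and the file is assumed to be UTF-8 text; `Path.read_text()` opens and closes the file itself, so no explicit cleanup is needed.

```python
import pathlib


def sibling_path(module_file, name):
    """Return the path of `name` in the same directory as `module_file`."""
    return pathlib.Path(module_file).resolve().parent / name


def read_testdata(module_file, name="testdata.html"):
    """Read a data file stored next to the given module (hypothetical helper)."""
    # read_text() opens the file, reads it, and closes it again.
    return sibling_path(module_file, name).read_text(encoding="utf-8")
```

Inside a test module this would typically be called as `read_testdata(__file__)`, for example from `setUp`.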
24

This is based on Ferran's answer, but it closes the file during MyTest.tearDown() to avoid 'ResourceWarning: unclosed file':

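import os
import unittest

# Build the path relative to this test module, not the current working directory.
TESTDATA_FILENAME = os.path.join(os.path.dirname(__file__), 'testdata.html')


class MyTest(unittest.TestCase):

    def setUp(self):
        # Keep the handle so tearDown can close it.
        self.testfile = open(TESTDATA_FILENAME)
        self.testdata = self.testfile.read()

    def tearDown(self):
        # Runs after each test and releases the file handle.
        self.testfile.close()

    def test_something(self):
        ...  # test body elided in the original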
Leila Hadj-Chikh
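A possible variation on the same idea, sketched here as an assumption rather than as part of the answer above, is to register the close with `addCleanup()` instead of writing a `tearDown()`. Cleanups registered this way still run if a later step in `setUp()` raises. The base class name `FileDataTestCase` and the `DATA_PATH` attribute are hypothetical.

```python
import unittest


class FileDataTestCase(unittest.TestCase):
    """Hypothetical base class: opens DATA_PATH for each test and closes it afterwards."""

    # Subclasses set this, e.g. to
    # os.path.join(os.path.dirname(__file__), 'testdata.html')
    DATA_PATH = None

    def setUp(self):
        self.testfile = open(self.DATA_PATH, encoding="utf-8")
        # Registered cleanups run after the test, and also if setUp fails later on.
        self.addCleanup(self.testfile.close)
        self.testdata = self.testfile.read()
```

A concrete test class would subclass `FileDataTestCase`, set `DATA_PATH`, and use `self.testdata` in its test methods.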
2

You can also use a StringIO or cStringIO (io.StringIO in Python 3) to present a string containing your file's contents as a file-like object.
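As a minimal sketch of this approach in Python 3, where the class lives in the `io` module: `count_lines` is a hypothetical stand-in for whatever code under test expects a file handle, and the HTML snippet is invented for illustration.

```python
import io


def count_lines(handle):
    """Hypothetical stand-in for code under test that reads from a file handle."""
    return sum(1 for _ in handle)


# Wrap a short inline HTML snippet so it behaves like an open text file.
fake_file = io.StringIO("<html>\n<body>hello</body>\n</html>\n")
print(count_lines(fake_file))  # prints 3
```

This works best for small snippets; a large sample page is easier to keep in its own file.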

Christian Alis
    Yes, I thought of that, but it would require putting all the HTML into a Python file as a string, which I don't like because it's more than 3k lines long – cberner Nov 09 '11 at 00:20
1

I guess your task boils down to what's given here to get the current file. Then extend that path with the path to your HTML file and open it.

Community
Johannes Charra