
I am trying to figure out the best practice for testing nested with-statements while avoiding unnecessary boilerplate code. Here is a snippet of the code whose behavior I would like to test:

import os
import re
from pathlib import Path


def my_func(input_file: Path, output_file: Path) -> None:
    """
    This function is used to extract a pattern and write it to an output file.
    """

    # Pattern to extract
    regex_pattern = r"some pattern"
    # Check if the input file exists
    if not os.path.exists(input_file):
        raise Exception(f"The input file: {os.path.abspath(input_file)} does not exist!")

    # open() accepts Path objects directly, so no string conversion is needed
    try:
        # Open the input file and output file for reading and writing
        with open(input_file, "r") as input_file_handle:
            with open(output_file, "w") as output_file_handle:
                for line in input_file_handle:
                    if re.match(regex_pattern, line):
                        output_file_handle.write(line)
        print("Extraction complete!")
    
    # TODO: Exception scope too wide, (fix me)
    except Exception as e:
        print(e)

What would be the best approach to use for writing a test case?

The easiest way I could think of is to pass a set of input data through the function under test. A pre-defined fixture creates a temporary folder under /tmp/, the test writes the input data to an input file there, calls my_func(), reads the content of the output file, and asserts that it matches the expected output.

import os
from pathlib import Path
import pytest
import tempfile
import shutil

import key_features  # module that contains the function under test


@pytest.fixture()
def mk_temp_dir():
    """
    Fixture for pytest
    """
    # set up
    temp_dir = tempfile.mkdtemp()

    yield temp_dir

    # tear down
    shutil.rmtree(temp_dir)

def test_key_features(mk_temp_dir: str) -> None:
    """
    Check that only the matching lines are written to the output file.
    """
    print("Setting up the test environment...")
    input_lines = [
        "aaa",
        "aa",
        "bbb",
        "bb",
        "ccc",
    ]
    expected_lines = ["aaa", "bbb", "ccc"]
    # setup a tmp directory for testing
    mk_temp_dir = Path(mk_temp_dir)

    # write a few lines for testing to the input file
    with open(mk_temp_dir / "input.txt", mode="w") as input_file:
        print(os.listdir(mk_temp_dir))
        input_file.write("\n".join(input_lines))

    # DEBUG: read the content of the input file
    with open(mk_temp_dir / "input.txt", mode="r") as input_file_handle:
        input_lines = input_file_handle.read().splitlines()
        print(input_lines)

    # pass it to the function
    key_features.extract_key_features(
        mk_temp_dir / "input.txt",
        mk_temp_dir / "output.txt",
    )
    print("Test environment setup complete!")

    with open(mk_temp_dir / "output.txt", mode="r") as output_file:
        # read the output file and assert the expected data is there
        assert output_file.read().splitlines() == expected_lines

But it seems there might be an issue with the function itself, since the test case appears to cover the functionality.
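For comparison, here is a minimal sketch of the same file-based test using pytest's built-in `tmp_path` fixture, which provides a fresh `pathlib.Path` directory per test and so removes the need for a hand-written fixture. The name `extract_lines` and the regex are hypothetical stand-ins for the real function and pattern, which are not shown in full above.

```python
import re
from pathlib import Path

# Assumption: a stand-in pattern that matches lines of exactly three lowercase letters
PATTERN = re.compile(r"[a-z]{3}$")


def extract_lines(input_file: Path, output_file: Path) -> None:
    """Hypothetical stand-in: copy every line matching PATTERN to output_file."""
    with open(input_file, "r") as input_file_handle:
        with open(output_file, "w") as output_file_handle:
            for line in input_file_handle:
                if PATTERN.match(line):
                    output_file_handle.write(line)


def test_extract_lines(tmp_path: Path) -> None:
    # tmp_path is supplied by pytest; no import or custom fixture is required
    input_file = tmp_path / "input.txt"
    output_file = tmp_path / "output.txt"
    input_file.write_text("\n".join(["aaa", "aa", "bbb", "bb", "ccc"]))

    extract_lines(input_file, output_file)

    assert output_file.read_text().splitlines() == ["aaa", "bbb", "ccc"]
```

Because the stand-in function lets exceptions propagate instead of printing them, a failure inside the with-blocks shows up as a test error rather than as a missing output file.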

  • I'd just use actual files in my test, personally. Makes it easier to have confidence that the test is testing exactly what you think it is -- does this input file result in this output file? But there are ways to mock `open()`. – Samwise Feb 18 '22 at 14:53
  • @Samwise The easiest way I could think of is, as you said, to pass a set of input data through the test function, using a fixture that creates a /tmp/ folder to hold the input file, then call **my_func()**, read the content of the output file, and assert that the data matches the expected output. – Marouane Skandaji Feb 18 '22 at 19:28
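As a rough illustration of the mocking route mentioned in the first comment, the sketch below patches `builtins.open` with `unittest.mock.mock_open`. It assumes Python 3.8 or later (where the mock handle supports iteration), and `extract_lines` with its regex is again a hypothetical stand-in for the real function. Note that both nested `open()` calls receive the same mock handle, so the reads and writes are recorded on one object.

```python
import re
from unittest.mock import call, mock_open, patch

# Assumption: a stand-in pattern that matches lines of exactly three lowercase letters
PATTERN = re.compile(r"[a-z]{3}$")


def extract_lines(input_file, output_file):
    """Hypothetical stand-in with the same nested with-statements as the question."""
    with open(input_file, "r") as input_file_handle:
        with open(output_file, "w") as output_file_handle:
            for line in input_file_handle:
                if PATTERN.match(line):
                    output_file_handle.write(line)


def test_extract_lines_with_mocked_open():
    # read_data is what iterating over the fake input file yields, line by line
    fake_open = mock_open(read_data="aaa\naa\nbbb\n")
    with patch("builtins.open", fake_open):
        extract_lines("input.txt", "output.txt")

    # The same handle is returned for both open() calls
    handle = fake_open.return_value
    assert fake_open.call_args_list == [
        call("input.txt", "r"),
        call("output.txt", "w"),
    ]
    assert handle.write.call_args_list == [call("aaa\n"), call("bbb\n")]
```

This avoids touching the filesystem, at the cost of coupling the test to how the function calls `open()` and `write()`; the real-file approach checks only the observable result.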
