0

Let's say I have 3 lists of DataFrames containing different data that I want to run the same test cases on. How do I best structure my files and code so that I have one conftest.py (or some sort of parent class) that contains all the test cases that each list needs to run on, and 3 child classes that have different ways of generating each list of DataFrames but run the same test cases?

This is how I am currently constructing it.

import pytest

class TestOne:
    
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("one")

    def test_dfs_type(self):
        assert isinstance(self.dfs, dict)

    def test_another_one(self):
        assert ...

dfs will not be modified throughout the test suite, so I want to treat this like a setup.

TestTwo and TestThree are the same thing except it will be get_list_of_dfs_somewhere("two") and get_list_of_dfs_somewhere("three")

Any tips on how to efficiently structure this would be appreciated!

javierlu
  • 3
  • 2
  • This is what fixtures are for. You set up `dfs` once and then re-use it for multiple tests. See: https://docs.pytest.org/en/7.1.x/how-to/fixtures.html and also: https://stackoverflow.com/questions/62376252/when-to-use-pytest-fixtures – picobit Nov 11 '22 at 21:04

2 Answers2

0

This can be done using a fixture and defining the scope to the desired fixtures life time with session being the the one that is generated once for the entire testing session.

In the following code the dfs is created only once for the entire testing session and all the test can use it.

from typing import Dict

import pandas as pd
import pytest


def get_list_of_dfs_somewhere(text: str) -> Dict[str, pd.DataFrame]:
    return {
        "one": pd.DataFrame({"a": [1, 2, 3]}),
        "two": pd.DataFrame({"b": [1, 2, 3]}),
        "three": pd.DataFrame({"c": [1, 2, 3]}),
    }


# conftest.py
# this method usually takes 10 mins to run
# so we want this to run once and use the same Dict for all test cases
@pytest.fixture(scope="session")
def dfs() -> Dict[str, pd.DataFrame]:
    return get_list_of_dfs_somewhere("one")


# test_dfs.py
def test_dfs_type(dfs: Dict[str, pd.DataFrame]):
    assert isinstance(dfs, dict)


def test_another_one(dfs: Dict[str, pd.DataFrame]):
    assert ...
Ilya
  • 730
  • 4
  • 16
0

In case if you need to run the same test case but with different data you can use the parametrize function. So, let's say this is you test:

def test_dfs_type():
        assert isinstance(dict)

And you need to run it 3 times. One for each data frame you have.
To do that you can put all the data you need into a list.
But first, let's create the classes (I've simplified them a bit):

# classes.py
class ClassOne:
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("one")
    dfs: dict[str, str] = {'one': 'class one value'}


class ClassTwo:
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("two")
    dfs: dict[str, str] = {'two': 'class two value'}


class ClassThree:
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("three")
    dfs: dict[str, str] = {'three': 'class three value'}

Now, let's create the file with tests:

# test_classes.py

import pytest
from classes import ClassOne, ClassTwo, ClassThree


DATA_FRAMES = [ClassOne.dfs, ClassTwo.dfs, ClassThree.dfs]


@pytest.mark.parametrize('data_frame', DATA_FRAMES)  # Here we create a parameter "data_frame" that will give one object from a list at each test run.
def test_dfs_type(data_frame):  # And here is the arguments we indicate that the test waits for that parameter.
    print(data_frame)  # Print data just to see what happens in each test
    assert isinstance(data_frame, dict)

The result is:

>> pytest -v -s
test_classes.py::test_dfs_type[data_frame0] {'one': 'class one value'}
PASSED
test_classes.py::test_dfs_type[data_frame1] {'two': 'class two value'}
PASSED
test_classes.py::test_dfs_type[data_frame2] {'three': 'class three value'}
PASSED

Eugeny Okulik
  • 1,331
  • 1
  • 5
  • 16
  • thanks, is there a way to have a parent class that contains all the test cases and any child class who extends the parent class would automatically run all the test cases? – javierlu Nov 16 '22 at 19:14
  • I am not sure but I think pytest handles classes just as tests grouping. So it doesn't support inheritance. But I can be wrong. – Eugeny Okulik Nov 17 '22 at 11:00