
I am looking to set up a data-driven approach for my Python Selenium project (there is none currently). I plan to keep the test data in an xlsx file.

I use pytest in my project. Hence, I explored ddt, @data, @unpack and pytest.mark.parametrize.

I am able to read my Excel values and pass them with @data/@unpack or parametrize. However, in my case, each of my tests will use only selected columns from my data file, not all of them.

E.g. my data list, with columns (user, password, item_number, item_name), will look like this: [('user1', 'abc', 1, 'it1234'), ('user2', 'def', 2, 'it5678')]

My function1 (test 1) will need to parameterize user and password columns only. My function2 (test 2) will need to parameterize item_number and item_name columns only.

What library or method can I use for my need? Basically, I need to be able to parameterize specific columns from my data file for my tests.

Shayan Shafiq

1 Answer


I wrote a library called Parametrize From File that can load test parameters from data files like this. But I'm not sure that I fully understand your example. If this was your data file...

user  password  item_number  item_name
A     B         C            D
E     F         G            H

...would these be the tests you want to run?

@pytest.mark.parametrize(
        'user, password', 
        [('A', 'B'), ('E', 'F')],
)
def test_1(user, password):
    assert ...

@pytest.mark.parametrize(
        'item_number, item_name', 
        [('C', 'D'), ('G', 'H')],
)
def test_2(item_number, item_name):
    assert ...

In other words, are the user/password columns completely unrelated to the item_number/item_name columns? If no, I'm misunderstanding your question. If yes, this isn't very scalable. It's easy to imagine writing 100 tests, each with 2+ parameters, for a total of >200 columns! This format also breaks the convention that every value in a row should be related in some way. I'd recommend either putting the parameters for each test into their own file/worksheet, or using a file format that better matches the list-of-tuples/list-of-dicts structure expected by pytest, e.g. YAML, TOML, NestedText, etc.
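To make the per-test layout concrete, here is a minimal sketch of parameters grouped by test name, so that unrelated columns never share a row. It uses a plain Python dict rather than any particular file format; `PARAMS` and `argvalues` are hypothetical names chosen for illustration, and the values are the placeholder cells from the table above.

```python
# Hypothetical per-test layout: each test name maps to its own list of
# parameter dicts, instead of one wide table shared by every test.
PARAMS = {
    "test_1": [
        {"user": "A", "password": "B"},
        {"user": "E", "password": "F"},
    ],
    "test_2": [
        {"item_number": "C", "item_name": "D"},
        {"item_number": "G", "item_name": "H"},
    ],
}


def argvalues(test_name, argnames):
    """Return the list of tuples that pytest.mark.parametrize expects."""
    return [
        tuple(row[name] for name in argnames)
        for row in PARAMS[test_name]
    ]
```

A YAML, TOML, or NestedText file can express the same mapping directly, which is why those formats need less glue code than a spreadsheet.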

With all that said, here's how you would load parameters from an xlsx file using Parametrize From File:

import pandas as pd
from collections import defaultdict
import parametrize_from_file as pff

def load_xlsx(path):
    """
    Load an xlsx file and return the data structure expected by Parametrize 
    From File, which is a map of test names to test parameters.  In this case, 
    the xlsx file doesn't specify any test names, so we use a `defaultdict` to 
    make all the parameters available to any test.
    """
    df = pd.read_excel(path)
    return defaultdict(lambda: df)

def get_cols(cols):
    """
    Extract specific columns from the parameters loaded from the xlsx file.  
    The parameters are loaded as a pandas DataFrame, and need to be converted 
    into a list of dicts in order to be understood by Parametrize From File.
    """
    def _get_cols(df):
        return df[cols].to_dict('records')
    return _get_cols

# Use the function we defined above to load xlsx files.
pff.add_loader('.xlsx', load_xlsx)

@pff.parametrize(preprocess=get_cols(['user', 'password']))
def test_1(user, password):
    pass

@pff.parametrize(preprocess=get_cols(['item_number', 'item_name']))
def test_2(item_number, item_name):
    pass

Note that this code would be much simpler if the parameters were organized in one of the formats I recommended above.
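If you would rather stay with plain `pytest.mark.parametrize`, the column selection can also be sketched without any extra library. This assumes the rows have already been read from the spreadsheet into a header tuple and a list of tuples, as in the question; `HEADER`, `ROWS`, and `pick` are hypothetical names, not part of pytest or Parametrize From File.

```python
# Hypothetical stand-ins for the values read from the xlsx file.
HEADER = ("user", "password", "item_number", "item_name")
ROWS = [
    ("user1", "abc", 1, "it1234"),
    ("user2", "def", 2, "it5678"),
]


def pick(columns):
    """Return one tuple per row, holding only the requested columns."""
    indices = [HEADER.index(name) for name in columns]
    return [tuple(row[i] for i in indices) for row in ROWS]


# Intended use with pytest (shown as a comment, not executed here):
#
#   @pytest.mark.parametrize("user, password", pick(["user", "password"]))
#   def test_1(user, password):
#       ...
```

Each test then names only the columns it needs, and an unknown column name fails early with a `ValueError` from `HEADER.index`.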

Kale Kundert
  • Thanks Kale, your understanding is correct - in my case, my tests use selected cols in the sheet. I may not have too many cols, but yes they are used selectively in my tests. I will try this out and explore the other file formats as well, thank you! – harmonybells Nov 04 '21 at 13:38