Python hypothesis: Ensure that input lists have same length

Question

I'm using hypothesis to test a function that takes two lists of equal length as input.

import hypothesis.strategies as st
from hypothesis import assume, given


@given(st.lists(st.integers(), min_size=1),
       st.lists(st.integers(), min_size=1),
       )
def test_my_func(x, y):
    assume(len(x) == len(y))

    # Assertions

This gives me the error message:

FailedHealthCheck: It looks like your strategy is filtering out a lot of data. Health check found 50 filtered examples but only 4 good ones.

The assumption that len(x) == len(y) is filtering out too many inputs. So I would like to generate a random positive number and use that as the length of both x and y. Is there a way this can be done?

So when you pick this random positive number, what do you want to do to the lists to make them conform? – Rushabh Mehta Jul 30 '18 at 15:10

score 15 · Answer 1 · Vermillion · 2018-08-02T13:36:41.667

I found an answer using the @composite decorator.

import hypothesis.strategies as st
from hypothesis import given

@st.composite
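def same_len_lists(draw):
    # Draw the shared length first, then build a strategy for lists of exactly that length.
    n = draw(st.integers(min_value=1, max_value=50))
    fixed_length_list = st.lists(st.integers(), min_size=n, max_size=n)

    # Two separate draws from the same strategy give two independent lists of length n.
    return (draw(fixed_length_list), draw(fixed_length_list))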


@given(same_len_lists())
def test_my_func(lists):

    x, y = lists

    # Assertions
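
If the element type or the length bounds need to vary between tests, a composite strategy can take extra arguments after `draw`. The following is only a sketch of that idea: the parameter names `elements`, `min_len` and `max_len` are my own assumptions and not part of the answer above, and the `settings` values are arbitrary.

```python
from hypothesis import given, settings, strategies as st


@st.composite
def same_len_lists(draw, elements=st.integers(), min_len=1, max_len=50):
    # Hypothetical parameters: `elements` is the strategy for list items,
    # `min_len`/`max_len` bound the shared length.
    n = draw(st.integers(min_value=min_len, max_value=max_len))
    fixed_length_list = st.lists(elements, min_size=n, max_size=n)
    # Two independent draws, both constrained to length n.
    return draw(fixed_length_list), draw(fixed_length_list)


@settings(max_examples=50, deadline=None)
@given(same_len_lists(elements=st.text(max_size=5)))
def test_same_len_lists(lists):
    x, y = lists
    # Both lists share one length, and that length is at least min_len (1).
    assert len(x) == len(y) >= 1
```

Calling `test_same_len_lists()` directly (or running it under pytest) exercises the strategy without any `assume()` filtering.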

edited Aug 02 '18 at 13:36

answered Jul 30 '18 at 17:34

Vermillion


This will return a tuple with a single list in both positions - you probably want to remove the draw() call so that `fixed_length_list` is a strategy, then `return (draw(ffl), draw(ffl))`. – Zac Hatfield-Dodds Jul 31 '18 at 12:47
@ZacHatfield-Dodds will that cause a change in the test cases? I don't fully understand how draw() works. – Vermillion Jul 31 '18 at 14:51
What is draw in this example code? – myke Mar 09 '22 at 12:21
@myke It's provided by the `composite` decorator. See the hypothesis docs: https://hypothesis.readthedocs.io/en/latest/data.html#composite-strategies – Vermillion Mar 09 '22 at 22:17

score 6 · Accepted Answer · answered Jul 30 '18 at 15:57

You can use flatmap to generate data that depends on other generated data.

import hypothesis.strategies as st
from hypothesis import given
from hypothesis.strategies import integers as ints

# Draw a length n, then build a list of exactly two lists, each of length n.
same_len_lists = ints(min_value=1, max_value=100).flatmap(
    lambda n: st.lists(
        st.lists(ints(), min_size=n, max_size=n),
        min_size=2,
        max_size=2,
    )
)

@given(same_len_lists)
def test_my_func(lists):
    x, y = lists
    # Holds by construction, so no assume() filtering is needed.
    assert len(x) == len(y)

It's a little clumsy, and I'm not very happy about having to unpack the lists inside the test body.
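
If the unpacking is the sticking point, one alternative (not part of the answer above) is Hypothesis's `st.data()` strategy, which lets the test draw values interactively, so the second list can be sized from the first. A minimal sketch, assuming integer elements and an arbitrary cap of 100 on the length:

```python
from hypothesis import given, settings, strategies as st


@settings(max_examples=50, deadline=None)
@given(st.data())
def test_my_func(data):
    # Draw the first list freely (the max_size of 100 is an arbitrary choice).
    x = data.draw(st.lists(st.integers(), min_size=1, max_size=100))
    # Force the second list to have exactly the same length as the first.
    y = data.draw(st.lists(st.integers(), min_size=len(x), max_size=len(x)))

    assert len(x) == len(y)
```

The trade-off is that `x` and `y` are no longer test arguments, so as far as I know they show up in failure reports as interactive draws rather than as named parameters.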

score 1 · Answer 3 · answered Nov 19 '20 at 06:28

The other solutions give nice reusable strategies. Here's a short low-tech solution, perhaps better suited to one-off use since you need to do one line of processing in the test function. We use zip to transpose a list of pairs (2-element tuples); conceptually we're turning an n x 2 matrix into a 2 x n matrix.

import hypothesis.strategies as st
from hypothesis import given

pair_lists = st.lists(st.tuples(st.integers(), st.integers()), min_size=1)

@given(pair_lists)
def test_my_func(L):
    x, y = map(list, zip(*L))

Warning: min_size=1 is crucial. For an empty list, zip(*L) yields nothing, so unpacking the result into x, y raises a ValueError.
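
To make the transposition concrete, here is a plain-Python sketch (no Hypothesis needed) of what `zip(*L)` does to a list of pairs, and of what goes wrong with an empty list. The helper name `split_pairs` is hypothetical, introduced only for illustration:

```python
def split_pairs(pairs):
    """Transpose a list of 2-tuples into two lists of equal length."""
    x, y = map(list, zip(*pairs))
    return x, y


x, y = split_pairs([(1, 10), (2, 20), (3, 30)])
print(x)  # [1, 2, 3]
print(y)  # [10, 20, 30]

# With an empty input, zip() produces no tuples at all, so there is
# nothing to unpack into x and y.
try:
    split_pairs([])
except ValueError as exc:
    empty_error = exc
    print("empty input fails:", exc)
```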
