
I am running a test suite with hypothesis-4.24.6 and pytest-5.0.0. My test has a finite set of possible inputs, but hypothesis never finishes testing.

I have reduced it to the following minimal example, which I run as pytest test.py:

from hypothesis import given
import hypothesis.strategies as st


@given(x=st.just(0)
         | st.just(1),
       y=st.just(0)
         | st.just(1)
         | st.just(2))
def test_x_y(x, y):
    assert True

I would expect it to try all six combinations here and then succeed, or possibly a small multiple of that to check for flakiness. Instead it runs indefinitely (after about 15 minutes of testing I kill it).

If I interrupt the test, the backtraces show that it is just continuously generating new examples.

What have I done wrong here?

tahsmith

  • This is clearly a regression that first occurred with the 4.10 release. With version 4.9 the test passes in 0.12 seconds, while with 4.10 and later it doesn't stop in over a minute. I would create a [new issue](https://github.com/HypothesisWorks/hypothesis/issues/new), or summon [Zac Hatfield-Dodds](https://stackoverflow.com/users/9297601/zac-hatfield-dodds) and ask him. – hoefling Jul 01 '19 at 09:29
  • I'm still looking into this, but it's clearly a bug! Further details will be posted at https://github.com/HypothesisWorks/hypothesis/issues/2027 – Zac Hatfield-Dodds Jul 02 '19 at 01:15

2 Answers


This seems to be connected to the number of passing examples Hypothesis tries to generate:

>>> from hypothesis import given, strategies as st
>>> @given(st.integers(0,1), st.integers(0,2))
... def test(x, y):
...   print(x, y)
...   assert True
... 
>>> test()
0 0
1 1
1 0
1 2
1 1
0 1
0 0
1 2
0 2
0 2
1 0
1 2
0 1
0 1
1 2
[snip…]

See this part of the docs, for instance: by default Hypothesis aims for 100 passing test cases. With only 6 distinct inputs available, it keeps generating more and more data but can only ever rediscover one of those 6 cases, so it never reaches that target.
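
You can confirm that default directly from the settings object; a quick check (settings.default is standard Hypothesis API, output shown for the 4.x series):

>>> from hypothesis import settings
>>> settings.default.max_examples
100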

The simplest approach is to limit the number of examples needed for this test to pass:

>>> from hypothesis import settings
>>> @settings(max_examples=30)
... @given(st.integers(0,1), st.integers(0,2))
... def test(x, y):
...   print(x, y)
...   assert True
... 
>>> test()
0 0
1 1
1 0
0 2
1 2
0 1
0 1
1 1
1 0
1 1
0 1
1 2
1 1
0 0
0 2
0 2
0 0
1 2
1 0
0 1
1 0
1 0
0 1
1 2
1 1
0 2
0 0
1 2
0 0
0 2

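Applied to the original test file from the question, the same workaround is a one-line change; a minimal sketch, assuming it behaves the same under the pinned hypothesis/pytest versions from the question:

from hypothesis import given, settings
import hypothesis.strategies as st


@settings(max_examples=30)
@given(x=st.just(0) | st.just(1),
       y=st.just(0) | st.just(1) | st.just(2))
def test_x_y(x, y):
    assert True
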
Another approach, given the small number of test cases, is to spell them all out with @example and ask Hypothesis to run only those explicit examples:

>>> from hypothesis import given, example, settings, Phase, strategies as st
>>> @settings(phases=(Phase.explicit,))
... @given(x=st.integers(), y=st.integers())
... @example(x=0, y=0)
... @example(x=0, y=1)
... @example(x=0, y=2)
... @example(x=1, y=0)
... @example(x=1, y=1)
... @example(x=1, y=2)
... def test(x, y):
...   print(x, y)
...   assert True
... 
>>> test()
0 0
0 1
0 2
1 0
1 1
1 2

Also note that st.just(0) | st.just(1) is equivalent to st.one_of(st.just(0), st.just(1)), so choose one approach and stick to it; don't mix them.
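
For small finite input sets like these, st.sampled_from is often the most direct spelling; a minimal equivalent sketch (sampled_from is standard Hypothesis API, shown here only as an alternative):

import hypothesis.strategies as st

# the same finite value sets, written without | chains
xs = st.sampled_from([0, 1])     # same values as st.just(0) | st.just(1)
ys = st.sampled_from([0, 1, 2])  # same values as st.just(0) | st.just(1) | st.just(2)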

301_Moved_Permanently

  • Thanks for that last tip, I have made the example simpler. The problem seems likely to be a bug though, since this actually works as expected in 4.9. – tahsmith Jul 01 '19 at 10:34

This bug was fixed in Hypothesis 4.26.2, or so we thought; the actual fix landed in 4.26.3: https://hypothesis.readthedocs.io/en/latest/changes.html#v4-26-3
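
To check which release you are actually running (hypothesis exposes its version as hypothesis.__version__; the output below assumes the pinned version from the question):

>>> import hypothesis
>>> hypothesis.__version__
'4.24.6'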

Zac Hatfield-Dodds

  • Just gave this a go: it does not seem to work with my example yet. I will add details to the GitHub issue. – tahsmith Jul 05 '19 at 01:50
  • Thanks for following this up - it turns out that we fixed a masking bug, and needed a *second* fix for your actual problem. But it's fixed now, as far as I know! – Zac Hatfield-Dodds Jul 07 '19 at 19:06