
The Story:

I'm currently unit-testing a function with hypothesis and a custom generation strategy, trying to find a specific input that "breaks" my current solution. Here is what my test looks like:

from hypothesis import given

from solution import answer

# skipping mystrategy definition - not relevant

@given(mystrategy)
def test(l):
    assert answer(l) in {0, 1, 2}

Basically, I'm looking for possible inputs for which the answer() function does not return 0, 1, or 2.

Here is what my current workflow looks like:

  • run the test
  • hypothesis finds an input that produces an AssertionError:

    $ pytest test.py 
    =========================================== test session starts ============================================
    ...
    ------------------------------------------------ Hypothesis ------------------------------------------------
    Falsifying example: test(l=[[0], [1]])
    
  • debug the function with this particular input, trying to understand whether the input/output is legitimate and whether the function worked correctly (a minimal sketch of this step follows below)
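
For that last step, a minimal sketch of what I do (only `answer` from `solution` is assumed; the falsifying example is hard-coded, and stepping through with a debugger such as `pdb` is just one option):

from solution import answer

# Re-run answer() on the falsifying example Hypothesis reported, outside the
# test, so the result can be inspected or stepped through in a debugger.
falsifying = [[0], [1]]  # input reported by Hypothesis
result = answer(falsifying)
print(f"answer({falsifying!r}) -> {result!r}")  # expected to be outside {0, 1, 2}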

The Question:

How can I skip this falsifying generated example ([[0], [1]] in this case) and ask hypothesis to generate a different one?


The question can also be interpreted as: can I ask hypothesis not to terminate when a falsifying example is found, and to generate more falsifying examples instead?

alecxe
  • I'm not familiar with hypothesis, but I do know it is halting because of the assert. If you just want to get around that temporarily, you can print a fail message, but not actually assert. Thus the framework will think it passed and keep going. – Kenny Ostrom Sep 21 '16 at 17:41
  • @KennyOstrom yeah, but `hypothesis` is too good and fast at generating a ton of sample inputs. I can sort of work around it by having a set of inputs to ignore and adding this "not in" check into the test itself, but this would not scale... plus I am sure `hypothesis` has a built-in way to approach the problem, I'm just not familiar enough with the library at this point. Thanks! – alecxe Sep 21 '16 at 17:43

1 Answer


There's at present no way to get Hypothesis to keep trying after it finds a failure (it might happen at some point, but it's not really clear what the right behaviour for this should be and it hasn't been a priority), but you can get it to ignore specific classes of failure using the assume functionality.

e.g. you could skip this example with:

from hypothesis import given, assume

@given(mystrategy)
def test(l):
    assume(l != [[0], [1]])
    assert answer(l) in {0, 1, 2}

Hypothesis will skip any example for which you call assume with a False argument, and will not count it towards the budget of examples it runs.

You'll probably find that this just results in trivial variations on the example, but you can pass more complex expressions to assume to ignore whole classes of examples, as in the sketch below.
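
For instance, a minimal sketch building on the comment under the question about keeping a set of inputs to ignore (it assumes the question's mystrategy and answer; since lists aren't hashable, the already-investigated examples are kept in a plain list and checked with `not in`):

from hypothesis import given, assume

from solution import answer

# Falsifying examples that have already been investigated by hand.
already_checked = [
    [[0], [1]],
]

@given(mystrategy)
def test(l):
    assume(l not in already_checked)  # discard known falsifying examples
    assert answer(l) in {0, 1, 2}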

What's your actual use case here? The normal intended usage pattern here would be to fix the bug that causes Hypothesis to fail and let it find new bugs that way. I know this isn't always practical, but I'm interested in why.

DRMacIver
  • Gotcha, thanks again! Sure, this is a coding challenge (cannot provide the details though). I guess I am kind of brute-forcing here, but I know that if my solution returns a result that is not 0, 1, 2, it is probably a bug, but not necessarily..at the moment, I am trying to generate different inputs and manually go over how the function behaves hoping to spot a problem I cannot see otherwise. Will work with `assume()`, thanks. – alecxe Sep 21 '16 at 18:06
  • 2
    Have you tried using find() instead of given()? If you're interested in the values being produced rather than doing testing per se it might be a better tool – DRMacIver Sep 21 '16 at 18:16
  • 1
    I am the author of a related package for R which has this feature. The motivation was to have long running tests complete overnight and report all found errors. It's harder to motivate it when tests run very quickly at no additional cost. People will cringe at the thought of tests running overnight, but I was developing a big data library and some tests were ... big, plus there was a non trivial latency for each test. – piccolbo Jan 09 '17 at 20:14
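
Regarding the find() suggestion in the comment above, a minimal sketch of that approach (mystrategy is a stand-in here, since the question omits its definition; find() searches the strategy for a minimal example satisfying the predicate and raises hypothesis.errors.NoSuchExample if it cannot find one):

from hypothesis import find, strategies as st

from solution import answer

# Stand-in for the question's custom strategy, whose definition was omitted.
mystrategy = st.lists(st.lists(st.integers(), min_size=1), min_size=1)

# Ask Hypothesis for a (shrunk) input whose result falls outside {0, 1, 2};
# raises hypothesis.errors.NoSuchExample if no such input is found.
bad_input = find(mystrategy, lambda l: answer(l) not in {0, 1, 2})
print(bad_input)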