What does Flaky: Hypothesis test produces unreliable results mean?

Question

I am using the Hypothesis Python package for testing.

I am getting the following error:

Flaky: Hypothesis test_visiting produces unreliable results: Falsified on the first call but did not on a subsequent one

As far as I can tell, the test is working correctly.

How do I get around this?

score 13 · Accepted Answer · answered Aug 02 '15 at 08:32

It means more or less what it says: you have a test that failed the first time but passed when it was rerun with the same example. This could be a Hypothesis bug, but it usually isn't.

The most common cause is a test that depends on some external state, e.g. if you're using a system random number generator rather than a Hypothesis-provided one, or if your test creates some files and only fails if the files did not exist at the start of the test.

The second most common cause is that your failure is a recursion error, and the example that triggered it at one depth of function calls did not trigger it at another.
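To make the file case concrete, here is a minimal sketch that uses only the standard library and no Hypothesis at all. `check_cold_cache` and `run_twice` are hypothetical names invented for illustration; `run_twice` merely imitates the "rerun with the same example" step and is not how Hypothesis is implemented.

```python
import os
import tempfile


def check_cold_cache(workdir, name):
    """Hypothetical test body whose outcome depends on leftover files."""
    path = os.path.join(workdir, name)
    if not os.path.exists(path):
        # First call: the file is missing, so create it and then fail.
        with open(path, "w") as f:
            f.write("cached")
        raise AssertionError("cold-cache branch is broken")
    # Later calls: the file is still there, so the failing branch is skipped.


def run_twice(test, *args):
    """Call test(*args) twice and record whether each call raised."""
    outcomes = []
    for _ in range(2):
        try:
            test(*args)
            outcomes.append("passed")
        except Exception:
            outcomes.append("failed")
    return outcomes


with tempfile.TemporaryDirectory() as workdir:
    outcomes = run_twice(check_cold_cache, workdir, "example.txt")

print(outcomes)  # ['failed', 'passed']
```

Same input, different result on the second call: that is the pattern the Flaky error reports.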

You haven't really provided enough information to say what's actually happening, so it's hard to provide more specific advice than that. If you're running a recent version of Hypothesis (e.g. 1.9.0 certainly does it) you should have been given quite detailed diagnostics about what is going on - it will tell you what the original exception you got was and it will report if the values passed in seemed to change between calls.

DRMacIver

How did the test fail? There were no failing assertions in the test. My test is given the same arguments again, and the second time it takes a different path inside the test. But it does not make any failing assertions. – sureshvv Aug 02 '15 at 08:54
Any exception will cause a failure, not just assertions. Like I said: You should be being shown the stack trace of the exception that caused your original failure. It's very hard to diagnose what's going on if you don't provide any relevant information. – DRMacIver Aug 02 '15 at 09:08
There was no stack trace printed. The only error that was displayed was the Flaky error. Maybe some additional information could be displayed when this error occurs. – sureshvv Aug 03 '15 at 06:27
My test is not idempotent with respect to its inputs and behaves differently when the same input is presented a second time. Maybe this has something to do with it. – sureshvv Aug 03 '15 at 06:35
OK. Then yes, that's exactly what's happening. Like I said, this error is caused by a test that does not do the same thing when run twice. If you're not getting a stack trace then you're running an old version of Hypothesis. – DRMacIver Aug 04 '15 at 07:03
@DRMacIver what’s your recommendation in case of a function that’s not idempotent (e.g. your file-created example above)? In that case, my function throws an exception—so _that_ particular exception should then be considered as passing behavior? – Jens Dec 30 '22 at 06:25
My recommendation is that regardless of whether your functions are idempotent, your tests should be! You should always run the test from a clean state, so if e.g. your function creates a file you should make sure that that file is deleted at the beginning of the tests if it exists, or always run it in a clean directory. – DRMacIver Dec 31 '22 at 14:07
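A minimal sketch of the "always run it in a clean directory" advice, using only the standard library. `check_cache_file` is a hypothetical stand-in for a test body; with Hypothesis, the same body would sit inside the `@given`-decorated function, so that every example (and every replay of an example) gets its own fresh directory.

```python
import os
import tempfile


def check_cache_file(name):
    """Hypothetical test body: each call works in its own fresh directory."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, name)
        # The directory was just created, so no file can be left over here.
        assert not os.path.exists(path)
        with open(path, "w") as f:
            f.write("data")
        assert os.path.exists(path)
    # Leaving the "with" block deletes the directory and everything in it.


# The same input behaves identically on every call.
check_cache_file("example.txt")
check_cache_file("example.txt")
```

Creating the directory inside the test body, rather than once per test function, is what keeps repeated calls independent of each other.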

score 8 · Answer 2 · answered Nov 25 '19 at 23:44

One thing that I haven't seen mentioned a lot, and it might be a relatively new behavior, is that you may want to raise the deadline of your tests. In my experience, if one test case fails due to a missed deadline and the second one passes, you'll see it as a "flaky" test failure.

@hypothesis.settings(deadline=500)

It's been hard for me to find some proper documentation about this behavior that I could personally understand fully, but this seems to fix it for me.
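For context, here is a fuller sketch of where that decorator goes. The test functions and strategies are made up for illustration. As far as I understand the settings documentation, a numeric `deadline` is a per-example limit in milliseconds, and `deadline=None` turns the check off entirely.

```python
from hypothesis import given, settings, strategies as st


@settings(deadline=500)  # allow each example up to 500 ms
@given(st.lists(st.integers()))
def test_sorting_twice_changes_nothing(xs):
    assert sorted(sorted(xs)) == sorted(xs)


@settings(deadline=None)  # no per-example time limit at all
@given(st.integers())
def test_adding_zero_changes_nothing(n):
    assert n + 0 == n
```

Raising the deadline only helps when the flakiness comes from run-to-run timing variation; it will not fix a test that depends on leftover state.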

damd

This is a good point, but I seem to recall that it actually mentions the deadline failure as the reason for the flaky failure. – Matthew Schinckel Mar 15 '21 at 03:54
