In addition to --cache_test_results, Bazel actually has a flag specifically designed for diagnosing flaky tests: --runs_per_test, which reruns a test N times and keeps logs only from the failing runs:
$ bazel test --runs_per_test=10 :flaker
INFO: Found 1 test target...
FAIL: //:flaker (run 10 of 10) (see /output/testlogs/flaker/test_run_10_of_10.log).
FAIL: //:flaker (run 4 of 10) (see /output/testlogs/flaker/test_run_4_of_10.log).
FAIL: //:flaker (run 5 of 10) (see /output/testlogs/flaker/test_run_5_of_10.log).
FAIL: //:flaker (run 9 of 10) (see /output/testlogs/flaker/test_run_9_of_10.log).
FAIL: //:flaker (run 3 of 10) (see /output/testlogs/flaker/test_run_3_of_10.log).
Target //:flaker up-to-date:
bazel-bin/flaker
INFO: Elapsed time: 0.828s, Critical Path: 0.42s
//:flaker FAILED
Executed 1 out of 1 tests: 1 fails locally.
You can use it to quickly gauge how flaky a test is (here, 5 of the 10 runs failed) and to collect logs from the failing runs.
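If you want a number rather than eyeballing the FAIL lines, a small script can tally them. The following is only a rough sketch: `failure_rates` and `FAIL_LINE` are hypothetical names made up for this example, not part of Bazel, and the regex assumes the FAIL line format shown in the output above, which may differ between Bazel versions.

```python
import re
from collections import defaultdict

# Assumed format, copied from the sample output above:
#   FAIL: //:flaker (run 4 of 10) (see /output/testlogs/flaker/test_run_4_of_10.log).
FAIL_LINE = re.compile(r"^FAIL: (?P<target>\S+) \(run (?P<run>\d+) of (?P<total>\d+)\)")


def failure_rates(output: str) -> dict[str, float]:
    """Return the fraction of failed runs per target found in Bazel test output."""
    failed = defaultdict(set)  # target -> set of failing run numbers
    totals = {}                # target -> total number of runs (the N in "of N")
    for line in output.splitlines():
        match = FAIL_LINE.match(line.strip())
        if match:
            target = match["target"]
            failed[target].add(int(match["run"]))
            totals[target] = int(match["total"])
    # Targets with no FAIL lines do not appear in the result at all.
    return {target: len(runs) / totals[target] for target, runs in failed.items()}


SAMPLE = """\
INFO: Found 1 test target...
FAIL: //:flaker (run 10 of 10) (see /output/testlogs/flaker/test_run_10_of_10.log).
FAIL: //:flaker (run 4 of 10) (see /output/testlogs/flaker/test_run_4_of_10.log).
FAIL: //:flaker (run 5 of 10) (see /output/testlogs/flaker/test_run_5_of_10.log).
FAIL: //:flaker (run 9 of 10) (see /output/testlogs/flaker/test_run_9_of_10.log).
FAIL: //:flaker (run 3 of 10) (see /output/testlogs/flaker/test_run_3_of_10.log).
//:flaker FAILED
"""

if __name__ == "__main__":
    print(failure_rates(SAMPLE))  # → {'//:flaker': 0.5}
```

In practice you would feed it the captured output of the bazel test command instead of SAMPLE. Because the script only sees FAIL lines, a target that passes every run simply does not show up in the result.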