Flaky test detection
Flaky test detection finds the tests that pass and fail on the same code, the false signals that quietly destroy trust in your suite. It scores each test's pass/fail history across runs so you can quarantine and fix flaky tests before they block a release.
Why flaky tests are worth detecting
A flaky test produces different results on unchanged code — usually from non-determinism: timing and race conditions, shared or external state, network calls, animations, or test-ordering dependencies. Because the code under test did not change, a flaky failure tells you nothing reliable. Its real cost is trust: once a suite cries wolf often enough, teams start ignoring red builds — and a genuine regression slips through. Google has reported that roughly 16% of its tests exhibit some flakiness; see the data in flaky test statistics.
How flaky test detection works
You cannot tell a test is flaky from a single run, because flakiness is a property of behavior over time. Detection therefore works from history: collect results from every CI run, track each test's outcome across them, and flag the tests whose results vary on unchanged code. Test-runner retries are a strong raw signal: a test that fails and then passes on retry, with no code change in between, is flaky by definition. From enough history you can go further and assign a predictive flaky score, flagging unreliable tests before they disrupt a deploy rather than after.
- Collect history. Upload each CI run's results so every test has an outcome record across runs and branches — see flaky test detection.
- Score flakiness. Use retry signals and pass/fail history to compute a flake rate per test, suite, and pipeline.
- Quarantine. Move known-flaky tests out of the blocking path so they stop failing builds while you fix them — see test quarantine.
- Fix the root cause, then verify. Track flake rate over time to confirm the fix worked instead of muting the symptom.
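The scoring step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Qualflare's actual scoring: the `Result` record, its field names, and the `flake_scores` function are hypothetical. A run counts as flaky here if it passed only after a retry, or if it failed on a commit where the same test also passed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One test outcome from one CI run (hypothetical record shape)."""
    test_id: str      # unique test name
    commit: str       # revision the run executed against
    passed: bool      # final outcome of the test in that run
    retries: int = 0  # retries the runner needed before the final outcome


def flake_scores(results):
    """Return {test_id: fraction of that test's runs that show a flaky signal}."""
    # Commits on which each test has passed at least once.
    passing_commits = defaultdict(set)
    for r in results:
        if r.passed:
            passing_commits[r.test_id].add(r.commit)

    flaky_runs = defaultdict(int)
    total_runs = defaultdict(int)
    for r in results:
        total_runs[r.test_id] += 1
        passed_on_retry = r.passed and r.retries > 0
        failed_where_it_also_passed = (
            not r.passed and r.commit in passing_commits[r.test_id]
        )
        if passed_on_retry or failed_where_it_also_passed:
            flaky_runs[r.test_id] += 1

    return {t: flaky_runs[t] / total_runs[t] for t in total_runs}


# Illustrative history; the test names and commit hashes are made up.
history = [
    Result("test_login", "abc123", True),
    Result("test_login", "abc123", False),            # failed on a commit that also passed
    Result("test_login", "abc123", True, retries=1),  # passed only on retry
    Result("test_login", "def456", True),
    Result("test_cart", "abc123", False),
    Result("test_cart", "abc123", False),             # consistently failing, not flaky
]
scores = flake_scores(history)
print(scores)
```

Note that `test_cart` scores zero: a test that fails every time on the same commit is broken, not flaky, and should stay in the blocking path.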
Flaky test detection with Qualflare
Qualflare detects flaky tests from the results your CI already produces — no change to how your tests are written. It captures retry counts and flaky status automatically, scores flakiness across runs, and surfaces the worst offenders alongside failure clustering and per-launch risk. Follow the five-step, framework-agnostic setup guide, or start from the complete guide to flaky tests and how to detect flaky tests.
Framework-specific fixes: Cypress, Playwright, Jest, and pytest.
Detect flaky tests in your CI
Start free — upload your test results and Qualflare scores flakiness across runs automatically.
Get Started Free
Frequently asked questions
What is flaky test detection?
Flaky test detection is the practice of identifying tests that produce different results — sometimes passing, sometimes failing — on the same code, without any change to that code. It works by tracking each test’s pass/fail history across many runs rather than judging from a single result.
How do you detect a flaky test?
Re-run the test on the exact same commit. If it sometimes passes and sometimes fails with no code change, it is flaky. At scale you do this automatically: collect results from every CI run, track each test’s outcome history, and flag tests whose results vary on unchanged code. Test-runner retries are a strong raw signal — a test that fails then passes on retry is flaky by definition.
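As a rough sketch of the manual check, the following Python re-runs one test command several times on the checked-out commit and labels the result. The `rerun` and `classify` helpers are hypothetical names, and the command shown is a stand-in that always exits 0; in practice you would substitute your own test-runner invocation.

```python
import subprocess
import sys


def rerun(command, times=10):
    """Run `command` `times` times; return one pass/fail boolean per run."""
    outcomes = []
    for _ in range(times):
        completed = subprocess.run(command, capture_output=True)
        outcomes.append(completed.returncode == 0)  # exit code 0 means pass
    return outcomes


def classify(outcomes):
    """Label a series of outcomes that were all collected on the same commit."""
    if all(outcomes):
        return "stable-pass"
    if not any(outcomes):
        return "stable-fail"
    return "flaky"  # mixed results with no code change


# Stand-in for a real test command; this one always exits with status 0.
always_passes = [sys.executable, "-c", "raise SystemExit(0)"]
label = classify(rerun(always_passes, times=3))
print(label)
```

A run of identical passes does not prove a test is stable. A test that fails once in a hundred runs will usually look fine after ten, which is why detection at scale relies on accumulated CI history rather than ad hoc re-runs.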
Why are flaky tests a problem?
Flaky tests erode trust. Once a suite cries wolf often enough, teams start ignoring red builds — and a real regression slips through. Google has reported that around 16% of its tests show some flakiness, and flaky failures account for a large share of the failures engineers investigate.
How do you fix flaky tests once detected?
Quarantine the known-flaky tests out of the blocking path so they stop failing builds, then fix the root cause — usually timing/race conditions, shared or external state, or test-ordering dependencies. Track flake rate over time so you can confirm the fixes are working rather than just muting symptoms.
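To make the quarantine idea concrete, here is a minimal sketch of a build gate that ignores failures from quarantined tests while still reporting them. The `build_status` function and the shape of its inputs are assumptions for illustration, not a description of any particular CI system or of Qualflare.

```python
def build_status(results, quarantined):
    """Decide whether a build passes.

    results: {test_id: passed} for one run (hypothetical shape).
    quarantined: set of test ids kept out of the blocking path.
    """
    failures = [test_id for test_id, passed in results.items() if not passed]
    blocking = sorted(t for t in failures if t not in quarantined)
    non_blocking = sorted(t for t in failures if t in quarantined)
    return {
        "passed": not blocking,                # only non-quarantined failures block
        "blocking_failures": blocking,
        "quarantined_failures": non_blocking,  # still reported, so they stay visible
    }


# Illustrative run; the test names are made up.
run = {"test_login": False, "test_cart": True, "test_checkout": True}
status = build_status(run, quarantined={"test_login"})
print(status)
```

Keeping quarantined failures in the report matters: a quarantined test still needs its flake rate tracked so you can tell when the fix has worked and the test can return to the blocking path.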
Related: test observability · test reporting guides · testing & observability glossary · compare Qualflare to other tools
Written by İbrahim Süren, founder of Qualflare. Last reviewed June 2026.