Flaky test detection

Flaky test detection is the practice of identifying tests that fail intermittently by analyzing their pass/fail history across many runs, rather than from a single result.

A single run cannot prove a test is flaky; that takes history. Detection systems record the outcome of every test on every run, keyed by test identity and commit, then flag tests that flip between pass and fail without a corresponding code change. The more runs in the history, the stronger the signal.
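The idea above can be sketched in a few lines of Python. This is a minimal illustration, not Qualflare's implementation: the `Result` record, the `find_flaky` function, and the sample test names and commit SHAs are all hypothetical. It also makes a simplifying assumption, treating "same commit" as a stand-in for "no code change", which ignores changes to the environment or dependencies.

```python
from collections import defaultdict
from typing import NamedTuple


class Result(NamedTuple):
    """One recorded outcome (hypothetical schema for illustration)."""
    test_id: str   # stable test identity, e.g. "suite::test_name"
    commit: str    # commit SHA the run executed against
    passed: bool


def find_flaky(results: list[Result], min_runs: int = 2) -> set[str]:
    """Return test ids that both passed and failed on the same commit."""
    outcomes = defaultdict(list)
    for r in results:
        outcomes[(r.test_id, r.commit)].append(r.passed)

    flaky = set()
    for (test_id, _commit), runs in outcomes.items():
        # Same code, different outcomes: the flip is not explained by a change.
        if len(runs) >= min_runs and len(set(runs)) > 1:
            flaky.add(test_id)
    return flaky


history = [
    Result("test_login", "abc123", True),
    Result("test_login", "abc123", False),
    Result("test_login", "abc123", True),
    Result("test_checkout", "abc123", False),
    Result("test_checkout", "def456", True),
]
print(find_flaky(history))  # {'test_login'}
```

Here `test_checkout` is not flagged: its outcome changed, but so did the commit, so the change in code may explain the change in result.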

Detection is the prerequisite for everything else: you cannot quarantine, prioritize, or fix what you cannot see. Qualflare builds this history automatically from the results you already upload from CI and scores each test’s flakiness over time.
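How a flakiness score is computed varies by tool, and this page does not describe Qualflare's formula. As one plausible illustration only, the sketch below uses a flip rate: the fraction of consecutive runs whose outcome differs. The function names and the `min_runs` threshold are assumptions, and the outcomes are assumed to be ordered by run time.

```python
from typing import Optional


def flip_rate(outcomes: list[bool]) -> float:
    """Fraction of consecutive run pairs whose outcome differs (0.0 to 1.0)."""
    if len(outcomes) < 2:
        return 0.0
    flips = sum(a != b for a, b in zip(outcomes, outcomes[1:]))
    return flips / (len(outcomes) - 1)


def flakiness_score(outcomes: list[bool], min_runs: int = 10) -> Optional[float]:
    """Return the flip rate, or None when the history is too short to trust."""
    if len(outcomes) < min_runs:
        return None
    return flip_rate(outcomes)


print(flakiness_score([True] * 12))          # 0.0  (stable)
print(flakiness_score([True, False] * 6))    # 1.0  (flips on every run)
print(flakiness_score([True, False]))        # None (not enough history)
```

A flip rate separates a flaky test from a genuine regression: a test that passes six times and then fails six times flips only once, so it scores low even though half its runs failed.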

Requires historical pass/fail data across runs, not one execution.
Flags tests that flip outcome without a code change.
Prerequisite for quarantine, prioritization, and root-cause fixing.

Sources

Google Testing Blog — Flaky Tests at Google

See it in your own test results

Qualflare detects flaky tests, clusters failures by root cause, and scores release risk from the test results you already produce in CI. Start free.

Last reviewed June 11, 2026
