How to set up flaky test detection in CI
To detect flaky tests automatically, have your test runner write a results file, upload it after every CI run, and let a tool score each test from its pass/fail history across builds. You cannot spot a flaky test from one run — detection needs history. Here is the five-step setup, framework-agnostic.
- 1
Make your test runner emit a results file
Flaky-test detection needs machine-readable output, not console logs. Configure your runner to write a results file: JUnit XML (pytest’s --junitxml option, the jest-junit reporter for Jest, Maven Surefire reports for JUnit) or your framework’s JSON reporter (Playwright, Cypress). If you already publish test results in CI, you have this file.
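To make "machine-readable" concrete, here is a minimal Python sketch that reads per-test outcomes from a JUnit XML file. The sample document and the `classname::name` identifier format are illustrative assumptions; real reports from your runner may carry more attributes and nesting.

```python
import xml.etree.ElementTree as ET


def read_junit_outcomes(xml_text: str) -> dict[str, str]:
    """Map 'classname::name' to 'passed', 'failed', or 'skipped'."""
    root = ET.fromstring(xml_text)
    outcomes: dict[str, str] = {}
    # <testcase> elements may sit under <testsuite> or <testsuites>;
    # iter() walks the whole tree, so both layouts work.
    for case in root.iter("testcase"):
        test_id = f"{case.get('classname', '')}::{case.get('name', '')}"
        if case.find("failure") is not None or case.find("error") is not None:
            outcomes[test_id] = "failed"
        elif case.find("skipped") is not None:
            outcomes[test_id] = "skipped"
        else:
            outcomes[test_id] = "passed"
    return outcomes


# Hypothetical sample report, for illustration only.
SAMPLE = """<testsuite name="example" tests="3">
  <testcase classname="tests.test_cart" name="test_add_item"/>
  <testcase classname="tests.test_cart" name="test_checkout">
    <failure message="timeout waiting for payment"/>
  </testcase>
  <testcase classname="tests.test_cart" name="test_coupon"><skipped/></testcase>
</testsuite>"""

if __name__ == "__main__":
    for test_id, outcome in read_junit_outcomes(SAMPLE).items():
        print(f"{outcome:8} {test_id}")
```

Console logs give you none of this structure, which is why a results file is the prerequisite for tracking each test across builds.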
- 2
Install and authenticate the Qualflare CLI in CI
Add the Qualflare CLI to your pipeline and authenticate it once with an access token stored as a CI secret. No test code changes are required — the CLI is an extra step after your tests run.
- 3
Upload results after every test run
Add one command after your test step — qf <project> collect <results-file>. The CLI attaches your Git branch and commit, so every CI run becomes a tracked launch tied to the exact code it ran against.
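One detail worth getting right: the upload has to run even when the tests fail, otherwise failing runs never reach the history. The Python sketch below shows one way to wire that up, assuming only the command shape given above (`qf <project> collect <results-file>`). The project name `my-project`, the pytest invocation, and the helper names are hypothetical, and no additional CLI flags are assumed.

```python
import subprocess


def collect_command(project: str, results_file: str) -> list[str]:
    # Command shape taken from the text: qf <project> collect <results-file>
    return ["qf", project, "collect", results_file]


def run_tests_then_upload(test_cmd, project, results_file, run=subprocess.run) -> int:
    """Run the tests, upload the results regardless, return the exit code for CI."""
    tests = run(test_cmd)  # subprocess.run does not raise on a non-zero exit
    upload = run(collect_command(project, results_file))
    # A test failure takes priority; otherwise surface an upload failure.
    return tests.returncode or upload.returncode


# Typical use in a CI script (hypothetical project name and paths):
#   import sys
#   sys.exit(run_tests_then_upload(
#       ["pytest", "--junitxml=results.xml"], "my-project", "results.xml"))
```

Most CI systems offer an equivalent "always run this step" setting; the script form simply makes the ordering explicit.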
- 4
Let history accumulate across builds
Flakiness can only be seen across runs, not from a single result. After a handful of builds on the same branches, Qualflare has enough pass/fail history per test to tell intermittent failures apart from genuine ones.
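To illustrate why history separates intermittent failures from genuine ones, here is a simplified Python sketch. It is not Qualflare's scoring algorithm; it assumes a hypothetical history of `(test_id, commit, passed)` records and scores each test by the fraction of commits on which it both passed and failed.

```python
from collections import defaultdict


def flaky_scores(history):
    """history: iterable of (test_id, commit, passed) tuples.

    Returns {test_id: fraction of commits with at least two runs
    on which the test both passed and failed}.
    """
    runs = defaultdict(lambda: defaultdict(list))
    for test_id, commit, passed in history:
        runs[test_id][commit].append(passed)

    scores = {}
    for test_id, by_commit in runs.items():
        repeated = [o for o in by_commit.values() if len(o) >= 2]
        if not repeated:
            continue  # one run per commit: not enough history to judge
        mixed = sum(1 for o in repeated if any(o) and not all(o))
        scores[test_id] = mixed / len(repeated)
    return scores


# Hypothetical history, for illustration only.
HISTORY = [
    ("test_login", "a1", True), ("test_login", "a1", False),   # flipped on a1
    ("test_login", "b2", True), ("test_login", "b2", True),    # stable on b2
    ("test_search", "a1", False), ("test_search", "a1", False),  # consistent failure
    ("test_new", "b2", True),                                  # single run only
]

if __name__ == "__main__":
    print(flaky_scores(HISTORY))
```

In this sample, `test_login` scores 0.5 because it flipped on one of two commits, `test_search` scores 0.0 because it fails consistently (a genuine failure), and `test_new` is omitted because a single run cannot show a flip.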
- 5
Review flaky scores and quarantine the worst offenders
Open the flaky-test view to see which tests flip outcome without code changes, ranked by how often. Quarantine the worst so they stop blocking releases, then fix the root cause — usually a timing, shared-state, or ordering issue — and let the score fall.
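If you want to turn a ranking into a quarantine list yourself, the selection can be sketched in a few lines of Python. This assumes you have a per-test flakiness score between 0 and 1 from some source; the `threshold` and `limit` values are hypothetical knobs, not settings from any particular tool.

```python
def pick_quarantine(scores: dict[str, float], threshold: float = 0.2, limit: int = 5) -> list[str]:
    """Return the worst offenders: highest score first, ties broken by name."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [test_id for test_id, score in ranked if score >= threshold][:limit]


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {"test_a": 0.5, "test_b": 0.1, "test_c": 0.9, "test_d": 0.5}
    print(pick_quarantine(example, threshold=0.2, limit=2))
```

Capping the list keeps quarantine a temporary holding area instead of a place where tests go to be forgotten.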
Framework specifics: Playwright, pytest, Cypress, Jest, and JUnit. New to the terms? See flaky test detection and test quarantine.
Frequently asked questions
Can you detect flaky tests from a single CI run?
No. A single run cannot prove a test is flaky — you need its pass/fail history across multiple runs on unchanged code. That is why detection works by accumulating results over several builds rather than judging one execution.
Do I need to change my test code to detect flaky tests?
No. Your runner can already produce a results file (JSON or JUnit XML), usually through a reporter option or plugin. You add one CLI step after the test run to upload it; detection works from that history, with no test rewrites.
Should I just retry flaky tests instead?
Retries keep builds green but hide the signal — a test that only passes on retry is still flaky and may be masking a real intermittent bug. Detect and track flakes first, quarantine to stay unblocked, then fix root causes.
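If you do keep retries, record every attempt instead of only the final result, so that a pass on retry stays visible. A minimal Python sketch of that idea follows, assuming a hypothetical list of per-attempt outcomes for one test in one run; the label names are illustrative.

```python
def classify_attempts(attempts: list[bool]) -> str:
    """Classify one test in one run from its ordered attempt outcomes."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    if attempts[0]:
        return "passed"
    if any(attempts[1:]):
        # Green build, but the first attempt failed: keep this signal.
        return "passed_on_retry"
    return "failed"


if __name__ == "__main__":
    print(classify_attempts([False, False, True]))
```

Tests labelled `passed_on_retry` are the ones a plain retry policy would report as green, so they are worth tracking alongside outright failures.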
Detect flaky tests on your suite
Upload your CI results and let Qualflare score flaky tests from their history — free to start.