qa-flake-triage
Flake triage: 4 skills (flake-dashboard-author, flake-pattern-reference, flake-remediation-guide, flaky-test-quarantine) and 5 agents (ai-flake-detector, e2e-flake-bisector, e2e-test-trend-reporter, parallel-isolation-checker, regression-bisector).
Install this plugin
/plugin install qa-flake-triage@testland-qa
Part of role bundles: qa-starter, qa-role-automation-engineer, qa-role-sdet
Flake triage workflow: bisector, parallel-isolation checker, regression bisector, AI-pattern flake detector, trend reporter, and quarantine workflow.
Components
| Type | Name | Description |
|---|---|---|
| Skill | flaky-test-quarantine | Quarantine workflow: mark, annotate (rate + bisect link + expiry), auto-expiry report, two-renewal cap. |
| Skill | flake-pattern-reference | Reference catalog of 8 flake patterns (timing, ordering, shared state, leaks, network, locator, environment, randomness) with detection signals + remediation. |
| Agent | e2e-flake-bisector | Vary one axis at a time (worker count, random order, network throttle, viewport, animations, OS, sequential reps) over N runs to localize the flake source. |
| Agent | parallel-isolation-checker | Find shared state two workers collide on: DB row, schema, file path, port, env-var, module singleton, browser context - with file:line evidence. |
| Agent | regression-bisector | git bisect run orchestrator: build the test script, mark good/bad, handle exit-125 skips, report the introducing commit. |
| Agent | ai-flake-detector | Predictive screen: ranks currently-green tests by flakiness risk (passing→flaky transitions, duration variance, fixed-sleep patterns, cross-suite ordering). |
| Agent | e2e-test-trend-reporter | Weekly / monthly suite health report with week-over-week deltas (pass rate, flakiness rate, top failures, time-to-green, quarantine count). |
| Skill | flake-dashboard-author | Build a persistent flakiness dashboard from run history (Grafana / Datadog CI Visibility). |
| Skill | flake-remediation-guide | Per-pattern code fixes for each flake class cataloged in flake-pattern-reference. |
Install
/plugin marketplace add testland/qa
/plugin install qa-flake-triage@testland-qa
Skills
flake-dashboard-author
Builds a persistent flakiness dashboard from JUnit XML or JSON CI run history: defines the flake-rate metric (failures per test over a configurable window), authors the data model, generates a Grafana time-series panel JSON or configures a Datadog CI Visibility view, derives the quarantine-candidate query, and wires trend alerts. Use when a team needs a long-lived observability surface for test reliability that outlasts any single weekly report.
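The flake-rate metric described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: it assumes run history has already been parsed into per-execution records, and `RunRecord`, `flake_rate`, the 14-day default window, and the choice to normalize failures by run count are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RunRecord:
    """One test execution parsed from CI history (hypothetical shape)."""
    test_id: str
    finished_at: datetime
    passed: bool


def flake_rate(records, now, window=timedelta(days=14)):
    """Return {test_id: failures / runs} for runs inside the window.

    Assumption: the metric is normalized by the number of runs so that
    tests executed at different frequencies stay comparable.
    """
    runs = defaultdict(int)
    failures = defaultdict(int)
    cutoff = now - window
    for record in records:
        if record.finished_at < cutoff:
            continue  # older than the configurable window
        runs[record.test_id] += 1
        if not record.passed:
            failures[record.test_id] += 1
    return {test_id: failures[test_id] / runs[test_id] for test_id in runs}
```

A quarantine-candidate query could then be as simple as filtering this mapping by a threshold the team chooses.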
flake-pattern-reference
Reference catalog of flake patterns - async/timing, test ordering, shared parallel state, resource leaks, network, locator drift, environment variance, randomness - with detection heuristics and remediation per pattern. Use when triaging an unknown flake to identify the category before bisecting.
flake-remediation-guide
Provides concrete code-level fixes for each of the eight recurring flake patterns cataloged in flake-pattern-reference: replacing fixed sleeps with framework auto-waits, isolating state in beforeEach fixtures, adopting stable role-based locators, mocking network and clock, seeding RNG, closing leaked resources, and the Pattern 3 shared-parallel-state fix (per-worker DB schema via workerIndex). Use when a flake has already been classified by pattern and the engineer needs the specific code change to apply. Distinct from parallel-isolation-checker, which detects shared-parallel-state problems rather than applying the fix.
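Two of the fixes listed above, replacing a fixed sleep with a condition wait and deriving a per-worker schema name, might look like the sketch below. It is framework-agnostic and only illustrative: `wait_until` and `worker_schema` are hypothetical helpers, and a real suite would normally prefer its framework's built-in auto-waits and worker index.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Stand-in for a fixed `time.sleep(n)`: the wait ends as soon as the
    condition holds instead of always paying the full delay.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def worker_schema(base, worker_index):
    """Build a schema name unique to one parallel worker (hypothetical
    naming scheme), so workers never write to the same tables."""
    return f"{base}_w{worker_index}"
```

For example, `wait_until(lambda: job.done, timeout=10)` would replace a `time.sleep(10)` before an assertion, where `job` is whatever object the test is waiting on.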
flaky-test-quarantine
Builds a quarantine workflow for flaky tests - marks the test with the framework's skip/fixme/retry annotation, records the failure-rate observation and a bisect link in the annotation body, sets an auto-expiry date, and produces a CI report listing every quarantined test that has expired and needs re-evaluation. Use when a flaky test is blocking the trunk and must be removed from the gating path without losing track of it.
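The bookkeeping behind such a workflow can be sketched as a small record plus two helpers. The field names, the `Quarantine` class, and the rule that an entry counts as expired once its date has passed are assumptions for illustration. The two-renewal cap is taken from the component table.

```python
from dataclasses import dataclass, replace
from datetime import date

MAX_RENEWALS = 2  # two-renewal cap, per the component table


@dataclass(frozen=True)
class Quarantine:
    """Annotation data attached to a quarantined test (hypothetical shape)."""
    test_id: str
    failure_rate: float
    bisect_link: str
    expires: date
    renewals: int = 0


def expired(entries, today):
    """List entries whose expiry date has passed and need re-evaluation."""
    return [entry for entry in entries if entry.expires < today]


def renew(entry, new_expiry):
    """Extend a quarantine, refusing once the renewal cap is reached."""
    if entry.renewals >= MAX_RENEWALS:
        raise ValueError(f"{entry.test_id}: renewal cap reached, fix or delete")
    return replace(entry, expires=new_expiry, renewals=entry.renewals + 1)
```

A CI job could call `expired(...)` on every run and fail or warn when the returned list is non-empty.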
Agents
ai-flake-detector
Reads historical CI test results (JUnit XML or vendor JSON) and predicts which currently-green tests are likely to go flaky next, using signals from the 8-pattern catalog (test size correlation, async waits with fixed sleeps, parallel-execution heuristics). Returns a ranked watchlist with rationale per test. Use proactively as a weekly screen across a large suite to focus prevention effort before the test starts failing.
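One way to picture the ranking step is a score built from two of the signals named in the component table: pass/fail transitions and duration variance. The sketch below is a simplified assumption, not the agent's actual model. The unweighted sum, the `risk_score` and `watchlist` names, and the input shape are all hypothetical.

```python
from statistics import mean, pstdev


def risk_score(outcomes, durations):
    """Combine outcome flip rate with duration variability.

    outcomes: list of bools, oldest first (True = passed).
    durations: run times in seconds for the same runs.
    """
    flips = sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)
    flip_rate = flips / max(len(outcomes) - 1, 1)
    avg = mean(durations) if durations else 0.0
    # Coefficient of variation: spread relative to the typical duration.
    variation = pstdev(durations) / avg if avg > 0 else 0.0
    return flip_rate + variation


def watchlist(history):
    """Rank currently-green tests (last run passed) by descending risk.

    history: {test_id: (outcomes, durations)}.
    """
    green = {t: h for t, h in history.items() if h[0] and h[0][-1]}
    scored = [(t, risk_score(*h)) for t, h in green.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```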
e2e-flake-bisector
Runs a target end-to-end test N times under varied conditions (worker isolation, test order, viewport, network throttling, parallelism) to identify the axis along which the flake reproduces. Returns a probable root cause classified against the 8 flake patterns plus a numeric reproduction rate per axis. Use when a test has been flagged flaky and the team needs to know which condition triggers the failure.
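The per-axis reproduction rate can be sketched as a loop that holds everything constant except one axis value at a time. Here `run_test` is a hypothetical callable supplied by the caller that executes the target test under the given condition and returns True on a pass. The axis names and the default of 20 runs are illustrative only.

```python
def reproduction_rates(run_test, axes, n=20):
    """Run the test n times per (axis, value) and return failure fractions.

    run_test(axis, value) -> bool, True when the test passes.
    axes: {axis_name: [values to try]}.
    """
    rates = {}
    for axis, values in axes.items():
        for value in values:
            failures = sum(1 for _ in range(n) if not run_test(axis, value))
            rates[(axis, value)] = failures / n
    return rates


def most_suspect(rates):
    """Return the (axis, value) pair with the highest reproduction rate."""
    return max(rates, key=rates.get)
```

A call such as `reproduction_rates(run, {"workers": [1, 4], "viewport": ["mobile", "desktop"]})` would yield one rate per condition, which can then be matched against the pattern catalog.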
e2e-test-trend-reporter
Generates a periodic (weekly / monthly) test-suite health report from CI history - total runs, suite duration, flakiness rate, top failing tests, time-to-green per PR, week-over-week deltas. Emits a markdown summary suitable for a team Slack channel or wiki page. Use as a scheduled CI job to keep test health visible.
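The week-over-week section of such a report could be rendered along these lines. This is a sketch under the assumption that each period's metrics have already been aggregated into a flat mapping of metric name to number. The `wow_report` name and the two-decimal formatting are hypothetical.

```python
def wow_report(previous, current):
    """Render a markdown table of metrics with week-over-week deltas.

    previous, current: {metric_name: numeric value}. Metrics missing
    from `previous` are shown with "n/a" instead of a delta.
    """
    lines = [
        "| Metric | Previous | Current | Delta |",
        "|---|---|---|---|",
    ]
    for name, cur in current.items():
        prev = previous.get(name)
        prev_text = "n/a" if prev is None else f"{prev:.2f}"
        delta_text = "n/a" if prev is None else f"{cur - prev:+.2f}"
        lines.append(f"| {name} | {prev_text} | {cur:.2f} | {delta_text} |")
    return "\n".join(lines)
```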
parallel-isolation-checker
Inspects a test suite that flakes under parallel execution and identifies the specific shared state - DB rows, env vars, files, ports, lockfiles, or global module state - that workers are colliding on. Runs targeted instrumentation around suspect resources, correlates each test's writes with another worker's reads, and reports the colliding resource with file:line evidence. Use after `e2e-flake-bisector` has implicated parallel execution.
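The correlation step, matching one worker's writes against another worker's accesses to the same resource, can be sketched as below. The `Access` record and the resource-key strings are hypothetical. How accesses get recorded (the instrumentation itself) is outside this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One instrumented touch of a shared resource (hypothetical shape)."""
    worker: int
    resource: str   # e.g. "db:users/1" or "port:8080"
    mode: str       # "read" or "write"
    location: str   # "file:line" of the test code


def collisions(accesses):
    """Return (resource, writer_location, other_location) for every write
    that another worker also reads or writes. Read-only sharing is ignored."""
    by_resource = defaultdict(list)
    for access in accesses:
        by_resource[access.resource].append(access)
    found = []
    for resource, items in by_resource.items():
        for writer in (a for a in items if a.mode == "write"):
            for other in items:
                if other.worker == writer.worker:
                    continue
                # Report each write/write pair once, not in both directions.
                if other.mode == "write" and other.worker < writer.worker:
                    continue
                found.append((resource, writer.location, other.location))
    return found
```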
regression-bisector
Orchestrates `git bisect` against a target test or build script to identify the introducing commit of a regression. Wraps the bad/good marking, the `git bisect run` script, the 125 exit code for unbuildable revisions, and the final culprit report. Use when a test that previously passed has started failing 100% of the time on the trunk.
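The script handed to `git bisect run` mainly has to translate build and test results into the exit codes git expects: 0 for good, 125 for "skip this revision", and another code from 1 to 127 for bad. A minimal sketch, assuming separate build and test commands, follows. `bisect_step`, `run_step`, and the injectable `run` parameter are hypothetical names.

```python
import subprocess

SKIP = 125  # git bisect run treats 125 as "revision cannot be tested"


def run_step(cmd):
    """Run a command (list of args) and return its exit status."""
    return subprocess.run(cmd).returncode


def bisect_step(build_cmd, test_cmd, run=run_step):
    """Return the exit code git bisect run should see for this revision."""
    if run(build_cmd) != 0:
        return SKIP  # unbuildable revision: skip rather than blame it
    # 0 marks the revision good; 1 marks it bad.
    return 0 if run(test_cmd) == 0 else 1
```

A wrapper script would end with `sys.exit(bisect_step(build_cmd, test_cmd))` for the project's own commands, and be driven by `git bisect start`, `git bisect bad`, `git bisect good <rev>`, then `git bisect run python <script>`.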