flag-state-coverage-builder

Workflow-driven skill that builds a flag-state coverage matrix from the project's flag inventory and risk register. Walks through: inventorying flags (grep for flag-evaluation calls), classifying each (boolean / multi-variant / kill-switch / experiment), choosing the coverage strategy (per-flag-isolation / pairwise / full / risk-driven per feature-flag-test-matrix-reference), generating the test matrix (PICT for pairwise; manual for risk-driven), and emitting test skeletons. Use when introducing flag-test coverage to a new codebase or when a flag-related incident exposes a coverage gap. Composes feature-flag-test-matrix-reference.

Overview

Building a flag-state coverage matrix from scratch is hard because the combinatorics explode. This skill walks through producing a realistic coverage matrix - not exhaustive, but sufficient.

The output: a coverage-matrix YAML + per-cell test skeletons + gaps documented for follow-up.

When to use

  • New codebase adopting feature flags; no test coverage yet.
  • A flag-related incident exposed a coverage gap; need to catch up.
  • Adopting a new flag platform; existing tests need re-pointing.
  • Periodic audit of flag-test coverage.

Step 1 - Inventory flags

Grep for SDK calls:

# Generic
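# One --include per extension: grep globs do not expand {a,b} lists
grep -rn 'isOn\|isEnabled\|variation\|getFeatureValue' \
  --include='*.ts' --include='*.js' --include='*.py' --include='*.go' --include='*.java' .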

# Per-platform
grep -rn 'launchdarkly\|ld_client' .         # LD
grep -rn 'unleash\.isEnabled' .               # Unleash
grep -rn 'flagsmith\.get_' .                  # Flagsmith
grep -rn 'gbClient\.\|growthbook' .           # GrowthBook

Output: a flag inventory:

flags:
  - name: show-new-ui
    platform: launchdarkly
    type: boolean
    found_at:
      - src/components/Header.tsx:42
      - src/pages/Dashboard.tsx:88
  - name: checkout-experiment
    platform: launchdarkly
    type: multi-variant
    variants: [control, treatment-a, treatment-b]
    found_at:
      - src/pages/Checkout.tsx:120
  # ...
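
The grep output can be folded into the inventory shape mechanically. A minimal sketch, assuming `grep -rn` lines of the form `path:line:content` and flag keys passed as string literals; `buildInventory` and the `FLAG_CALL` pattern are hypothetical names, and `platform` / `type` still have to be filled in by hand:

```typescript
// One inventory entry, mirroring the name / found_at fields of the YAML above.
interface FlagEntry {
  name: string;
  found_at: string[];
}

// Assumed pattern: an evaluation call whose first argument is a quoted flag key,
// e.g. isEnabled('show-new-ui') or variation("checkout-experiment", ...).
const FLAG_CALL = /(?:isOn|isEnabled|variation|getFeatureValue)\(\s*['"]([^'"]+)['"]/;

// Turn `grep -rn` lines ("path:line:content") into inventory entries.
function buildInventory(grepLines: string[]): FlagEntry[] {
  const entries: FlagEntry[] = [];
  for (const line of grepLines) {
    const record = /^([^:]+):(\d+):(.*)$/.exec(line);
    if (!record) continue; // not a path:line:content record
    const call = FLAG_CALL.exec(record[3]);
    if (!call) continue; // grep hit without a literal flag key; review by hand
    const name = call[1];
    let entry = entries.filter((e) => e.name === name)[0];
    if (!entry) {
      entry = { name, found_at: [] };
      entries.push(entry);
    }
    entry.found_at.push(`${record[1]}:${record[2]}`);
  }
  return entries;
}

const inventory = buildInventory([
  "src/components/Header.tsx:42:  if (client.isEnabled('show-new-ui')) {",
  "src/pages/Dashboard.tsx:88:  const on = client.isEnabled('show-new-ui');",
  "src/pages/Checkout.tsx:120:  const v = client.variation('checkout-experiment', ctx, 'control');",
]);
```

Flags whose keys are built dynamically or stored in constants will not match the pattern, so treat skipped grep hits as a manual review list rather than noise.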

Step 2 - Classify each flag

| Category | Signals | Coverage need |
|---|---|---|
| Kill-switch | Naming: `*-kill`, `disable-*`, `emergency-*` | Test on→off toggle latency |
| Experiment | Multi-variant, used in analytics | Per-variant test + assignment integrity |
| Permission-gated feature | Used with `if(flag && user.role===...)` | Test per (flag, role) cell |
| UI tweak | Used in JSX/template; no business logic | Default + each variant; low risk |
| Migration | Naming: `use-new-*`, `migrate-to-*` | Test both paths to verify equivalence |
| Plan / tier gating | Used with subscription / plan check | Per (flag, plan) cell |
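
The naming-based signals can be applied automatically as a first pass. A rough sketch, assuming only the name and type from the inventory are available; `classifyFlag` is a hypothetical helper, and the categories that depend on how the flag is used (permission-gated, UI tweak, plan / tier) are deliberately left as `unclassified` for a human to resolve at the call site:

```typescript
type Category = "kill-switch" | "migration" | "experiment" | "unclassified";

interface FlagInfo {
  name: string;
  type: "boolean" | "multi-variant";
}

// First-pass classification from naming and type signals only.
function classifyFlag(flag: FlagInfo): Category {
  // Kill-switch naming: *-kill, disable-*, emergency-*
  if (/-kill$|^disable-|^emergency-/.test(flag.name)) return "kill-switch";
  // Migration naming: use-new-*, migrate-to-*
  if (/^use-new-|^migrate-to-/.test(flag.name)) return "migration";
  // Multi-variant flags are treated as experiments until proven otherwise.
  if (flag.type === "multi-variant") return "experiment";
  // Remaining categories need someone to read the call site.
  return "unclassified";
}

const examples: Category[] = [
  classifyFlag({ name: "emergency-disable", type: "boolean" }),
  classifyFlag({ name: "use-new-auth", type: "boolean" }),
  classifyFlag({ name: "checkout-experiment", type: "multi-variant" }),
  classifyFlag({ name: "show-new-ui", type: "boolean" }),
];
```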

Step 3 - Choose coverage strategy per category

Per feature-flag-test-matrix-reference:

| Strategy | Apply to |
|---|---|
| Default-only smoke | UI tweaks (low risk) |
| Per-flag isolation | Migration flags |
| Pairwise | Permission-gated + plan-tier (interactions matter) |
| Full matrix | Kill-switches + flags with regulatory impact |
| Risk-driven | Catch-all for the rest |
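
Encoding this mapping as data keeps the strategy choice reviewable alongside the inventory. A small sketch; the identifier spellings are assumptions, and `experiment` is placed under the risk-driven catch-all because the table does not name it explicitly:

```typescript
type Category =
  | "ui-tweak"
  | "migration"
  | "permission-gated"
  | "plan-tier"
  | "kill-switch"
  | "regulatory"
  | "experiment";

type Strategy =
  | "default-only-smoke"
  | "per-flag-isolation"
  | "pairwise"
  | "full-matrix"
  | "risk-driven";

// Category-to-strategy table from Step 3; unlisted categories fall to risk-driven.
const STRATEGY_BY_CATEGORY: Record<Category, Strategy> = {
  "ui-tweak": "default-only-smoke",
  "migration": "per-flag-isolation",
  "permission-gated": "pairwise",
  "plan-tier": "pairwise",
  "kill-switch": "full-matrix",
  "regulatory": "full-matrix",
  "experiment": "risk-driven",
};

function strategyFor(category: Category): Strategy {
  return STRATEGY_BY_CATEGORY[category];
}
```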

Step 4 - Generate the matrix

For pairwise: use PICT (Microsoft):

# pict.txt
flag_a: on, off
flag_b: on, off
flag_c: control, treatment-a, treatment-b
user_segment: free, paid, enterprise

pict pict.txt > matrix.tsv

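PICT emits a pairwise-covering matrix: on the order of 9 to 12 tests instead of the 36 (2 × 2 × 3 × 3) a full matrix needs. Nine is the floor, because flag_c and user_segment alone have 3 × 3 = 9 value pairs to cover.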
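
To see why pairwise coverage needs far fewer rows than the full product, here is a naive greedy cover over the same four parameters. This is an illustration only and not PICT's algorithm; `fullMatrix`, `pairsOf`, and `pairwise` are hypothetical names, and the row count it produces may differ from PICT's:

```typescript
type Params = Record<string, string[]>;
type Row = Record<string, string>;

// Every combination of every parameter value (the full matrix).
function fullMatrix(params: Params): Row[] {
  let rows: Row[] = [{}];
  for (const name of Object.keys(params)) {
    const next: Row[] = [];
    for (const row of rows) {
      for (const value of params[name]) next.push({ ...row, [name]: value });
    }
    rows = next;
  }
  return rows;
}

// All cross-parameter value pairs that a single row exercises.
function pairsOf(row: Row): string[] {
  const names = Object.keys(row);
  const pairs: string[] = [];
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      pairs.push(`${names[i]}=${row[names[i]]}&${names[j]}=${row[names[j]]}`);
    }
  }
  return pairs;
}

// Greedy cover: repeatedly pick the row that covers the most uncovered pairs.
function pairwise(params: Params): Row[] {
  const candidates = fullMatrix(params);
  const uncovered: Record<string, boolean> = {};
  for (const row of candidates) for (const p of pairsOf(row)) uncovered[p] = true;
  let remaining = Object.keys(uncovered).length;
  const chosen: Row[] = [];
  while (remaining > 0) {
    let best = candidates[0];
    let bestGain = -1;
    for (const row of candidates) {
      const gain = pairsOf(row).filter((p) => uncovered[p]).length;
      if (gain > bestGain) {
        best = row;
        bestGain = gain;
      }
    }
    for (const p of pairsOf(best)) {
      if (uncovered[p]) {
        uncovered[p] = false;
        remaining--;
      }
    }
    chosen.push(best);
  }
  return chosen;
}

const params: Params = {
  flag_a: ["on", "off"],
  flag_b: ["on", "off"],
  flag_c: ["control", "treatment-a", "treatment-b"],
  user_segment: ["free", "paid", "enterprise"],
};
const allRows = fullMatrix(params); // 2 * 2 * 3 * 3 = 36 rows
const pairwiseRows = pairwise(params);
```

Enumerating the full matrix first only works for small parameter sets; for real inventories, stay with PICT or a comparable tool.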

For risk-driven: combine with risk register from qa-process/risk-matrix. Cells with high impact + high likelihood become required tests.
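
The selection rule can be sketched as a simple partition. The `RiskCell` shape and three-level scale are assumptions about what the risk register provides, and routing every non-required cell to a gap-candidate list is one possible policy rather than something the register prescribes:

```typescript
type Level = "low" | "medium" | "high";

// Assumed shape of one risk-register entry mapped onto a matrix cell.
interface RiskCell {
  cell: string; // human-readable description of the flag-state combination
  impact: Level;
  likelihood: Level;
}

interface RiskPlan {
  required: RiskCell[]; // must have a test
  gapCandidates: RiskCell[]; // review, then test or document as a deliberate gap
}

// High impact + high likelihood => required test; everything else is reviewed.
function planFromRisk(cells: RiskCell[]): RiskPlan {
  const plan: RiskPlan = { required: [], gapCandidates: [] };
  for (const c of cells) {
    const isRequired = c.impact === "high" && c.likelihood === "high";
    (isRequired ? plan.required : plan.gapCandidates).push(c);
  }
  return plan;
}

const riskPlan = planFromRisk([
  { cell: "use-new-auth=on, plan=enterprise", impact: "high", likelihood: "high" },
  { cell: "theme-tweak=variant-b, plan=free", impact: "low", likelihood: "medium" },
  { cell: "flag-x=on, flag-y=on, segment=internal", impact: "high", likelihood: "low" },
]);
```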

Step 5 - Emit per-cell test skeleton

For each cell of the matrix, generate a test stub:

// tests/feature-flags/auth.test.ts
describe('auth flag matrix', () => {
  beforeEach(() => {
    td.update(td.flag('use-new-auth').booleanFlag().on(false));
  });

  test('free user, new auth off → old flow', () => {
    td.update(td.flag('use-new-auth').booleanFlag().on(false));
    expect(authFlow({ plan: 'free' })).toBe('old');
  });

  test('free user, new auth on → new flow', () => {
    td.update(td.flag('use-new-auth').booleanFlag().on(true));
    expect(authFlow({ plan: 'free' })).toBe('new');
  });

  test('paid user, new auth on → new flow', () => {
    td.update(td.flag('use-new-auth').booleanFlag().on(true));
    expect(authFlow({ plan: 'paid' })).toBe('new');
  });

  // ... per pairwise matrix
});

The platform-specific SDK setup comes from launchdarkly-testing etc.
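
Stubs like the ones above can be rendered from matrix rows rather than typed by hand. A minimal sketch, assuming each cell carries a plan, a boolean flag state, and an expected result; `renderTest` and `renderSuite` are hypothetical helpers, and the emitted text simply mirrors the `td` / `authFlow` calls used in the skeleton:

```typescript
// One matrix cell for a boolean flag crossed with a plan.
interface Cell {
  plan: string;
  flagOn: boolean;
  expected: string;
}

// Render a single test stub as source text.
function renderTest(flagKey: string, cell: Cell): string {
  const state = cell.flagOn ? "on" : "off";
  return [
    `  test('${cell.plan} user, ${flagKey} ${state} → ${cell.expected} flow', () => {`,
    `    td.update(td.flag('${flagKey}').booleanFlag().on(${cell.flagOn}));`,
    `    expect(authFlow({ plan: '${cell.plan}' })).toBe('${cell.expected}');`,
    `  });`,
  ].join("\n");
}

// Wrap all cells for one flag in a describe block.
function renderSuite(title: string, flagKey: string, cells: Cell[]): string {
  const body = cells.map((c) => renderTest(flagKey, c)).join("\n\n");
  return `describe('${title}', () => {\n${body}\n});\n`;
}

const suiteSource = renderSuite("auth flag matrix", "use-new-auth", [
  { plan: "free", flagOn: false, expected: "old" },
  { plan: "paid", flagOn: true, expected: "new" },
]);
```

The generated file is a starting point: assertions for anything beyond the returned flow name still need to be written per cell.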

Step 6 - Special category tests

Add these regardless of matrix coverage:

Kill-switch deactivation latency

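test('kill-switch deactivation propagates within 30s', async () => {
  // await is harmless if the platform's update call is synchronous
  await td.update(td.flag('emergency-disable').booleanFlag().on(false));
  expect(featureActive()).toBe(true);

  await td.update(td.flag('emergency-disable').booleanFlag().on(true));
  // Test mode applies updates instantly, so this checks the toggle path only.
  // A real SDK may add polling or streaming delay; verify the 30s budget
  // against a live connection.
  expect(featureActive()).toBe(false);
});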
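
Because test mode is instant, the latency itself has to be exercised some other way. One option is a fake clock. The sketch below uses a hypothetical `FakePollingClient` (not any vendor's API) to show how propagation delay depends on where in the poll cycle the toggle lands:

```typescript
// Hypothetical stand-in for a polling SDK client, driven by a fake clock.
class FakePollingClient {
  private remote = false; // value held by the flag service
  private cached = false; // value the application currently sees
  private nowMs = 0;
  private lastPollMs = 0;
  private pollIntervalMs: number;

  constructor(pollIntervalMs: number) {
    this.pollIntervalMs = pollIntervalMs;
  }

  setRemote(value: boolean): void {
    this.remote = value;
  }

  // Advance the fake clock, polling each time a full interval has elapsed.
  advance(ms: number): void {
    this.nowMs += ms;
    while (this.nowMs - this.lastPollMs >= this.pollIntervalMs) {
      this.lastPollMs += this.pollIntervalMs;
      this.cached = this.remote;
    }
  }

  isOn(): boolean {
    return this.cached;
  }
}

// Fake-time milliseconds until the client observes `expected`, or -1 on timeout.
function propagationLatencyMs(
  client: FakePollingClient,
  expected: boolean,
  timeoutMs: number,
  stepMs: number,
): number {
  for (let elapsed = 0; elapsed <= timeoutMs; elapsed += stepMs) {
    if (client.isOn() === expected) return elapsed;
    client.advance(stepMs);
  }
  return -1;
}

const pollingClient = new FakePollingClient(10000); // 10s poll interval
pollingClient.advance(4000); // toggle lands 4s into the poll cycle
pollingClient.setRemote(true);
const latencyMs = propagationLatencyMs(pollingClient, true, 30000, 1000);
```

With a 10s interval the worst case is just under 10s, inside the 30s budget; a 60s interval would exceed it, which is the kind of misconfiguration a latency test should catch.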

Default-on-error

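test('SDK fails → default returned', async () => {
  const user = { key: 'user-1' };
  const brokenClient = simulateSDKFailure();
  // The caller-supplied default must come back unchanged for both values.
  // await covers SDKs whose variation calls return a Promise.
  expect(await brokenClient.boolVariation('any-flag', user, false)).toBe(false);
  expect(await brokenClient.boolVariation('any-flag', user, true)).toBe(true);
});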
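
`simulateSDKFailure` is not defined in this document. One way it could look is a client whose evaluation always throws, wrapped so the caller's default is returned. This is a hypothetical, synchronous sketch for illustration; real SDKs implement their own fallback internally, and the interface here is an assumption:

```typescript
interface User {
  key: string;
}

// Assumed minimal client surface, matching the call shape used in the test above.
interface BoolFlagClient {
  boolVariation(key: string, user: User, defaultValue: boolean): boolean;
}

// Wrap an evaluator so that any thrown error yields the caller's default.
function withDefaultOnError(
  evaluate: (key: string, user: User) => boolean,
): BoolFlagClient {
  return {
    boolVariation(key, user, defaultValue) {
      try {
        return evaluate(key, user);
      } catch (_err) {
        return defaultValue;
      }
    },
  };
}

// Hypothetical helper: a client whose backend is always unreachable.
function simulateSDKFailure(): BoolFlagClient {
  return withDefaultOnError(() => {
    throw new Error("flag service unreachable");
  });
}

const broken = simulateSDKFailure();
const fallbackFalse = broken.boolVariation("any-flag", { key: "user-1" }, false);
const fallbackTrue = broken.boolVariation("any-flag", { key: "user-1" }, true);
```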

Sticky-assignment

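test('user assignment sticky across sessions', async () => {
  const context = { key: 'user-1' };
  // 'control' is a placeholder default; use the flag's real fallback value.
  // For a true cross-session check, evaluate with two separately
  // initialised clients instead of one.
  const v1 = await client.variation('rollout', context, 'control');
  const v2 = await client.variation('rollout', context, 'control');
  expect(v1).toEqual(v2);
});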
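
Stickiness usually holds because assignment is a deterministic function of the flag key and user key rather than stored state. The sketch below illustrates that idea with a simple 32-bit polynomial hash; the hash and the `assignVariant` helper are assumptions for illustration and do not reproduce any platform's bucketing algorithm:

```typescript
// Simple 32-bit polynomial string hash (illustrative; not a vendor algorithm).
function hashKey(input: string): number {
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0; // keep within 32 bits
  }
  return hash;
}

// Deterministic assignment: same flag key + user key always gives the same variant.
function assignVariant(flagKey: string, userKey: string, variants: string[]): string {
  return variants[hashKey(`${flagKey}:${userKey}`) % variants.length];
}

const rolloutVariants = ["control", "treatment-a", "treatment-b"];
// Two independent evaluations stand in for two sessions.
const firstSession = assignVariant("rollout", "user-1", rolloutVariants);
const secondSession = assignVariant("rollout", "user-1", rolloutVariants);
```

If a platform's assignment also depends on mutable inputs such as rollout percentage or targeting rules, changing those can legitimately move a user, so a stickiness test should hold them fixed.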

Step 7 - Document coverage + gaps

Emit a coverage doc:

# Flag-Test Coverage Matrix

## Covered cells

| Flag | Strategy | Cells | Test file |
|---|---|---|---|
| show-new-ui | per-flag isolation | 2 | tests/flags/show-new-ui.test.ts |
| checkout-experiment | pairwise (3 flags) | 9 | tests/flags/checkout-pairwise.test.ts |
| auth-migration | full matrix (2 flag states × 3 plans) | 6 | tests/flags/auth.test.ts |

## Documented gaps (deliberate)

| Cell | Reason | Mitigation |
|---|---|---|
| flag-x = on AND flag-y = on AND user.segment = `internal` | Low likelihood — internal users only see flag-y in beta | Manual verify on flag-y promotion |
| theme-tweak all 3 variants × all 5 segments | UI-only; default-on-each is sufficient | None |
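
The covered-cells table can be generated from the same data that drives the matrix, which keeps the doc from drifting. A small sketch; `CoverageRow` and `renderCoverageTable` are hypothetical names:

```typescript
// One row of the "Covered cells" table.
interface CoverageRow {
  flag: string;
  strategy: string;
  cells: number;
  testFile: string;
}

// Render the covered-cells table in the same pipe-table layout as above.
function renderCoverageTable(rows: CoverageRow[]): string {
  const header = ["| Flag | Strategy | Cells | Test file |", "|---|---|---|---|"];
  const body = rows.map(
    (r) => `| ${r.flag} | ${r.strategy} | ${r.cells} | ${r.testFile} |`,
  );
  return header.concat(body).join("\n");
}

const coverageTable = renderCoverageTable([
  {
    flag: "show-new-ui",
    strategy: "per-flag isolation",
    cells: 2,
    testFile: "tests/flags/show-new-ui.test.ts",
  },
  {
    flag: "checkout-experiment",
    strategy: "pairwise (3 flags)",
    cells: 9,
    testFile: "tests/flags/checkout-pairwise.test.ts",
  },
]);
```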

Anti-patterns

| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Build matrix without inventory | Flags missed silently | Always grep first |
| Pairwise on truly-independent flags | Wasted tests | Identify interactions; pair-test only interacting flags |
| Full matrix on 20+ flags | At least 2^20 (about a million) tests; infeasible | Pairwise or risk-driven |
| Don't document gaps | Future maintainers can't tell deliberate gaps from oversights | Coverage doc with gaps + reason |
| One mega-test file for all flags | Failures opaque | One file per flag (or flag-pair) |
| Skip platform-specific override-mode | Tests pass against mock; prod-SDK-specific bugs hide | Use platform's TestData/bootstrap |
| Skip kill-switch test | "It worked in dev" | Always test |
| Coverage matrix not committed / no review | Drift unnoticed | Commit flag-coverage.yaml to the repo and review changes |

Output

This skill produces:

  • A flag inventory (Step 1).
  • A coverage matrix (Step 4) committed as flag-coverage.yaml.
  • Per-cell test skeletons (Step 5).
  • Special-category tests (Step 6).
  • A coverage doc with explicit gaps (Step 7).
