framework-choice-advisor

Pure reference catalog for picking a test automation framework - covers Playwright / Cypress / Selenium / WebdriverIO / Appium / Espresso / XCUITest / RestAssured / Karate / k6 / Locust with side-by-side tradeoffs on speed, cross-browser, mobile, parallelisation, language support, ecosystem maturity, CI integration; a decision tree for matching project NFRs to framework choice; and reference directory / fixture / CI layouts for the chosen stack. Distinct from the per-framework skills (`playwright-testing`, `cypress-testing`, etc.) which document configuration once a framework is chosen - this skill is the **upstream selection step**. Distinct from `test-pyramid-balancer` (which tunes the layer mix for an existing suite). Use when starting a new test-automation suite from scratch, before installing any tool.

Overview

A team is starting a new test automation suite and needs to pick the stack: framework, runner, assertion library, reporter, CI integration, fixture system, parallelisation strategy, retry policy. Most "AI for testing" tooling claims to scaffold the whole framework in one shot. That is exactly the integration-friction failure mode flagged by the World Quality Report 2025-26, in which 37% of teams cite integration friction as the dominant AI-in-testing blocker. The honest deliverable is decision support, not auto-scaffolding.

This skill is a pure reference: a decision tree and a tradeoff matrix the team uses as a checklist. It does not generate framework boilerplate. After the team picks a stack, the per-framework skills (playwright-testing, cypress-testing, etc.) document the configuration; this skill stops at "you picked Playwright with its bundled Playwright Test runner, here's the canonical directory layout to use".

When to use

  • Greenfield: starting a new test automation suite from zero.
  • Migration: the team is moving off a legacy framework (most commonly Selenium to Playwright, whose auto-waiting removes the manual-wait flakiness that dominates Selenium suites, per the Playwright actionability docs).
  • Multi-stack consolidation: the team has three frameworks across product areas and is deciding which to standardise on.
  • Hiring-driven re-evaluation: the team's skills mix shifted (e.g., from Java to TypeScript) and the framework choice should follow.

Do not use this skill when:

  • The framework is already chosen and the team needs configuration / API help - use the per-framework skill (playwright-testing, cypress-testing, etc.).
  • The team's question is "should we add E2E vs unit vs contract tests" - that's test-pyramid-balancer.
  • The team's question is mobile-native vs hybrid app selection - use the qa-mobile plugin's documentation directly.

Step 1 - Frame the decision against the project's NFRs

Six NFR axes drive framework choice. Score each from 1 to 5 for the project, then rank them by priority. The framework matrix in Step 2 uses these scores.

| NFR axis | Question |
| --- | --- |
| Cross-browser scope | Is multi-browser execution required? (Chromium-only? + Firefox + WebKit? + IE/Edge legacy?) |
| Mobile scope | Real device + emulator? Hybrid app webview? Native-only? Or web-mobile-viewport only? |
| Team language | What languages do the engineers already know? (Framework-language mismatch is the #1 maintenance cost.) |
| Execution speed | What is the target wall-clock time, in minutes, for the full suite when sharded in parallel? Is CI cost a driver? |
| Ecosystem maturity | Which third-party integrations does the team need (visual regression, accessibility, perf, contract)? |
| Hire-ability | Can the team hire engineers familiar with the framework? Smaller frameworks → smaller talent pool. |
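
The scoring step can be captured as a small typed record so the scores are written down before any framework is discussed. The sketch below is illustrative only: the `NfrAxis` keys, the `NfrScores` type, and the `rankAxes` helper are hypothetical names, and it assumes the 1 to 5 score doubles as the priority used for ranking.

```typescript
// Hypothetical names throughout; the six keys mirror the NFR axes listed above.
type NfrAxis =
  | "crossBrowser"
  | "mobile"
  | "teamLanguage"
  | "executionSpeed"
  | "ecosystem"
  | "hireAbility";

type NfrScores = Record<NfrAxis, number>;

// Check that every axis is an integer from 1 to 5, then return the axes
// ordered by score, highest first. Ties keep the order the keys were written in.
function rankAxes(scores: NfrScores): NfrAxis[] {
  const axes = Object.keys(scores) as NfrAxis[];
  for (const axis of axes) {
    const s = scores[axis];
    if (!Number.isInteger(s) || s < 1 || s > 5) {
      throw new RangeError(`${axis} must be an integer from 1 to 5, got ${s}`);
    }
  }
  return [...axes].sort((a, b) => scores[b] - scores[a]);
}

// Example project: cross-browser coverage matters most, hire-ability least.
const example: NfrScores = {
  crossBrowser: 5,
  mobile: 2,
  teamLanguage: 4,
  executionSpeed: 3,
  ecosystem: 3,
  hireAbility: 1,
};
const ranked = rankAxes(example);
console.log(ranked);
```

The ranked list is what the team carries into Step 2: the top two or three axes decide which matrix columns to read first.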

Step 2 - Framework tradeoff matrix (web E2E)

| Framework | Cross-browser | Mobile | Language | Speed (parallel) | Ecosystem | Hire-ability | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Playwright | Chromium / Firefox / WebKit native, all in one runtime | Mobile-viewport emulation; experimental Android support (Chrome for Android and Android WebView) | TS / JS / Python / .NET / Java | Excellent (auto-parallel, sharding built-in) | Strong (trace viewer, visual snapshots, fixtures, MCP integration) | High (fastest-growing 2024-26) | A common default for greenfield web E2E; its built-in auto-waiting removes the manual-wait flakiness that dominates Selenium suites (per the Playwright actionability docs). |
| Cypress | Chromium-family + Firefox + WebKit (newer) | Mobile viewport only; no real-device | JS / TS only | Good (parallel via Cypress Cloud; CLI-parallel limited) | Strong (huge plugin ecosystem) | High | Strong DX for component testing; running inside the browser limits cross-origin and iframe scenarios. |
| Selenium / WebdriverIO | All browsers via WebDriver protocol | Real device via Appium | All major languages (Java / Python / C# / JS / Ruby) | Moderate (Selenium Grid; WDIO improves on Selenium's runner) | Mature (oldest ecosystem) | Highest (historical talent pool) | Mature but more flake-prone than Playwright: async-wait is the single largest flake category at 45% (Luo et al. 2014), and Selenium leaves that synchronization manual. Migration target, not greenfield default. |
| TestCafe | All browsers; proxy-based (no WebDriver) | Mobile via emulators | JS / TS | Moderate | Smaller ecosystem | Lower | Niche; integrated runner. |
| Puppeteer | Chromium-only natively (Firefox support experimental) | Limited | JS / TS | Good | Smaller than Playwright | Lower | Mostly superseded by Playwright (the team that built Puppeteer started Playwright). |

The 2026 recommendation tree for greenfield web E2E:

  • Multi-browser required + multi-language team → Playwright.
  • Single-browser (Chromium) + JS-only team → Playwright or Cypress; Cypress has stronger component-testing DX.
  • Legacy / migration off existing Selenium suite → either incremental migration to Playwright (preferred) or modernise the Selenium suite via WebdriverIO's runner ergonomics.
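
The three branches above can be written as a checklist function. This is a minimal sketch under stated assumptions, not a recommender: the `WebE2EContext` shape and `recommendWebE2E` are hypothetical names, the legacy branch is checked first by this sketch's own choice, and combinations the list does not cover fall back to Playwright only because the matrix calls it a common greenfield default.

```typescript
// Hypothetical input shape; field names are illustrative, not part of any tool.
interface WebE2EContext {
  multiBrowserRequired: boolean;   // Firefox / WebKit needed in addition to Chromium
  teamLanguages: string[];         // e.g. ["TypeScript"] or ["Java", "Python"]
  hasLegacySeleniumSuite: boolean; // true when migrating off an existing Selenium suite
}

type WebE2EOption = "Playwright" | "Cypress" | "WebdriverIO";

// Returns the candidate list for the team to discuss, preferred option first.
function recommendWebE2E(ctx: WebE2EContext): WebE2EOption[] {
  const jsOnly =
    ctx.teamLanguages.length > 0 &&
    ctx.teamLanguages.every((l) =>
      ["javascript", "typescript"].includes(l.toLowerCase())
    );

  // Legacy / migration: incremental move to Playwright (preferred) or modernise via WebdriverIO.
  if (ctx.hasLegacySeleniumSuite) return ["Playwright", "WebdriverIO"];

  // Single-browser + JS-only team: Playwright or Cypress.
  if (!ctx.multiBrowserRequired && jsOnly) return ["Playwright", "Cypress"];

  // Multi-browser + multi-language team, plus (assumption) any uncovered combination.
  return ["Playwright"];
}

console.log(
  recommendWebE2E({
    multiBrowserRequired: false,
    teamLanguages: ["TypeScript"],
    hasLegacySeleniumSuite: false,
  })
);
```

The function returns candidates rather than a single answer on purpose: the team still makes and documents the choice.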

Step 3 - Framework tradeoff matrix (other test layers)

Mobile native

| Framework | Platform | Language | Notes |
| --- | --- | --- | --- |
| Espresso | Android native | Kotlin / Java | Google's first-party. In-process, fast, deterministic. |
| XCUITest | iOS native | Swift / Obj-C | Apple's first-party. In-process. |
| Appium | iOS + Android (and others) | All major | Cross-platform unifier; trades depth for breadth. WebDriver-based - same flake patterns as Selenium. |
| Detox | React Native | JS / TS | RN-specialist; grey-box testing. |

Decision: if the team is single-platform native (iOS only or Android only), use the first-party framework. Cross-platform → Appium, accept the WebDriver flake-tax. React-Native specifically → Detox.
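
The mobile decision rule above is short enough to encode directly. A minimal sketch follows; `pickMobileFramework` and its parameter names are hypothetical, and letting the React Native check take precedence over the platform count is this sketch's assumption.

```typescript
type MobilePlatform = "ios" | "android";
type MobileFramework = "Espresso" | "XCUITest" | "Appium" | "Detox";

function pickMobileFramework(
  platforms: MobilePlatform[],
  isReactNative: boolean
): MobileFramework {
  // React Native specifically: Detox (checked first, by assumption).
  if (isReactNative) return "Detox";

  const unique = new Set(platforms);
  if (unique.size === 0) {
    throw new Error("at least one target platform is required");
  }

  // Cross-platform: Appium, accepting the WebDriver flake-tax.
  if (unique.size > 1) return "Appium";

  // Single-platform native: the first-party framework.
  return unique.has("android") ? "Espresso" : "XCUITest";
}

console.log(pickMobileFramework(["ios", "android"], false));
```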

API / contract

| Framework | Scope | Language | Notes |
| --- | --- | --- | --- |
| RestAssured | REST API integration tests | Java / Kotlin | The JVM-default; mature, deeply integrated with JUnit / TestNG. |
| Karate | REST + SOAP + GraphQL + gRPC | Karate DSL (Cucumber-like) | DSL-first; lowers barrier for non-Java testers. |
| Schemathesis | OpenAPI / GraphQL property-based fuzzing | Python | Generative; complements example-based tests. See contract-test-scaffolder. |
| Pact | Consumer-driven contract tests | JS / JVM / Python / Go / Ruby / .NET | Different category - contract, not integration. See pact-contract-testing. |
| Postman / Newman | Collection-driven API tests | Postman DSL | UI-driven authoring; not code-first. Often used by non-engineers. |

Performance

| Framework | Scope | Language | Notes |
| --- | --- | --- | --- |
| k6 | Load + perf, code-first | JS (with TS support) | Grafana's; lowest barrier for engineers, excellent CI integration. |
| Locust | Load + perf, code-first | Python | Open-source; user-class-based modelling. |
| JMeter | Load + perf, GUI-first | XML config | Mature, ecosystem-heavy; GUI-driven authoring is the trade-off. |
| Gatling | Load + perf, code-first | Scala / Java / Kotlin | High-throughput; JVM stack. |

Step 4 - Reference directory layouts

After the team has chosen a stack, this skill provides the canonical directory layout the per-framework skill assumes. Layouts are conventions, not mandates - every project has reasons to deviate, but the canonical layout is the starting point a newcomer can read.

Playwright + Playwright Test (TypeScript) - the 2026 default for web E2E

tests/
├── e2e/
│   ├── auth/
│   │   ├── login.spec.ts
│   │   └── login.fixture.ts
│   ├── cart/
│   │   ├── add-item.spec.ts
│   │   └── checkout.spec.ts
│   └── pages/                  # Page Objects (per Martin Fowler's pattern)
│       ├── LoginPage.ts
│       ├── CartPage.ts
│       └── CheckoutPage.ts
├── helpers/
│   ├── api-client.ts            # HTTP client for setup / teardown
│   ├── test-data.ts             # Fixtures and seeds
│   └── selectors.ts             # Shared accessibility-first locators
├── fixtures/                    # Static test data
├── playwright.config.ts
├── tsconfig.json
└── package.json

Conventions:

  • One *.spec.ts per feature flow; one Page Object per page or major component.
  • Fixtures scoped to describe blocks; global fixtures are an anti-pattern (see test-code-conventions §6).
  • Page Objects per Martin Fowler's definition: "a page object wraps an HTML page... with an application-specific API." Page Objects do not make assertions; they return state or the next Page Object on navigation.
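
To make the last convention concrete, here is a minimal sketch of two Page Objects that return state or the next Page Object and leave assertions to the spec. Everything specific in it is an assumption for illustration: the `/cart` route, the labels and button names, and the `PageLike` / `LocatorLike` interfaces, which are local stand-ins for the small subset of Playwright's `Page` and `Locator` API used here (`goto`, `getByLabel`, `getByRole`, `fill`, `click`, `textContent`) so that the example compiles without the Playwright package.

```typescript
// Local stand-ins for the subset of Playwright's Locator / Page API used below.
// In a real suite these types would come from "@playwright/test".
interface LocatorLike {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
  textContent(): Promise<string | null>;
}
interface PageLike {
  goto(url: string): Promise<unknown>;
  getByLabel(text: string): LocatorLike;
  getByRole(role: string, options?: { name?: string }): LocatorLike;
}

// tests/e2e/pages/CheckoutPage.ts (hypothetical content)
class CheckoutPage {
  private readonly page: PageLike;
  constructor(page: PageLike) {
    this.page = page;
  }

  // Returns state; the spec, not the Page Object, asserts on it.
  async orderTotal(): Promise<string | null> {
    return this.page.getByRole("status", { name: "Order total" }).textContent();
  }
}

// tests/e2e/pages/CartPage.ts (hypothetical content)
class CartPage {
  private readonly page: PageLike;
  constructor(page: PageLike) {
    this.page = page;
  }

  async open(): Promise<void> {
    await this.page.goto("/cart"); // assumes a baseURL is configured
  }

  // An application-specific action built from accessibility-first locators.
  async applyCoupon(code: string): Promise<void> {
    await this.page.getByLabel("Coupon code").fill(code);
    await this.page.getByRole("button", { name: "Apply" }).click();
  }

  // Navigation returns the next Page Object.
  async proceedToCheckout(): Promise<CheckoutPage> {
    await this.page.getByRole("button", { name: "Checkout" }).click();
    return new CheckoutPage(this.page);
  }
}
```

A spec such as checkout.spec.ts would then construct `CartPage` from the test's page, call `proceedToCheckout()`, and place its own assertion on the value returned by `orderTotal()`.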

Cypress + Mocha (TypeScript)

cypress/
├── e2e/
│   ├── auth/login.cy.ts
│   └── cart/checkout.cy.ts
├── support/
│   ├── commands.ts              # Custom Cypress commands
│   ├── pages/                   # Page Objects (Cypress idiom: command-based, not class-based)
│   └── e2e.ts
├── fixtures/
├── cypress.config.ts
└── package.json

Cypress idiom prefers custom commands over class-based POMs; the directory layout reflects that.

Selenium / WebdriverIO (TypeScript or Java)

test/
├── specs/
│   ├── auth/login.spec.ts
│   └── cart/checkout.spec.ts
├── pageobjects/
│   ├── login.page.ts
│   └── cart.page.ts
├── helpers/
├── wdio.conf.ts
└── package.json

WDIO's runner ergonomics improve on raw Selenium; the layout is conventional.

Step 5 - CI integration patterns

Universal across frameworks:

| Concern | Convention |
| --- | --- |
| Parallelisation | Shard by file (Playwright --shard=X/Y, Cypress Cloud, WDIO maxInstances). Aim for a 5-10 minute wall-clock time per shard. |
| Retries | Retry once on first failure; never retry locally (only CI). Tests retried >1× are flake candidates - feed to failure-classifier. |
| Trace / video | Capture on first retry (off for green runs to save storage). Playwright: trace: 'on-first-retry' (the value in the scaffolded config; the built-in default is 'off'); Cypress + cypress-video-trim similar. |
| Reporting | JUnit XML output for the CI's test-result panel; Allure for human reporting; both via plugin. |
| Secrets | Load from CI secret store (GitHub Actions Secrets, GitLab CI Variables); never commit. |
| Environment matrix | One job per (framework, browser, environment) cell; do not mix in one job. |
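
The retry, trace, reporting, and sharding conventions can be sketched for the Playwright case as plain data plus two helpers. This is a hedged illustration: `ciConfig`, `shardsFor`, and `shardCommands` are hypothetical helpers, the `results/junit.xml` path is made up, and the even-split assumption in `shardsFor` is an idealisation. The field names (`retries`, `use.trace`, `reporter`) follow Playwright Test's configuration options; in a real playwright.config.ts the object would be passed to `defineConfig()`.

```typescript
// Plain-object sketch of the CI conventions above, for Playwright Test.
function ciConfig(isCI: boolean) {
  return {
    retries: isCI ? 1 : 0, // retry once on CI, never locally
    use: { trace: "on-first-retry" as const }, // record a trace only when a test is retried
    reporter: [["junit", { outputFile: "results/junit.xml" }]] as const, // hypothetical path
  };
}

// Number of shards needed to keep per-shard wall-clock time at or under the target,
// assuming tests split evenly across shards (real shards are rarely this even).
function shardsFor(serialMinutes: number, targetMinutes: number): number {
  return Math.max(1, Math.ceil(serialMinutes / targetMinutes));
}

// One CLI invocation per shard, using Playwright's --shard=X/Y flag.
function shardCommands(total: number): string[] {
  return Array.from(
    { length: total },
    (_, i) => `npx playwright test --shard=${i + 1}/${total}`
  );
}

// A suite that takes 45 minutes serially, with a 10-minute per-shard target.
const shards = shardsFor(45, 10);
console.log(shards, shardCommands(shards));
```

Each generated command maps to one CI job, which keeps the sharding consistent with the one-job-per-cell rule for the environment matrix.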

Step 6 - When to defer the decision

The skill recommends deferring framework choice when:

  • The product surface is too new - no stable URL / API to test against.
  • The team's required languages aren't decided yet (hiring in progress).
  • The team is debating monolith vs micro-frontend; the framework choice depends on the product architecture.

In these cases, the right output is an explicit deferral note: "no decision today; revisit when (a)/(b)/(c) resolves."
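
A deferral can be recorded in the same shape as a decision so that neither outcome gets lost. The sketch below is one possible form, with hypothetical names (`Blocker`, `decideOrDefer`) and blocker labels that paraphrase the three conditions above.

```typescript
// The three deferral conditions from this step, as hypothetical labels.
type Blocker =
  | "unstable-product-surface"
  | "languages-undecided"
  | "architecture-undecided";

interface Deferral {
  kind: "deferred";
  note: string;
}
interface Decision {
  kind: "decided";
  framework: string;
}

// Any open blocker produces an explicit deferral note instead of a choice.
function decideOrDefer(blockers: Blocker[], framework: string): Deferral | Decision {
  if (blockers.length === 0) {
    return { kind: "decided", framework };
  }
  return {
    kind: "deferred",
    note: `no decision today; revisit when ${blockers.join(" / ")} resolves`,
  };
}

console.log(decideOrDefer(["languages-undecided"], "Playwright"));
```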

Anti-patterns

| Anti-pattern | Why it fails | Fix |
| --- | --- | --- |
| Picking the framework before the NFRs are scored | Choice driven by hype, not fit; high migration cost when the wrong framework can't deliver. | Step 1 - score the NFRs first. |
| Standardising on one framework across every test layer | Different layers need different tools (Playwright for E2E ≠ k6 for perf ≠ Pact for contract). | Pick per layer; the stack is multiple frameworks. |
| Picking Selenium for greenfield in 2026 | Manual async-wait is the dominant flake category at 45% (Luo et al. 2014), and Selenium does not auto-wait. | Use Playwright for greenfield; reserve Selenium for legacy maintenance. |
| Cross-language teams picking a single-language framework | Engineers can't contribute; suite becomes one person's domain. | Either pick a multi-language framework (Playwright / Selenium) or commit to retraining. |
| Adopting a framework because a contractor used it | Contractor leaves; team can't maintain. | Hire-ability is an NFR. |
| Skipping the directory-layout convention | Every newcomer authors tests in a different shape; review burden grows. | Step 4 - pick a canonical layout up front, even if you deviate later. |
| Treating this skill as "framework recommender" rather than "decision support" | The skill recommends; the team decides. Automating the decision strips accountability. | The output of this skill is a documented choice, not an automatic install. |

Limitations

  • Coverage is the web / mobile / API / perf canonical set. Specialised layers (desktop apps via Spectron / Tauri, embedded devices, hardware-in-loop) are out of scope.
  • Tradeoff matrix is point-in-time. Frameworks evolve quickly; the 2026 ranking will be stale in 18 months. Re-read this skill before every greenfield decision.
  • Tradeoffs are illustrative. A specific product can have constraints that flip the recommendation - e.g., a Salesforce internal app where the legacy stack is JVM-only would override the "Playwright by default" guidance.
  • Hire-ability is geography-dependent. "Hire-ability" assumes US / EU markets in the matrix; for other markets, the team's local pool is the authority.
  • No automated framework scaffolding. Per the introduction, this is intentional - auto-scaffolded boilerplate is the dominant failure mode the research flags. Use the per-framework skill after the decision.
  • No closed-platform comparison. Mabl / Testim / Functionize / TestSigma are no-code platforms; they compete with this entire category, not with one framework. If the team is choosing "code vs no-code", that's a strategic decision beyond this skill's scope.

Hand-off targets

References

  • Playwright official documentation - locator hierarchy, trace viewer, parallel sharding: https://playwright.dev/
  • Cypress documentation: https://docs.cypress.io/
  • WebdriverIO documentation: https://webdriver.io/
  • Martin Fowler - Page Object pattern (canonical definition): https://martinfowler.com/bliki/PageObject.html
  • Luo et al., "An Empirical Analysis of Flaky Tests" (FSE 2014) - async-wait is the largest flake category (45%), the class Selenium leaves to manual synchronization: https://mir.cs.illinois.edu/marinov/publications/LuoETAL14FlakyTestsAnalysis.pdf
  • Playwright actionability docs - built-in auto-waiting and auto-retrying assertions that remove a major Selenium flake source: https://playwright.dev/docs/actionability
  • Capgemini World Quality Report 2025-26 - 37% cite integration friction as the dominant AI-in-testing blocker (justifies why this is decision-support, not auto-scaffolding): https://www.capgemini.com/insights/research-library/world-quality-report-2025-26/
  • ISTQB glossary - test automation framework: https://glossary.istqb.org/en_US/term/test-automation-framework
  • ISTQB glossary - keyword-driven testing (relevant to Karate / Postman DSL choice): https://glossary.istqb.org/en_US/term/keyword-driven-testing
  • ISO/IEC 25010 - quality characteristics (used in Step 1 NFR scoring): https://en.wikipedia.org/wiki/ISO/IEC_25010
  • playwright-testing, cypress-testing, selenium-testing, webdriverio-testing - downstream per-framework configuration skills.
  • test-pyramid-balancer - layer-mix tuning after the framework is in place.
  • framework-architecture-auditor - sibling agent for auditing an existing framework's architecture.