qa-role-ai
AI/ML & data-pipeline QA role bundle: one-command install of LLM evaluation, ML model testing, AI-assisted test generation, search relevance, data notebooks, and data quality.
Install this role bundle
/plugin install qa-role-ai@testland-qa
One command installs all 6 member plugins. Requires Claude Code v2.1.110+ (v2.1.143+ to enable the whole set together).
AI/ML & data-pipeline QA
Installing this plugin installs all 6 member plugins listed below.
Install
/plugin marketplace add testland/qa
/plugin install qa-role-ai@testland-qa
Claude Code resolves and installs the member plugins automatically and lists what it added. Requires Claude Code v2.1.110+ (v2.1.143+ to enable the whole set together).
What this installs
About role bundles
This is a role bundle: a plugin that ships no skills or agents of its own. It exists only to install a curated set of testing plugins together, so you adopt a whole role in one command instead of installing each plugin by hand. Prefer a narrower set? Install only the member plugins you need.
Installs these 6 plugins
qa-llm-evaluation
LLM and prompt evaluation: 7 skills (deepeval-evaluation, giskard-llm, langfuse-tracing, llm-regression-suite-author, openai-evals, promptfoo-evaluation, ragas-evaluation) and 2 agents (llm-red-team-planner, prompt-eval-reviewer). Covers the mainstream OSS LLM-eval ecosystem: Promptfoo + OpenAI Evals + DeepEval + Ragas for functional eval, Giskard for adversarial scan, Langfuse for production observability.
qa-ml-models
ML model testing: 6 skills (alibi-explainability, deepchecks-tests, evidently-monitoring, fairlearn-fairness, giskard-tests, model-performance-regression-gate) and 2 agents (data-drift-incident-responder, model-fairness-reviewer). Covers vulnerability scanning, drift monitoring, group fairness, and per-prediction explainability.
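To make "group fairness" concrete, here is a minimal sketch of one common group-fairness measure, the demographic parity gap: the spread in positive-prediction rates across groups. It is an illustration in plain Python, not the plugin's or Fairlearn's implementation. The function names (selection_rates, parity_gap) and the sample data are hypothetical.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Return the share of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(y_pred, groups):
    """Largest minus smallest selection rate; 0.0 means equal rates."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical predictions for two groups of four records each.
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = parity_gap(y_pred, groups)
```

A fairness check would typically fail a build when the gap exceeds an agreed threshold. That threshold is a policy choice and is not specified by this bundle.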
qa-ai-assisted
AI-assisted test generation + curation: 3 skills (ai-spec-coverage-mapper, ai-test-generator, model-based-test-graph-author) and 3 agents (ai-test-curator, ai-test-shallow-coverage-critic, mbt-suite-builder).
qa-search-relevance
Search relevance testing: 6 skills (elasticsearch-relevance-tests, hybrid-search-eval-author, judgment-list-author, opensearch-relevance-tests, solr-relevance-tests, vector-search-precision-tests) and 1 agent (relevance-regression-reviewer). Detects relevance regressions using IR metrics: NDCG, MRR, and Recall@k.
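For readers unfamiliar with these metrics, the following sketch shows how NDCG, reciprocal rank (averaged over queries to give MRR), and Recall@k can be computed from a judgment list, and how a metric drop might be flagged as a regression. It is a plain-Python illustration under stated assumptions, not the plugin's code: the function names, the linear-gain DCG variant, the 0.02 tolerance, and the sample judgments are all hypothetical.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant document, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not queries:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def dcg_at_k(gains, k):
    """Discounted cumulative gain with linear gain and a log2 discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, judgments, k):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(judgments.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def regressed(baseline, candidate, tolerance=0.02):
    """True when the candidate metric falls below baseline by more than tolerance."""
    return candidate < baseline - tolerance


# Hypothetical judgment list: document id -> graded relevance.
judgments = {"a": 3, "b": 2, "c": 0}
relevant = {"a", "b"}
ranking = ["c", "a", "b"]  # an irrelevant document ranked first
```

With this ranking, the first relevant document sits at position 2, so the reciprocal rank is 0.5 and only one of the two relevant documents is in the top 2. A regression check compares such values between a baseline run and a candidate run for each query set.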
qa-data-notebooks
Jupyter notebook testing: 4 skills (nbval-tests, notebook-ci-pipeline-author, papermill-tests, testbook-tests) and 1 agent (notebook-quality-reviewer). Covers full-notebook regression (nbval), function-level unit tests (testbook), and parameterized execution (papermill).
qa-data-quality
Data quality testing for analytical pipelines: 5 skills (dbt-testing, great-expectations, soda-checks, data-quality-gate, data-quality-conventions) and 2 agents (schema-diff-reviewer, data-anomaly-triager).