Testland

allure-reports

Configures Allure Report for the project (test-runner adapter installation, `allure-results` directory wiring, `categories.json` for failure classification, `history-trend.json` retention via the copy-history-between-runs pattern), runs the Allure CLI to convert `allure-results` to a static HTML site, and uploads the report as a CI artifact. Use when the team needs richer test reporting than JUnit XML - step-level attachments, per-test history, retry tracking, and severity / epic / feature labeling - across the framework-agnostic adapter ecosystem (pytest, Jest, JUnit, TestNG, NUnit, Mocha, etc.).

Overview

Allure Report is "an open-source framework-agnostic test result visualization tool" that "transforms test execution data into clear, interactive HTML reports" (allure-docs). It "supports testing frameworks across JavaScript, Python, Java, C#, PHP, and Ruby" with 30+ adapters (allure-docs).

Allure's two-stage workflow:

  1. Tests run with an Allure adapter installed. The adapter writes per-test JSON files into an allure-results/ directory.
  2. Allure CLI generates the static HTML site from allure-results/, optionally merging in history from a previous run for trend graphs.

The two artifacts are kept separate intentionally: results are runner-emitted; the report is a static deliverable.

When to use

  • The team needs richer reporting than JUnit XML - per-step attachments, screenshots on failure, history trends, severity / epic / feature labels, custom categorization of failures.
  • A multi-framework project (pytest + Jest + JUnit) needs one unified report instead of three separate ones.
  • An air-gapped or compliance-restricted environment needs static HTML output (vs SaaS test-management).

If the team only needs JUnit pass/fail and the CI already surfaces that, see junit-xml-analysis - that's a much lighter dependency.

Step 1 - Install the framework adapter

Each runner has a per-language adapter that writes to allure-results/. Examples:

| Framework | Package / coordinate | Output directory setting (typical) |
| --- | --- | --- |
| pytest | pip install allure-pytest | --alluredir=allure-results (command-line flag) |
| Jest | npm i -D allure-jest | resultsDir config option |
| Mocha | npm i -D allure-mocha | resultsDir reporter option |
| JUnit 5 | org.junit.platform:junit-platform-launcher + io.qameta.allure:allure-junit5 | allure.results.directory system property |
| TestNG | io.qameta.allure:allure-testng | allure.results.directory system property |
| NUnit / xUnit | NuGet Allure.NUnit / Allure.Xunit | directory setting in allureConfig.json |

Adapter docs are at the framework-specific subpaths under allurereport.org/docs/ (allure-docs).

Running tests with the adapter installed produces a directory of JSON files (<uuid>-result.json, <uuid>-container.json) plus <uuid>-attachment.<ext> files for screenshots / logs.
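
A quick way to confirm the adapter is wired correctly is to count what landed in the results directory before generating anything. The sketch below is a minimal illustration, not part of Allure: `summarize_results` is a hypothetical helper, and it classifies files by name only, assuming the naming described above.

```python
from collections import Counter
from pathlib import Path


def summarize_results(results_dir: str) -> Counter:
    """Count files in an allure-results directory by kind, using file names only."""
    counts: Counter = Counter()
    for path in Path(results_dir).iterdir():
        if not path.is_file():
            continue  # skip subdirectories such as history/
        name = path.name
        if name.endswith("-result.json"):
            counts["result"] += 1
        elif name.endswith("-container.json"):
            counts["container"] += 1
        elif "attachment" in name:
            counts["attachment"] += 1
        else:
            counts["other"] += 1  # e.g. categories.json
    return counts


if __name__ == "__main__":
    summary = summarize_results("allure-results")
    if summary["result"] == 0:
        raise SystemExit("no *-result.json files found: is the adapter enabled?")
    print(dict(summary))
```

A zero count of result files usually means the adapter is not active or is writing to a different directory. This check deliberately avoids parsing file contents.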

Step 2 - Generate the static report

Install the Allure CLI (see the "Installation" pages linked from allure-docs); the Allure 2 CLI also needs a Java runtime on the PATH:

# Node-based (cross-platform):
npm install -g allure-commandline
# Or via Scoop / Homebrew / apt — see allurereport.org/docs/ install pages.

Generate from the results directory:

allure generate allure-results --clean -o allure-report

--clean removes the previous allure-report/ before regenerating; -o sets the output dir.

To preview locally:

allure open allure-report

Or in one shot (useful in dev - runs a temp server with a freshly generated report):

allure serve allure-results

(See allurereport.org/docs/ for the CLI reference; specific flags have evolved across Allure 2.x and Allure 3 - pin a version in CI to avoid drift.)
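
When the generate step is driven from a build script rather than typed by hand, a thin wrapper keeps the flags in one place. The following is a minimal sketch that uses only the `allure generate` flags shown above; the function names `build_generate_cmd` and `generate_report` are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_generate_cmd(results_dir: str = "allure-results",
                       report_dir: str = "allure-report",
                       clean: bool = True) -> List[str]:
    """Build the argv for `allure generate` using the flags described above."""
    cmd = ["allure", "generate", results_dir]
    if clean:
        cmd.append("--clean")  # remove the previous report directory first
    cmd += ["-o", report_dir]
    return cmd


def generate_report(results_dir: str = "allure-results",
                    report_dir: str = "allure-report") -> None:
    """Run the Allure CLI, failing early with a clear message if it is missing."""
    if shutil.which("allure") is None:
        raise RuntimeError("allure CLI not found on PATH")
    subprocess.run(build_generate_cmd(results_dir, report_dir), check=True)


if __name__ == "__main__":
    generate_report()
```

Building the argument list separately from running it makes the command easy to log and to unit-test without the CLI installed.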

Step 3 - Configure failure categories

Per allure-categories, categories.json placed in allure-results/ defines custom failure classification. Each entry matches by message regex, trace regex, and result statuses (failed, broken, passed, skipped, unknown):

[
  {
    "name": "Ignored tests",
    "messageRegex": ".*ignored.*",
    "matchedStatuses": ["skipped"]
  },
  {
    "name": "Infrastructure problems",
    "messageRegex": ".*RuntimeException.*",
    "matchedStatuses": ["broken"]
  }
]

Per allure-categories, category objects also support traceRegex (matches against stack trace) and flaky (boolean) to mark a category as flaky-by-default.

"The array order determines the matching sequence Allure applies when categorizing test results." (allure-categories)

So order from most-specific to least-specific. A "DB connection refused" category should precede a generic "Infrastructure" category.
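
A small lint run before the file is copied into allure-results/ can catch broken regexes and misspelled statuses. The sketch below is illustrative only: `lint_categories` and `matching_categories` are hypothetical helpers, Python's `re` only approximates the regex dialect the Allure CLI uses, and whole-message matching is an assumption suggested by the `.*` wrappers in the example above.

```python
import re
from typing import Dict, List

VALID_STATUSES = {"failed", "broken", "passed", "skipped", "unknown"}


def lint_categories(categories: List[Dict]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for index, category in enumerate(categories):
        label = category.get("name") or f"entry #{index}"
        if "name" not in category:
            problems.append(f"{label}: missing 'name'")
        for key in ("messageRegex", "traceRegex"):
            if key in category:
                try:
                    re.compile(category[key])
                except re.error as exc:
                    problems.append(f"{label}: invalid {key}: {exc}")
        unknown = set(category.get("matchedStatuses", [])) - VALID_STATUSES
        if unknown:
            problems.append(f"{label}: unknown statuses {sorted(unknown)}")
    return problems


def matching_categories(categories: List[Dict], status: str, message: str) -> List[str]:
    """Names of categories whose status and message regex match, in array order."""
    names = []
    for category in categories:
        statuses = category.get("matchedStatuses")
        if statuses and status not in statuses:
            continue
        pattern = category.get("messageRegex")
        # Assumption: the regex must match the whole message, not a substring.
        if pattern and not re.fullmatch(pattern, message, flags=re.DOTALL):
            continue
        names.append(category["name"])
    return names


if __name__ == "__main__":
    cats = [
        {"name": "DB connection refused",
         "messageRegex": ".*connection refused.*", "matchedStatuses": ["broken"]},
        {"name": "Infrastructure",
         "messageRegex": ".*Exception.*", "matchedStatuses": ["broken"]},
    ]
    print(lint_categories(cats))
    print(matching_categories(cats, "broken", "SQLException: connection refused"))
```

When a message matches more than one entry, the returned list shows which category comes first in array order, which makes an overly broad early entry easy to spot.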

Step 4 - Wire history retention

Per allure-history, history is what drives the trends panel ("first failed", "last passed", flake / retry rate over time):

"Allure Report uses a history ID to link test results across multiple runs. This identifier is automatically calculated based on the test's fully-qualified name and its parameters."

For Allure 2 (the broadly-deployed version), enabling history is manual (allure-history):

  1. Generate your initial report in the allure-report directory
  2. Remove the allure-results directory
  3. Run tests
  4. Copy allure-report/history subdirectory to allure-results/history
  5. Generate the next report

In CI, that becomes:

# Before running tests
mkdir -p allure-results
if [ -d previous-allure-report/history ]; then
  cp -r previous-allure-report/history allure-results/
fi

# Run tests (writes to allure-results/)
npm test

# Generate
allure generate allure-results --clean -o allure-report

# Save the history slice for next time
# (Upload allure-report/ as an artifact; download into previous-allure-report/ on next run.)

Per allure-history: "If you follow such a routine regularly, Allure will each time keep data from up to 20 latest reports in allure-results/history."
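
For build scripts that are not shell-based, the copy step from the CI snippet above can be sketched in Python. This is a minimal illustration under the same directory layout; `seed_history` is a hypothetical helper name, and `dirs_exist_ok` requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def seed_history(previous_report: str = "previous-allure-report",
                 results_dir: str = "allure-results") -> bool:
    """Copy <previous_report>/history into <results_dir>/history.

    Returns True when history was copied, False when there was none
    (for example, on the very first run).
    """
    source = Path(previous_report) / "history"
    target = Path(results_dir) / "history"
    Path(results_dir).mkdir(parents=True, exist_ok=True)
    if not source.is_dir():
        return False
    shutil.copytree(source, target, dirs_exist_ok=True)
    return True


if __name__ == "__main__":
    copied = seed_history()
    print("history seeded" if copied else "no previous history; trends start fresh")
```

Returning a boolean instead of raising keeps the first run, which has no history yet, from failing the build.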

For Allure 3 (allure-history):

"Allure 3 simplifies this by using a single JSONL history file. You configure it in your settings file with a historyPath parameter."

Step 5 - Distinguish retries from history

Per allure-history:

"Retries: Multiple runs of the same test within a single launch (same allure-results directory)."

"History: Links to the same test across different launches, enabling trend analysis and stability tracking."

Don't clear allure-results between retries within a run - that discards the retry data. Do clear it between launches (different commits / runs), preserving only the history/ subdirectory.
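
The "clear between launches, keep history/" rule can be sketched as a small cleanup helper to run once at the start of a launch, never between retries. `reset_results_dir` and its `keep` parameter are hypothetical names introduced for illustration.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def reset_results_dir(results_dir: str = "allure-results",
                      keep: Iterable[str] = ("history",)) -> List[str]:
    """Delete everything in results_dir except the entries named in keep.

    Returns the names of the removed entries, sorted alphabetically.
    """
    root = Path(results_dir)
    root.mkdir(parents=True, exist_ok=True)
    keep_names = set(keep)
    removed = []
    for entry in sorted(root.iterdir()):
        if entry.name in keep_names:
            continue  # preserved across launches
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return removed


if __name__ == "__main__":
    print("removed:", reset_results_dir())
```

If categories.json is maintained directly inside allure-results/ rather than copied in by the build, add it to `keep` so it survives the cleanup.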

Step 6 - CI integration (GitHub Actions)

# .github/workflows/test-with-allure.yml
name: test-with-allure
on:
  pull_request:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-node@v4
        with: { node-version: '20' }

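      - name: Find the latest completed run on main
        id: prev
        env:
          GH_TOKEN: ${{ github.token }}   # needs actions: read permission
        run: |
          run_id=$(gh run list --workflow test-with-allure.yml --branch main \
            --status completed --limit 1 --json databaseId \
            --jq '.[0].databaseId // empty')
          echo "run_id=${run_id}" >> "$GITHUB_OUTPUT"
        continue-on-error: true

      - name: Restore previous Allure history
        # download-artifact@v4 only sees the current run's artifacts unless run-id is set.
        if: steps.prev.outputs.run_id != ''
        uses: actions/download-artifact@v4
        with:
          name: allure-history
          path: previous-allure-report/history   # the artifact holds the contents of history/
          run-id: ${{ steps.prev.outputs.run_id }}
          github-token: ${{ github.token }}
        continue-on-error: true   # First run has no history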

      - name: Seed history into results dir
        run: |
          mkdir -p allure-results
          if [ -d previous-allure-report/history ]; then
            cp -r previous-allure-report/history allure-results/
          fi

      - name: Run tests with Allure adapter
        run: npm test    # adapter writes to allure-results/

      - name: Install Allure CLI
        if: always()   # must run even when tests fail, or the generate step has no CLI
        run: npm install -g allure-commandline

      - name: Generate report
        if: always()
        run: allure generate allure-results --clean -o allure-report

      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: allure-report
          path: allure-report
          retention-days: 30

      - name: Save history for next run
        if: always() && github.ref == 'refs/heads/main'
        uses: actions/upload-artifact@v4
        with:
          name: allure-history
          path: allure-report/history
          retention-days: 90

if: always() is critical on every step after the test run that the report depends on: without it, a failing test step causes those steps to be skipped, and Allure matters most on failure.

For Allure history hosted on GitHub Pages, the simple-elf/allure-report-action GitHub Action automates the publish-to-Pages workflow (third-party, not first-party Allure tooling).

Step 7 - Add metadata to test cases

Allure's value compounds with metadata. The adapter exposes per-language helpers; the canonical labels are severity (blocker / critical / normal / minor / trivial), epic / feature / story (for BDD-style grouping), and owner.

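# pytest example
import allure

@allure.severity(allure.severity_level.CRITICAL)
@allure.epic('Checkout')
@allure.feature('Promo codes')
@allure.story('Apply at checkout')
def test_apply_promo_lowercase():
    ...

// Jest / Mocha example
// Runtime API of recent allure-js adapters; older adapter versions expose a global `allure` object instead.
import * as allure from 'allure-js-commons';

it('applies a lowercase promo code', async () => {
  await allure.severity('critical');
  await allure.epic('Checkout');
  await allure.feature('Promo codes');
});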

Severity drives the "blocker" / "critical" filter on the report's overview; epic / feature / story drive the BDD-grouped view. Without them, the report is a flat list - usable but missing Allure's main selling point.

Anti-patterns

| Anti-pattern | Why it fails | Fix |
| --- | --- | --- |
| Skipping the history step | Trends panel is empty; retry/flake history can't be tracked across runs. | Wire the copy-history pattern (Step 4) per allure-history. |
| Clearing allure-results mid-run | Retry information is lost - retries appear as separate failed launches. | Clear only between launches; preserve history/ subdir per allure-history. |
| categories.json ordered least-specific first | Generic "Infrastructure" category catches everything; specific categories never match. | Order most-specific first per allure-categories. |
| Allure as a substitute for JUnit XML in PR gating | Allure's report is for humans, not gating; JUnit XML is the gate. | Emit both; gate on JUnit; surface Allure as artifact. |
| One Allure adapter version + a different CLI version | Schema drift between adapter (results writer) and CLI (results reader); empty / corrupt report. | Pin both to compatible versions; bump together per allure-docs. |
| Adapter installed but no metadata | Report is a flat pass/fail list - same value as JUnit XML. | Add severity / epic / feature / story (Step 7). |
| Treating <uuid>-result.json files as durable | The file naming and schema are per-Allure-version internal contracts. | Don't post-process raw allure-results/ files; consume the generated allure-report/data/*.json instead. |
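
The "gate on JUnit; surface Allure as artifact" fix can be sketched as a small gate script that reads only the JUnit XML. This is a minimal illustration assuming the common layout in which failed tests carry a `<failure>` or `<error>` child under `<testcase>`; `count_failed_testcases` is a hypothetical helper and the report path is supplied by the caller.

```python
import sys
import xml.etree.ElementTree as ET


def count_failed_testcases(xml_path: str) -> int:
    """Count <testcase> elements that contain a <failure> or <error> child."""
    root = ET.parse(xml_path).getroot()
    return sum(
        1
        for case in root.iter("testcase")
        if case.find("failure") is not None or case.find("error") is not None
    )


if __name__ == "__main__":
    if len(sys.argv) != 2:
        raise SystemExit("usage: junit_gate.py <junit-xml-path>")
    failed = count_failed_testcases(sys.argv[1])
    print(f"{failed} failed test case(s)")
    sys.exit(1 if failed else 0)  # non-zero exit fails the CI job
```

Counting at the test-case level avoids double counting when suites are nested and parent suites repeat aggregate totals.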

Limitations

  • No PR-coverage role. Allure tells you which tests passed / failed / flaked; it doesn't measure code coverage. Pair with lcov-analysis / cobertura-analysis / jacoco-analysis.
  • History is per-CI-runner-local unless explicitly persisted. GitHub Actions artifacts have a retention cap (default 90 days); for long-running history, consider GitHub Pages or S3 hosting.
  • Allure 2 vs Allure 3 (allure-history) - Allure 2 is "stable, mature version with broader integrations"; Allure 3 is the rebuild. Many adapters still target 2.x. Don't mix versions.
  • Adapter feature parity varies. Java adapters tend to be the reference; some JS / Python adapters lag behind on step-attachment APIs. Check the per-adapter docs.

References