papermill-tests

Use Papermill to parameterize and execute notebooks in CI as regression tests - `papermill input.ipynb output.ipynb -p alpha 0.6` (CLI) or `pm.execute_notebook(...)` (Python API). Pairs with nbval (output assertion) and testbook (function unit tests) for full-coverage notebook QA.

Papermill executes notebooks programmatically with injected parameters and writes an output notebook that contains the results (see the Papermill execute docs). This pairs naturally with regression testing: run a parameterized notebook in CI, then assert on its outputs.

When to use

  • Parameterized analysis notebooks: same notebook, different inputs (per-region, per-month, per-customer-segment).
  • Production-grade notebook execution (Airflow / Argo / Prefect / cron) - papermill is the standard executor.
  • Regression test: re-run the notebook with known inputs; assert output values match expected (often paired with nbval/testbook).

Step 1 - Install

pip install papermill

Per the Papermill execute docs.

Step 2 - Tag the parameters cell

In your notebook, add the tag `parameters` to the one cell that holds the default values:

# Cell tagged "parameters"
alpha = 0.5
ratio = 0.2
input_path = "data/sales.parquet"

Papermill does not rewrite this cell. At execution time it adds a new cell tagged `injected-parameters` directly after the tagged cell, and the assignments in that new cell override the defaults.
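To make the injection behavior concrete, here is a minimal sketch that models it on a notebook held as a plain dict. It is not Papermill's implementation: the helper name `inject_parameters` and the rendering of each value with `repr` are assumptions made for illustration. It covers two cases: with a tagged cell the new cell lands directly after it, and without one it lands at the top of the notebook.

```python
import copy


def inject_parameters(nb: dict, params: dict) -> dict:
    """Return a copy of `nb` with an `injected-parameters` cell added.

    Hypothetical helper that mimics where Papermill places the cell;
    it is not Papermill's own code.
    """
    nb = copy.deepcopy(nb)
    source = "# Parameters\n" + "".join(f"{k} = {v!r}\n" for k, v in params.items())
    injected = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": source,
        "outputs": [],
        "execution_count": None,
    }
    # Find the cell tagged "parameters"; fall back to the top of the notebook.
    insert_at = 0
    for i, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            insert_at = i + 1
            break
    nb["cells"].insert(insert_at, injected)
    return nb


notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {}, "source": "import pandas as pd\n"},
        {"cell_type": "code", "metadata": {"tags": ["parameters"]},
         "source": "alpha = 0.5\nratio = 0.2\n"},
        {"cell_type": "code", "metadata": {}, "source": "print(alpha, ratio)\n"},
    ]
}
result = inject_parameters(notebook, {"alpha": 0.6, "ratio": 0.1})
print([cell["metadata"].get("tags", []) for cell in result["cells"]])
```

Because the injected cell runs after the defaults, its assignments win; any parameter that is not passed keeps its default from the tagged cell.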

Step 3 - Python API execution

import papermill as pm

pm.execute_notebook(
    'path/to/input.ipynb',
    'path/to/output.ipynb',
    parameters=dict(alpha=0.6, ratio=0.1)
)

Per the Papermill execute docs.
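The "same notebook, different inputs" case can be sketched as a small loop over parameter sets. The `region` parameter, the notebook path, and the output naming scheme below are hypothetical. `plan_runs` only builds the plan, and `run_all` hands each entry to `pm.execute_notebook`.

```python
from pathlib import Path


def plan_runs(regions, out_dir="runs"):
    """Build (output_path, parameters) pairs, one per region (hypothetical parameter)."""
    return [
        (str(Path(out_dir) / f"analysis-{region}.ipynb"), {"region": region, "alpha": 0.6})
        for region in regions
    ]


def run_all(input_path, runs):
    """Execute the notebook once per planned run."""
    import papermill as pm  # imported lazily so planning works without papermill installed

    for output_path, parameters in runs:
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        pm.execute_notebook(input_path, output_path, parameters=parameters)


runs = plan_runs(["emea", "apac"])
# run_all("path/to/input.ipynb", runs)  # enable where papermill and the notebook exist
```

Writing each run to its own output file keeps a failed run available for inspection without overwriting the others.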

Step 4 - CLI execution

# Local in/out
papermill local/input.ipynb local/output.ipynb -p alpha 0.6 -p ratio 0.1

# S3 output (needs the S3 extra: pip install "papermill[s3]")
papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p ratio 0.1

Parameter flags per the Papermill execute docs:

| Flag | Meaning |
| --- | --- |
| `-p NAME VAL` | Simple parameter (auto-typed) |
| `-r NAME VAL` | Raw string (preserved as a string) |
| `-f file.yaml` | Parameters from a YAML file |
| `-y "key: val"` | Inline YAML (supports lists, dicts) |
| `-b base64yaml` | Base64-encoded YAML |
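When another program builds the parameters, the `-b` flag sidesteps shell quoting. A minimal sketch, assuming the YAML parser accepts JSON flow syntax for simple values like these (JSON is designed to be a subset of YAML); the helper name `encode_parameters` is hypothetical, and the command is only printed, not run.

```python
import base64
import json


def encode_parameters(params: dict) -> str:
    """Serialize params as JSON (read here as YAML) and base64-encode the result."""
    payload = json.dumps(params)
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


encoded = encode_parameters({"alpha": 0.6, "regions": ["emea", "apac"]})
command = ["papermill", "local/input.ipynb", "local/output.ipynb", "-b", encoded]
print(" ".join(command))
```

The same dict could instead be written to a file and passed with `-f`, which is easier to read in CI logs.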

Step 5 - Use as regression test

import json

import nbformat
import papermill as pm


def test_analysis_with_known_inputs(tmp_path):
    # Execute the notebook with fixed inputs; the executed copy goes to a temp dir.
    out_path = tmp_path / "out.ipynb"
    pm.execute_notebook(
        'analysis.ipynb',
        str(out_path),
        parameters=dict(seed=42, n_samples=1000),
    )

    # Assumes the last cell prints a JSON summary as its first (stream) output.
    nb = nbformat.read(str(out_path), as_version=4)
    final_cell = nb.cells[-1]
    output_text = final_cell.outputs[0]['text']
    result = json.loads(output_text)

    assert abs(result['mean'] - 0.5) < 0.01
    assert result['n'] == 1000

The output notebook is artifact-friendly - attach to CI runs for review when assertions fail.
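Indexing `outputs[0]['text']` only works when the first output of the last cell is a stream. A slightly more defensive sketch follows. It reads the notebook as plain JSON and assumes the notebook prints its result as JSON on stdout from its last printing code cell; `last_stdout_json` is a hypothetical helper, not part of Papermill or nbformat.

```python
import json


def last_stdout_json(nb: dict):
    """Parse the stdout of the last code cell that printed anything as JSON."""
    for cell in reversed(nb["cells"]):
        if cell.get("cell_type") != "code":
            continue
        chunks = []
        for out in cell.get("outputs", []):
            if out.get("output_type") == "stream" and out.get("name") == "stdout":
                text = out["text"]
                # Raw .ipynb files may store text as a list of strings.
                chunks.append("".join(text) if isinstance(text, list) else text)
        stdout = "".join(chunks)
        if stdout.strip():
            return json.loads(stdout)
    raise AssertionError("no stdout output found in any code cell")


nb = {
    "cells": [
        {"cell_type": "code", "outputs": [
            {"output_type": "stream", "name": "stderr", "text": "warning: slow\n"},
            {"output_type": "stream", "name": "stdout",
             "text": ['{"mean": 0.498, ', '"n": 1000}\n']},
        ]},
        {"cell_type": "markdown", "source": "Done."},
    ]
}
result = last_stdout_json(nb)
```

In a test, load the output notebook with `json.load` and pass the resulting dict to the helper, then assert on the returned values as before.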

Step 6 - Parameter sweeps in CI

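# GitHub Actions matrix sweep
jobs:
  sweep:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        seed: [42, 123, 7]
        n_samples: [100, 1000]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      # ipykernel provides the Python kernel that papermill launches
      - run: pip install papermill ipykernel
      - run: |
          papermill analysis.ipynb out-${{ matrix.seed }}-${{ matrix.n_samples }}.ipynb \
            -p seed ${{ matrix.seed }} \
            -p n_samples ${{ matrix.n_samples }}
      - uses: actions/upload-artifact@v4
        if: always()  # upload the notebook even when the run above fails
        with:
          name: papermill-output-${{ matrix.seed }}-${{ matrix.n_samples }}
          path: out-${{ matrix.seed }}-${{ matrix.n_samples }}.ipynb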
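The same sweep can be reproduced locally before pushing. This sketch builds one `papermill` command per matrix combination with `itertools.product`; the function name `sweep_commands` is hypothetical, and the commands are only printed, not executed.

```python
import itertools


def sweep_commands(notebook, grid):
    """Return one papermill argv list per combination of the parameter grid."""
    names = list(grid)
    commands = []
    for values in itertools.product(*(grid[name] for name in names)):
        suffix = "-".join(str(v) for v in values)
        argv = ["papermill", notebook, f"out-{suffix}.ipynb"]
        for name, value in zip(names, values):
            argv += ["-p", name, str(value)]
        commands.append(argv)
    return commands


commands = sweep_commands(
    "analysis.ipynb", {"seed": [42, 123, 7], "n_samples": [100, 1000]}
)
for argv in commands:
    print(" ".join(argv))
```

To run the sweep, pass each list to `subprocess.run(argv, check=True)` so that a failing notebook stops the loop with a non-zero exit.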

Step 7 - Pair with nbval / testbook

| Tool | Strength | How to pair with papermill |
| --- | --- | --- |
| nbval | Full-notebook output regression | Run papermill first (parameter injection) → run nbval on the output notebook |
| testbook | Function-level unit tests | Unit-test individual functions from the same notebook that papermill runs end to end; both tools execute cells through nbclient |

Papermill is the engine; nbval and testbook are the assertion layers. Use all three for production notebook QA.

Step 8 - TQDM progress descriptions

Add a `#papermill_description=...` comment at the start of a cell:

#papermill_description=load_data
df = load_dataset()

#papermill_description=train_model
model.fit(df)

Per the Papermill execute docs, these descriptions appear in the TQDM progress bar, which gives CI logs a meaningful progress indicator.
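A small lint can keep these markers from drifting as the notebook grows. The sketch below lists code cells whose source lacks a `papermill_description=` marker. It works on the notebook as plain JSON, and the helper name `cells_missing_description` is hypothetical rather than part of Papermill.

```python
MARKER = "papermill_description="


def cells_missing_description(nb: dict) -> list:
    """Return indexes of non-empty code cells that carry no description marker."""
    missing = []
    for index, cell in enumerate(nb["cells"]):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # raw .ipynb files may store source as a list of lines
            source = "".join(source)
        if source.strip() and MARKER not in source:
            missing.append(index)
    return missing


nb = {"cells": [
    {"cell_type": "code",
     "source": "#papermill_description=load_data\ndf = load_dataset()\n"},
    {"cell_type": "markdown", "source": "## Training"},
    {"cell_type": "code", "source": ["model.fit(df)\n"]},
]}
missing = cells_missing_description(nb)
print(missing)
```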

Anti-patterns

| Anti-pattern | Why it fails | Fix |
| --- | --- | --- |
| Forget the `parameters` cell tag | Papermill inserts the injected cell at the top of the notebook, so the notebook's own assignments run later and override it; the run silently uses the defaults | Tag the cell explicitly (Step 2) |
| Pass string-like values with `-p` | `-p version 1.0` becomes the float `1.0`; zero-padded IDs lose their leading zeros | Use `-r` for strings (Step 4) |
| Run papermill against side-effect notebooks (writes to prod DB) | Papermill is non-transactional; partial failures leave bad state | Use ephemeral workdirs / staging credentials in test runs |
| Ignore the output notebook (only check exit code) | Subtle errors are visible only in cell outputs | Save and inspect the output notebook (Step 5); upload it as an artifact (Step 6) |
| Skip seed parameterization | Tests flake on stochastic models | Always pass `-p seed N` for reproducible runs |
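The typing pitfall is easy to reproduce. This sketch approximates how a `-p` value might be auto-typed (booleans and `None` first, then integers, then floats, otherwise a string). It illustrates the behavior described above and is not Papermill's own code; `auto_type` is a hypothetical name.

```python
def auto_type(value: str):
    """Approximate -p style typing: try bool/None, then int, then float, else str."""
    if value == "True":
        return True
    if value == "False":
        return False
    if value == "None":
        return None
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            continue
    return value


for raw in ["1.0", "00123", "0.6", "v2"]:
    # Left: what auto-typing yields; right: what a raw string would keep.
    print(repr(auto_type(raw)), "vs", repr(raw))
```

A version string or zero-padded ID passed with `-r` skips this conversion and reaches the notebook unchanged.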

Limitations

  • Papermill executes via the standard Jupyter kernel; very long notebooks have higher OOM risk than equivalent .py scripts.
  • Output notebooks are large because they store the outputs of every executed cell; CI artifact storage adds up, so consider a retention policy.
  • Parameter injection is one-shot at notebook start; cannot re-parameterize mid-run.
