cron-job-test-author
Build-an-X for cron / scheduler job tests - cron-expression validation patterns (5-field standard `min hour day-month month day-week` + 6-field with seconds + named-list extensions), DST + leap-day edge cases, missed-execution detection (machine downtime catch-up), overlapping-run protection (lock + stale-lock recovery), timezone semantics. Use when authoring tests for cron jobs, Kubernetes CronJobs, BullMQ repeat-jobs, Sidekiq schedulers, or any time-based job runner.
Overview
Cron jobs are routinely underspecified. Most teams ship cron expressions without testing them and discover bugs only when a DST transition or a leap day rolls around. This skill is a build-an-X workflow for authoring cron-job tests: a checklist plus per-pattern test recipes, not a single tool.
When to use
Step 1 - Validate the cron expression
The 5-field standard:
┌─── minute (0-59)
│ ┌─── hour (0-23)
│ │ ┌─── day of month (1-31)
│ │ │ ┌─── month (1-12 or JAN-DEC)
│ │ │ │ ┌─── day of week (0-6 or SUN-SAT; 0 and 7 both = Sunday)
│ │ │ │ │
* * * * *

Six-field variants (Quartz, BullMQ pattern mode) prepend a seconds field.
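To make the field layout concrete, here is a minimal sketch that labels each field of an expression. The helper name `label_cron_fields` is hypothetical, and it assumes the six-field form puts seconds first, as described above; field order varies between libraries, so confirm it against the runner's documentation.

```python
FIVE_FIELD_NAMES = ["minute", "hour", "day_of_month", "month", "day_of_week"]


def label_cron_fields(expr: str) -> dict[str, str]:
    """Map each field of a 5- or 6-field cron expression to its name.

    Assumption: a 6-field expression carries seconds as its first field.
    This only labels fields; it does not validate their values.
    """
    fields = expr.split()
    if len(fields) == 5:
        names = FIVE_FIELD_NAMES
    elif len(fields) == 6:
        names = ["second"] + FIVE_FIELD_NAMES
    else:
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}: {expr!r}")
    return dict(zip(names, fields))
```

A test can use this to assert that an expression written for one runner has the field count the runner expects before handing it to a real validator.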
Default: croniter (Python) - it both validates expressions and computes next-run times, which Steps 2 + 6 below depend on. Use a language-native validator when the test suite isn't Python: cron-validator (Node), CronExpression.isValidExpression() (Java/Quartz), or crontab.guru for ad-hoc human checks.
Test pattern:
from croniter import croniter
import pytest

@pytest.mark.parametrize("expr", [
    "0 3 * * *",      # daily 03:00
    "0 0 1 * *",      # monthly on the 1st
    "*/15 * * * *",   # every 15 min
    "0 9 * * 1-5",    # weekdays at 09:00
])
def test_cron_expression_is_valid(expr):
    assert croniter.is_valid(expr)

Step 2 - DST + leap-day edge cases
The two highest-risk dates per year:
Test pattern:
from croniter import croniter
from datetime import datetime
from zoneinfo import ZoneInfo

def test_daily_2am_handles_dst_spring_forward():
    # 2026 spring-forward in US/Eastern: Mar 8, 2:00 AM EST → 3:00 AM EDT
    base = datetime(2026, 3, 8, 1, 0, tzinfo=ZoneInfo("US/Eastern"))
    next_run = croniter("0 2 * * *", base).get_next(datetime)
    # croniter's behavior: skip the missing 02:00 hour, return 03:00 EDT
    assert next_run.hour == 3
    assert next_run.utcoffset().total_seconds() == -4 * 3600  # EDT

Recommendation: schedule cron jobs in UTC where possible to avoid DST entirely. If local time is required, document the DST-handling decision in the cron-job code.
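When local time is unavoidable, it helps to know which wall-clock times are risky before asserting on scheduler behavior. The sketch below uses only the standard library; the helper names `classify_wall_time` and `leap_years` are hypothetical. It relies on PEP 495 `fold` semantics as implemented by `zoneinfo` and assumes time zone data is available on the test host.

```python
import calendar
from datetime import datetime
from zoneinfo import ZoneInfo


def classify_wall_time(naive: datetime, tz: ZoneInfo) -> str:
    """Classify a naive wall-clock time as 'normal', 'nonexistent', or 'ambiguous'.

    With PEP 495, fold=0 and fold=1 yield different UTC offsets only
    around a DST transition.
    """
    before = naive.replace(tzinfo=tz, fold=0).utcoffset()
    after = naive.replace(tzinfo=tz, fold=1).utcoffset()
    if before < after:
        return "nonexistent"  # spring-forward gap: the clock skips this time
    if before > after:
        return "ambiguous"    # fall-back overlap: the clock shows this time twice
    return "normal"


def leap_years(start: int, end: int) -> list[int]:
    """Years in [start, end] that contain Feb 29, i.e. when `0 0 29 2 *` can fire."""
    return [year for year in range(start, end + 1) if calendar.isleap(year)]
```

For example, 02:30 on 2026-03-08 in `America/New_York` classifies as nonexistent and 01:30 on 2026-11-01 as ambiguous. A job scheduled at either time needs an explicit, documented decision about what should happen.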
Step 3 - Missed-execution detection
When the host / cluster is down at the scheduled time, what happens?
Test pattern (Kubernetes CronJob):
# CronJob with deadline
spec:
  schedule: "0 3 * * *"
  startingDeadlineSeconds: 300   # if not started within 5 min, skip
  concurrencyPolicy: Forbid      # don't overlap with previous run

# Test: simulate cluster downtime (drain nodes), then re-enable past 03:00 + 5min
# → CronJob controller should NOT trigger the missed run (past deadline)
# → CronJob controller SHOULD trigger if re-enabled past 03:00 but within deadline

For OSS test patterns, use kind (Kubernetes IN Docker) clusters in CI to verify CronJob behavior.
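The deadline rule in the comments above can be written down as a small pure function, which is handy for unit-testing a custom scheduler or for documenting expected outcomes next to the cluster test. This is a simplified model under stated assumptions, not the CronJob controller's actual algorithm; the function name `should_start_missed_run` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_start_missed_run(
    scheduled: datetime,
    now: datetime,
    starting_deadline_seconds: Optional[int],
) -> bool:
    """Decide whether a run that was due at `scheduled` may still start at `now`.

    Simplified model: with no deadline, any due run may start; with a
    deadline, the run is skipped once the deadline window has passed.
    """
    if now < scheduled:
        return False  # not due yet
    if starting_deadline_seconds is None:
        return True   # no deadline configured
    return now - scheduled <= timedelta(seconds=starting_deadline_seconds)


# Mirrors the YAML above: schedule 03:00 with a 300-second deadline.
scheduled = datetime(2026, 5, 6, 3, 0, tzinfo=timezone.utc)
recovered_in_time = should_start_missed_run(scheduled, scheduled + timedelta(minutes=4), 300)
recovered_too_late = should_start_missed_run(scheduled, scheduled + timedelta(minutes=6), 300)
```

Here `recovered_in_time` is true and `recovered_too_late` is false, matching the two expectations listed for the cluster-level test.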
Step 4 - Overlapping-run protection
If a 03:00 cron job runs longer than expected and 04:00 schedule fires before it finishes, what happens?
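One way to answer that question from evidence is to check recorded run history for overlaps. The sketch below assumes run records are available as `(start, end)` pairs, for example parsed from job logs; the function name `find_overlaps` and the record shape are hypothetical.

```python
from datetime import datetime


def find_overlaps(runs: list[tuple[datetime, datetime]]) -> list[tuple]:
    """Return pairs of consecutive runs where the later one starts before the earlier one ends.

    Only neighbouring runs (sorted by start time) are compared, which is
    enough to detect whether any overlap occurred at all.
    """
    ordered = sorted(runs)
    overlaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current[0] < previous[1]:
            overlaps.append((previous, current))
    return overlaps


# The 03:00 run takes 75 minutes, so the 04:00 run starts while it is still active.
history = [
    (datetime(2026, 5, 6, 3, 0), datetime(2026, 5, 6, 4, 15)),
    (datetime(2026, 5, 6, 4, 0), datetime(2026, 5, 6, 4, 20)),
    (datetime(2026, 5, 6, 5, 0), datetime(2026, 5, 6, 5, 10)),
]
overlapping_pairs = find_overlaps(history)
```

With overlap protection in place, an integration test can assert that this list is empty for the job's recorded history.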
Test pattern (lock-based):
import fcntl
import sys

import pytest

def acquire_lock(lock_file):
    fp = open(lock_file, 'w')
    try:
        # Non-blocking exclusive lock: fails immediately if already held
        fcntl.flock(fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fp.close()
        sys.exit(0)  # previous run still active; skip
    return fp  # keep this handle open for as long as the lock is needed

def test_lock_prevents_overlap(tmp_path):
    lock_file = tmp_path / "job.lock"
    fp1 = acquire_lock(str(lock_file))
    # While fp1 holds the lock, second acquire should sys.exit(0):
    with pytest.raises(SystemExit):
        acquire_lock(str(lock_file))
    fp1.close()

Step 5 - Stale-lock recovery
A lock left behind by a crashed job blocks all future runs. Test pattern:
import os
import time

def test_stale_lock_age_check(tmp_path):
    lock_file = tmp_path / "job.lock"
    lock_file.touch()
    # Set mtime to 25h ago (job should have completed by then):
    old_time = time.time() - 25 * 3600
    os.utime(lock_file, (old_time, old_time))
    # Recovery: detect a lock older than 24h and remove it
    if lock_file.stat().st_mtime < time.time() - 24 * 3600:
        lock_file.unlink()
    assert not lock_file.exists()
    # Now acquire a fresh lock → should succeed
    fp = acquire_lock(str(lock_file))  # helper from Step 4
    assert not fp.closed
    fp.close()

Step 6 - Timezone semantics
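from croniter import croniter
from datetime import datetime
from zoneinfo import ZoneInfo

def test_cron_runs_at_specified_tz():
    # croniter has no tz argument; it evaluates the schedule in the
    # timezone of the start time, so anchor the base in Tokyo (UTC+9):
    base = datetime(2026, 5, 6, 0, 0, tzinfo=ZoneInfo("Asia/Tokyo"))
    cron = croniter("0 9 * * *", base)
    next_run = cron.get_next(datetime)
    # 09:00 in Tokyo is 00:00 UTC the same day:
    assert next_run.astimezone(ZoneInfo("UTC")).hour == 0

For Kubernetes CronJobs, the schedule timezone is set via .spec.timeZone (stable since Kubernetes 1.27).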
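To see why timezone semantics need their own tests, the following standard-library sketch converts a local schedule time to UTC. The helper name `local_schedule_in_utc` is hypothetical, and the example assumes time zone data is available on the test host.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_schedule_in_utc(
    year: int, month: int, day: int, hour: int, minute: int, tz_name: str
) -> datetime:
    """Return the UTC instant at which a local wall-clock schedule time occurs."""
    local = datetime(year, month, day, hour, minute, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


# Tokyo has no DST: 09:00 local is always 00:00 UTC.
tokyo = local_schedule_in_utc(2026, 5, 6, 9, 0, "Asia/Tokyo")

# New York observes DST: the same 09:00 local time lands on different UTC hours.
ny_winter = local_schedule_in_utc(2026, 1, 15, 9, 0, "America/New_York")
ny_summer = local_schedule_in_utc(2026, 7, 15, 9, 0, "America/New_York")
```

`ny_winter` falls at 14:00 UTC and `ny_summer` at 13:00 UTC, so a job that must align with a UTC-based downstream system shifts by an hour twice a year unless it is scheduled in UTC.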
Step 7 - End-to-end test recipe
For each cron job in scope:
Anti-patterns
| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Trust the cron expression without parsing it | Typos like `* * * 13 *` (invalid month) silently never trigger | Validate with croniter / cron-validator (Step 1) |
| Schedule in local time without documenting TZ | DST + cross-region deployments cause silent shifts | UTC where possible (Step 2) |
| No overlap protection on long-running jobs | Concurrent runs corrupt state | Lock pattern + concurrency policy (Step 4) |
| Locks without staleness recovery | Crashed job blocks all future runs forever | Time-based stale check (Step 5) |
| No alerting on missed runs | Job silently stops; discovered weeks later | Synthetic-monitor + heartbeat (cross-ref qa-shift-right/synthetic-monitor-author) |
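
The heartbeat fix in the last row comes down to a staleness check on the last successful run. Below is a minimal sketch, assuming the job records a timestamp on success; the function name `is_heartbeat_overdue` and the grace-period default are hypothetical choices, not part of any monitoring tool's API.

```python
from datetime import datetime, timedelta, timezone


def is_heartbeat_overdue(
    last_success: datetime,
    now: datetime,
    expected_interval: timedelta,
    grace: timedelta = timedelta(minutes=5),
) -> bool:
    """True when no successful run has been recorded within interval + grace."""
    return now - last_success > expected_interval + grace


# A daily job whose last success was 25 hours ago, with a 30-minute grace period.
now = datetime(2026, 5, 7, 4, 0, tzinfo=timezone.utc)
overdue = is_heartbeat_overdue(
    last_success=now - timedelta(hours=25),
    now=now,
    expected_interval=timedelta(days=1),
    grace=timedelta(minutes=30),
)
```

Passing `now` explicitly keeps the check deterministic in tests; a monitor would call it with the current time and alert when it returns true.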