Testland

HICCUPPS-F: Nine Testing Oracles for Defending Your Bug Calls

Testland, July 4, 2026

HICCUPPS-F: nine testing oracles beyond "the spec says so" for recognizing and defending a bug call before it gets waved off as works as designed in triage.

[Figure: the nine HICCUPPS-F testing oracles, arranged as a reference wheel: History, Image, Comparable products, Claims, Users' desires, Product consistency, Purpose, Standards and statutes, and Familiar problems. Claims, the oracle most bug reports lean on by default, is highlighted.]

A checkout screen shows $24.99. The emailed receipt shows $25.49. A tester files the bug, and a lead closes it "works as designed": the spec never said the two totals must match, and the spec was the only oracle anyone consulted. That's how real defects get waved off in triage, one reference point treated as the only one that counts.

Michael Bolton catalogued eight other legitimate references testers already have, in a checklist he published on developsense.com in July 2012. HICCUPPS-F is the name for the resulting nine oracles, the spec included: run them before ruling anything "not a bug," and cite them by name when someone tries to dismiss the call. This post assumes only that you've filed a bug report and done some hands-on testing.

HICCUPPS-F is a nine-oracle checklist testers run before ruling something "not a bug": History, Image, Comparable products, Claims, Users' desires, Product consistency, Purpose, Standards and statutes, and Familiar problems. Each oracle is a distinct kind of consistency to check the software against, and a bug is anything that violates one or more of these principles.

  • Eight oracles check for consistency; the ninth, Familiar problems, is the mirror-image oracle for known bug patterns.
  • The original list came from James Bach; Michael Bolton published the extended "FEW HICCUPPS" write-up on developsense.com in July 2012.
  • Bolton's current mnemonic is the 11-oracle "FEW HICCUPPS" (adds Explainability and World, moves Familiar to the front); this post teaches the nine-oracle spine inside it.
  • An oracle, in testing, is "a way to recognize what might be a problem," per Ministry of Testing.

Why bug reports get closed as "works as designed"

An oracle, in the testing sense, has nothing to do with prophecy or Oracle Corporation. It's a way to recognize whether something might be a problem: a reference point you compare the software's behavior against. A heuristic is the companion idea, a fallible rule of thumb rather than an algorithm, one that helps you decide fast without guaranteeing the answer. The ISTQB syllabus uses both words the same way, without endorsing any specific oracle framework.

Most testers, without naming it, run exactly one oracle: does the software match the spec? That's the Claims oracle, and it's a legitimate oracle. It just isn't the only one. When a tester's entire case rests on "the spec doesn't forbid this," a lead who checks the spec and finds nothing forbidding it can close the ticket in 30 seconds. The bug report loses because the oracle-of-one lost, not because the behavior was fine.

"Works as designed" survives specifically because most bug reports offer the reviewer nothing else to check against. Widen the reference set to nine oracles and a WAD (works as designed) dismissal has to argue against all nine, not just the spec. The oracles apply to any bug report, with or without a charter, though a well-scoped charter can flag which ones are worth running before a session starts.

James Bach's original oracle list, and Michael Bolton's HICCUPPS-F write-up

Attribution matters, because it's easy to credit a mnemonic to whoever wrote the post you found it in. Bolton is explicit about this in the write-up itself: "The original list came from James Bach" (developsense.com, July 2012). Bach built the list as part of Rapid Software Testing, the methodology he developed and taught for decades. Bolton extended it, named it, and published the write-up most testers now cite.

The honest complication: the version most people quote isn't Bolton's current version. His fuller mnemonic is "FEW HICCUPPS," eleven oracles rather than nine. He added Explainability (can you explain the behavior to a stakeholder?) and World (consistent with what's plausible or expected of the world at large), and moved Familiar problems to the front of the acronym rather than the back. That's the version on developsense.com today.

HICCUPPS-F, the nine-oracle spine this post works through, is a useful subset of FEW HICCUPPS, not a competing or outdated framework. It's the version that fits on an index card and covers the oracles that come up in nearly every bug call. Getting the attribution and version history right separates citing this properly from repeating secondhand folklore.

One discrepancy: $24.99 on screen, $25.49 on the receipt

Here's the scenario the rest of this post runs through nine times: a checkout cart applies a promo code and displays a discounted total of $24.99 on the confirmation screen. The email receipt, generated moments later as a PDF, shows $25.49 for the same order.

Fifty cents. A tester who only checks the spec for "the total must be exact" finds no such line, calls it a rounding quirk, and moves to the next ticket. That's the single-oracle failure mode from the previous section, played out with a real number: nothing in "the spec is silent on this" tells you whether $0.50 is noise or evidence of a broken discount calculation somewhere in the order pipeline.

Running one discrepancy through nine oracles

Run the $24.99/$25.49 mismatch through each of the nine oracles and the picture changes fast.

History: did earlier versions keep the totals in sync?

History asks whether current behavior matches the product's own past behavior. A screenshot from v1.4 shows the cart total and the receipt matching to the cent for the same kind of promo-code order. Nothing in the checkout flow has changed since then except the tax-calculation service. That turns a shrug ("maybe it's always been like this") into a specific claim: this is a regression introduced in a recent release, not a longstanding quirk nobody noticed.

Image: what a mismatched receipt says about the brand

Image checks behavior against the reputation a company wants to project, independent of any spec. A shopper who catches their receipt disagreeing with what they were charged doesn't reason through promo-code edge cases: they conclude the company is careless with their money. That reaction doesn't wait for a root-cause investigation, which is why "technically not a spec violation" doesn't land as a defense with the person who received the email.

Comparable products: what other checkouts do

Comparable products asks whether behavior matches how similar systems handle the same situation. Well-run checkouts treat the displayed total and the emailed receipt as one number shown twice, not two calculations that can drift apart. A shopper who's made dozens of online purchases carries an implicit baseline from every other checkout they've used, and this mismatch breaks it without needing to name a specific competitor.

Claims: what the promo-code spec actually says

Claims is the oracle from the previous section, the one every tester already reaches for, and the one that produced the "works as designed" close in the first place. The actual promo-code specification says the TAX10 discount applies pre-tax. The receipt template applies it post-tax, changing the total by exactly the 50 cents in question. That's a direct violation of a documented rule, and the fastest oracle to cite, needing no interpretation beyond a section number.
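The arithmetic behind that 50-cent gap can be sketched in a few lines. The source gives only the two totals and the pre-tax rule, so the inputs here are assumptions picked to reproduce them: a hypothetical $33.80 subtotal, a 5% tax rate, and TAX10 read as a flat $10 off. The function names are illustrative too.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(amount: Decimal) -> Decimal:
    """Round a money amount to whole cents."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


def total_discount_pre_tax(subtotal: Decimal, discount: Decimal, tax_rate: Decimal) -> Decimal:
    """Order the spec asks for: subtract the discount, then tax what is left."""
    return to_cents((subtotal - discount) * (1 + tax_rate))


def total_discount_post_tax(subtotal: Decimal, discount: Decimal, tax_rate: Decimal) -> Decimal:
    """Order the receipt template uses: tax the full subtotal, then subtract the discount."""
    return to_cents(subtotal * (1 + tax_rate) - discount)


# Hypothetical inputs (not from the spec), chosen to reproduce the two totals.
subtotal = Decimal("33.80")
discount = Decimal("10.00")  # assumes TAX10 is a flat amount off
tax_rate = Decimal("0.05")   # assumed 5% tax

cart_total = total_discount_pre_tax(subtotal, discount, tax_rate)
receipt_total = total_discount_post_tax(subtotal, discount, tax_rate)

print(cart_total, receipt_total, receipt_total - cart_total)
```

Under these assumptions the gap equals the discount times the tax rate: the receipt charges tax on the $10 the shopper never paid. A percentage discount would not behave this way, since a percentage discount and a percentage tax give the same total in either order apart from rounding.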

Users' desires: the number the shopper agreed to pay

Users' desires checks behavior against what a reasonable shopper actually wants, broader than whatever the spec bothered to write down. The shopper agreed to pay $24.99 on the confirmation screen and expects the transaction record to say the same. When the receipt says $25.49 instead, the desire that gets violated is simple: the number a shopper agreed to and the number on file should match. That's the desire a support agent hears about once the dispute lands in their queue.

Product consistency: whether the cart and receipt agree with each other

Product consistency stays entirely inside the product, comparing one element against a comparable element in the same system rather than anything external. The cart total and the receipt total are two representations of the exact same transaction, generated by the same order, and should match each other before anyone opens the promo-code spec. A tester can flag this without outside research: open the cart, open the receipt, and note the two numbers disagree.

Purpose: what the discount exists to do

Purpose checks behavior against the explicit and implicit uses people put a feature to, not just its literal spec. A promo code exists to close a sale and build enough trust that a shopper checks out today instead of comparison-shopping tomorrow. A receipt that contradicts the agreed price undermines exactly that purpose, planting doubt at the moment the feature was designed to remove it. The bug isn't just wrong; it works against the reason the discount exists.

Standards and statutes: when a wrong receipt becomes a compliance issue

Standards and statutes checks behavior against relevant laws and regulations, and it's the one oracle here that has to hedge honestly. Some jurisdictions regulate the gap between a displayed price and the amount actually charged; others don't touch receipt formatting at all. Whether a $0.50 mismatch trips a consumer-protection rule depends on where the shopper is located, and that isn't something a tester can settle from the ticket alone. This oracle produces a genuine maybe, not a yes or a shrug.

Familiar problems: the rounding-order bug you've seen before

Familiar problems flips the logic of the other eight: you expect the product to be inconsistent with patterns from bugs seen before in unrelated systems, so it's a problem when it matches one. Applying a discount after tax instead of before is one of the oldest rounding-order bugs in checkout code, and a tester who's debugged one recognizes the shape immediately: fifty cents, tax involved, discount applied. A single regression assertion comparing cart and receipt totals would have caught it before release, a case for pairing exploratory judgment with automated checks instead of running them as separate tracks.
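A minimal sketch of such a regression assertion follows. The function names are hypothetical, and the two totals are passed in directly; in a real suite they would come from the confirmation screen and the parsed receipt, however your harness exposes them.

```python
from decimal import Decimal
from typing import Optional


def totals_mismatch(cart_total: Decimal, receipt_total: Decimal) -> Optional[Decimal]:
    """Return receipt minus cart if the totals disagree, else None."""
    diff = receipt_total - cart_total
    return diff if diff != 0 else None


def assert_cart_matches_receipt(cart_total: Decimal, receipt_total: Decimal) -> None:
    """Fail loudly, with both numbers in the message, when the totals drift apart."""
    diff = totals_mismatch(cart_total, receipt_total)
    if diff is not None:
        raise AssertionError(
            f"receipt total {receipt_total} differs from cart total {cart_total} by {diff}"
        )


# A matching order passes silently.
assert_cart_matches_receipt(Decimal("24.99"), Decimal("24.99"))

# The order from this post would fail with a 0.50 difference:
# assert_cart_matches_receipt(Decimal("24.99"), Decimal("25.49"))
```

Comparing Decimal values rather than floats keeps the check exact to the cent, which matters when the whole defect is fifty cents.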

The nine oracles, side by side

Laid out side by side, the pattern is stark: eight of the nine oracles independently flag the $24.99/$25.49 mismatch as a real problem, and only Standards and statutes returns a genuine maybe.

| Oracle | What it checks against | Cart-receipt verdict |
|---|---|---|
| History | The product's own past behavior | Regression: v1.4 matched, current build doesn't |
| Image | The reputation the company wants | Reads as careless with the shopper's money |
| Comparable products | How similar systems behave | Well-run checkouts keep both totals in sync |
| Claims | Written specs and documentation | Direct TAX10 pre-tax/post-tax violation |
| Users' desires | What a reasonable shopper wants | Agreed price should match the recorded price |
| Product consistency | Other elements in the same product | Cart and receipt disagree on one transaction |
| Purpose | The feature's explicit and implicit use | Undermines the trust the discount exists to build |
| Standards and statutes | Relevant laws and regulations | Depends on jurisdiction: a genuine maybe |
| Familiar problems | Known bug patterns | Classic tax-then-discount rounding-order bug |

Eight "yes" verdicts and one honest "maybe" is a case triage can't wave off as works as designed, no matter which single oracle the original bug report happened to cite.
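For testers who want the table as a literal checklist, one way to hold it is a small mapping from oracle to verdict. This is a sketch only: the `Verdict` enum and helper names are made up for illustration, and the verdicts are copied from the table for this one discrepancy.

```python
from collections import Counter
from enum import Enum


class Verdict(Enum):
    YES = "yes"      # the oracle flags a problem
    MAYBE = "maybe"  # needs follow-up before anyone can say
    NO = "no"        # the oracle sees nothing wrong


# Verdicts for the cart-receipt mismatch, copied from the table above.
CART_RECEIPT_VERDICTS = {
    "History": Verdict.YES,
    "Image": Verdict.YES,
    "Comparable products": Verdict.YES,
    "Claims": Verdict.YES,
    "Users' desires": Verdict.YES,
    "Product consistency": Verdict.YES,
    "Purpose": Verdict.YES,
    "Standards and statutes": Verdict.MAYBE,
    "Familiar problems": Verdict.YES,
}


def tally(verdicts):
    """Count how many oracles returned each verdict."""
    return Counter(v.value for v in verdicts.values())


def needs_follow_up(verdicts):
    """List the oracles whose answer is still open."""
    return [name for name, v in verdicts.items() if v is Verdict.MAYBE]


counts = tally(CART_RECEIPT_VERDICTS)
print(counts["yes"], counts["maybe"], counts["no"])
print(needs_follow_up(CART_RECEIPT_VERDICTS))
```

The tally summarizes the report; it is not a decision rule. The open "maybe" is worth listing explicitly so it gets routed to someone who can answer it.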

Where HICCUPPS-F runs out of answers

HICCUPPS-F has real limits. First, oracles can disagree with each other, not just with the spec. If the promo-code spec had explicitly required post-tax rounding, Claims would side with the current receipt while Product consistency still flags the cart-receipt mismatch. Reconciling two oracles that point in opposite directions takes judgment, not a vote count.

Second, the framework is skill-dependent. A senior tester runs all nine fluently in a few minutes of triage. A newer tester needs the oracle table open as a literal checklist, working through each row deliberately until the habit sticks.

Third, some oracles cost more than a 15-minute triage allows. Users' desires needs real user or support data testers rarely have on hand mid-triage; citing it honestly sometimes means flagging it for follow-up, not resolving it on the spot.

The sharper point: "no oracle flags it" doesn't mean "not a bug." Familiar problems is built to catch what the other eight miss, precisely because it looks for expected inconsistency rather than consistency. Skipping it is the most common misapplication of this framework.

What naming the oracle changes in your bug report

The payoff shows up in the "why is this a bug?" line of the ticket, not in a theory of testing. "Seems wrong" gets closed in 30 seconds. "Claims: violates the TAX10 pre-tax discount rule; Familiar problems: classic tax-then-discount rounding-order bug" gets read. A report that cites even one oracle beyond the spec is far harder to dismiss as works as designed, because the reviewer now has to argue against a named reference point, not a feeling.
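If your bug-report template is generated or linted, that line can be built from named citations. The helper below is a hypothetical sketch, not part of any tracker's API; it joins oracle and finding pairs and refuses to produce a line with no oracle named.

```python
def why_line(citations):
    """Join (oracle, finding) pairs into one 'why is this a bug?' line."""
    if not citations:
        # An empty list is the "seems wrong" report this post warns against.
        raise ValueError("cite at least one oracle by name")
    return "; ".join(f"{oracle}: {finding}" for oracle, finding in citations)


line = why_line([
    ("Claims", "violates the TAX10 pre-tax discount rule"),
    ("Familiar problems", "classic tax-then-discount rounding-order bug"),
])
print(line)
```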

The nine oracles here are the working spine of Bolton's fuller "FEW HICCUPPS" mnemonic, useful on their own and worth the deeper version once the habit sticks. Testland keeps a HICCUPPS-F reference built to drop straight into a charter or a bug-report template.