tenant-isolation-models-reference

Pure-reference catalog of tenant-isolation models for B2B SaaS. Defines the isolation continuum from full-isolation (separate compute + data + network per tenant) to fully-shared (one deployment, tenant_id discriminator), names the canonical models (Microsoft's automated-single-tenant / fully-multitenant / vertically-partitioned / horizontally-partitioned; AWS Well-Architected's silo / pool / bridge framing; deployment-stamps / supertenants terminology), enumerates the trade-offs (cost, blast radius, noisy neighbor, compliance, scale limits), and lists the test surfaces each model creates (cross-tenant data leak, tenant-id propagation, deployment-routing). Use as the model-selection reference when designing or auditing tenant isolation. Consumed by tenant-leak-test-author, cross-tenant-data-leak-tests, tenant-leak-critic, tenant-id-propagation-tracer.

Overview

Tenant isolation is the foundational concern of every B2B SaaS architecture. AWS Well-Architected SaaS Lens calls it "essential" and notes that crossing a tenant boundary represents a "significant and potentially unrecoverable event for a SaaS business" (per docs.aws.amazon.com/wellarchitected/latest/saas-lens/tenant-isolation.html).

Isolation is not a binary property - Microsoft's Azure Architecture Center frames it as a continuum from fully isolated (shared-nothing) to fully shared (everything shared), with architectures often picking different points per tier (UI shared, app shared, data isolated, for example). Per learn.microsoft.com/en-us/azure/architecture/guide/multitenant/considerations/tenancy-models: "Instead of viewing isolation as a discrete property, consider it a spectrum."

This skill is a pure reference consumed by the per-model test authors and the tenant-leak critic. It does not execute anything.

When to use

  • Designing the tenant-isolation model for a new B2B SaaS product or feature.
  • Auditing an existing model - does the testing surface match the declared isolation level?
  • Choosing what to test: each model creates a distinct set of failure modes the test suite must cover.
  • PR review of architecture changes that move components along the isolation continuum.

Tenant vs deployment

A tenant is a logical customer boundary - typically a B2B customer organisation, sometimes a consumer family / group. A deployment is a physical set of infrastructure. The two are not the same: a deployment can host many tenants (shared model), or each tenant can have its own deployment (silo model). Per the Azure Architecture Center, deployments are also called stamps or supertenants in some frameworks.

The mapping is durable state: a tenant-to-deployment table must exist somewhere so requests route to the right deployment.
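A minimal sketch of that mapping, assuming an in-memory dictionary stands in for the durable table. The names `TenantDirectory`, `Deployment`, and `UnknownTenantError`, and the example URLs, are hypothetical and not taken from any framework. The point it illustrates is that lookup fails closed: an unmapped tenant raises an error rather than falling back to a default deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One physical stamp; the fields are illustrative only."""
    name: str
    base_url: str


class UnknownTenantError(LookupError):
    """Raised when a tenant has no deployment mapping."""


class TenantDirectory:
    """In-memory stand-in for the durable tenant-to-deployment table."""

    def __init__(self) -> None:
        self._by_tenant: dict[str, Deployment] = {}

    def assign(self, tenant_id: str, deployment: Deployment) -> None:
        self._by_tenant[tenant_id] = deployment

    def resolve(self, tenant_id: str) -> Deployment:
        # Fail closed: never route an unknown tenant to a default deployment.
        try:
            return self._by_tenant[tenant_id]
        except KeyError:
            raise UnknownTenantError(tenant_id) from None


directory = TenantDirectory()
shared = Deployment("stamp-shared-1", "https://shared-1.example.invalid")
dedicated = Deployment("stamp-acme", "https://acme.example.invalid")

# Shared model: many tenants map to one deployment.
directory.assign("tenant-a", shared)
directory.assign("tenant-b", shared)
# Silo model: one tenant maps to its own deployment.
directory.assign("tenant-acme", dedicated)
```

A routing test would assert both directions: known tenants resolve to the expected deployment, and unknown tenants are rejected.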

The four canonical models

1. Automated single-tenant (silo / fully-isolated)

| Property | Value |
| --- | --- |
| Compute | Dedicated per tenant |
| Data | Dedicated per tenant |
| Network | Dedicated per tenant |
| Cost per tenant | Highest |
| Blast radius | One tenant |
| Noisy neighbor | None |

Per Microsoft Azure Architecture Center: "you deploy a dedicated set of infrastructure for each tenant... a key benefit is that data for each tenant is isolated, which reduces the risk of accidental leakage."

When to choose: regulated industries with strong isolation mandates (healthcare HIPAA, financial services, government); a small number of high-value enterprise customers; per-tenant configuration is part of the value proposition.

Test surface: deployment automation (per the Deployment Stamps pattern); cross-deployment operations like reporting; tenant-to-deployment routing.

2. Fully multitenant (pool / fully-shared)

| Property | Value |
| --- | --- |
| Compute | Shared |
| Data | Shared (single DB with tenant_id discriminator) |
| Network | Shared |
| Cost per tenant | Lowest |
| Blast radius | All tenants |
| Noisy neighbor | High |

Per Microsoft Azure Architecture Center on the risks: "Be sure to separate data for each tenant, and don't leak data among tenants... a large tenant trying to perform a heavy query or operation might affect other tenants."

When to choose: large number of low-margin customers; high operational efficiency required; tenants accept shared infrastructure.

Test surface: cross-tenant data leak (the canonical risk), tenant_id propagation through every code path, noisy-neighbor behaviour, resource quotas per tenant.
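To make the tenant_id discriminator and the cross-tenant leak check concrete, here is a small sketch using Python's built-in `sqlite3` module. The `invoices` table, its columns, and the helper names are assumptions made for illustration; a real pool deployment would more likely enforce the filter in the database itself. Every accessor takes `tenant_id` as a required argument and includes it in the `WHERE` clause, including lookups by primary key.

```python
import sqlite3


def open_pool_db() -> sqlite3.Connection:
    """Create a shared in-memory table holding rows for two tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, "
        "tenant_id TEXT NOT NULL, "
        "amount_cents INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO invoices (tenant_id, amount_cents) VALUES (?, ?)",
        [("tenant-a", 1000), ("tenant-a", 2500), ("tenant-b", 9900)],
    )
    return conn


def list_invoices(conn: sqlite3.Connection, tenant_id: str) -> list[int]:
    """Return amounts for one tenant only; an empty tenant_id is rejected."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    rows = conn.execute(
        "SELECT amount_cents FROM invoices WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    ).fetchall()
    return [row[0] for row in rows]


def get_invoice(conn: sqlite3.Connection, tenant_id: str, invoice_id: int):
    """Look up by id, still scoped by tenant; a foreign id yields None."""
    row = conn.execute(
        "SELECT amount_cents FROM invoices WHERE id = ? AND tenant_id = ?",
        (invoice_id, tenant_id),
    ).fetchone()
    return row[0] if row else None


conn = open_pool_db()
```

The leak test is the negative case: tenant A requesting an id that belongs to tenant B must see nothing, exactly as if the row did not exist.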

3. Horizontally partitioned (bridge)

| Property | Value |
| --- | --- |
| Compute | Shared |
| Data | Dedicated per tenant |
| Network | Shared |
| Cost per tenant | Medium |
| Blast radius | App-tier shared, data isolated |
| Noisy neighbor | App-tier yes, data tier no |

Per Microsoft: "build a single application tier and then deploy individual databases for each tenant... helps mitigate a noisy neighbor problem [in the data tier]."

When to choose: data isolation matters for compliance, but shared compute is acceptable; data-tier noisy neighbors are the dominant failure mode (heavy queries, large indexes).

Test surface: correct database routing per tenant; connection-pool exhaustion under tenant concurrency; cross-DB query attempts must fail.
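The database-routing part of this test surface can be sketched as follows, assuming one in-memory SQLite database per tenant as a stand-in for real per-tenant databases. `TenantDatabaseRouter`, `provision`, `connection_for`, and the `notes` table are hypothetical names chosen for the example.

```python
import sqlite3


class TenantDatabaseRouter:
    """Shared application tier handing out one database per tenant."""

    def __init__(self) -> None:
        self._connections: dict[str, sqlite3.Connection] = {}

    def provision(self, tenant_id: str) -> None:
        # Each ":memory:" connection is a separate, private database.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE notes (body TEXT NOT NULL)")
        self._connections[tenant_id] = conn

    def connection_for(self, tenant_id: str) -> sqlite3.Connection:
        conn = self._connections.get(tenant_id)
        if conn is None:
            # Fail closed instead of picking some other tenant's database.
            raise LookupError(f"no database provisioned for {tenant_id!r}")
        return conn


def count_notes(router: TenantDatabaseRouter, tenant_id: str) -> int:
    row = router.connection_for(tenant_id).execute(
        "SELECT COUNT(*) FROM notes"
    ).fetchone()
    return row[0]


router = TenantDatabaseRouter()
router.provision("tenant-a")
router.provision("tenant-b")
router.connection_for("tenant-a").execute("INSERT INTO notes VALUES ('a-only')")
```

A write through tenant A's connection should be invisible through tenant B's, and a tenant with no provisioned database should be rejected outright.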

4. Vertically partitioned

| Property | Value |
| --- | --- |
| Compute | Mix (some tenants dedicated, others shared) |
| Data | Mix |
| Network | Mix |
| Cost per tenant | Mixed |
| Blast radius | Per-tier decision |
| Noisy neighbor | Per-tier |

Per Microsoft: "a combination of single-tenant and multitenant deployments... most customers' data and application tiers on multitenant infrastructures, but you deploy single-tenant infrastructures for customers who require higher performance or data isolation." This includes geographic partitioning (one deployment per region, tenants mapped to nearest region).

When to choose: majority of customers fit the shared model, but a minority need silo (enterprise tier); geographic data-residency requirements.

Test surface: every test from the shared model plus every test from the silo model; tenant migration between tiers; pricing tier enforcement.

Isolation tier mapping

A common pattern is independent isolation per architecture tier:

| Tier | Common choice |
| --- | --- |
| UI | Shared host name (fully multitenant) |
| API gateway | Shared, with tenant claim in JWT |
| Application services | Shared, tenant context in every request |
| Async queues / topics | Shared topic with tenant_id message attribute, or per-tenant queue |
| Data | Often partitioned: tables with tenant_id (pool); schemas per tenant (bridge); databases per tenant (silo) |
| Object storage | Per-tenant prefix in bucket (pool); bucket per tenant (silo) |
| Search index | Per-tenant routing key (pool); index per tenant (silo) |

The test surface depends on the lowest isolation level in the stack. A fully isolated UI in front of a shared database still requires the full cross-tenant data-leak test battery against the database.
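The "lowest isolation level" rule can be expressed as a small calculation. This sketch assumes a three-step ordering (pool weaker than bridge, bridge weaker than silo); the `Isolation` enum and `weakest_tier` helper are illustrative names, not part of any framework.

```python
from enum import IntEnum


class Isolation(IntEnum):
    """Assumed ordering: a higher value means stronger isolation."""
    POOL = 0
    BRIDGE = 1
    SILO = 2


def weakest_tier(stack: dict[str, Isolation]) -> tuple[str, Isolation]:
    """Return the (tier, level) pair with the weakest isolation."""
    return min(stack.items(), key=lambda item: item[1])


# Isolated UI and application tiers in front of a shared database.
stack = {
    "ui": Isolation.SILO,
    "application": Isolation.SILO,
    "data": Isolation.POOL,
}
```

Here `weakest_tier(stack)` picks out the data tier at pool level, which is why the pool test battery still applies to this stack.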

Isolation enforcement primitives

Tenant isolation is implemented by combining:

  • Identity context - tenant_id in JWT claims (auth.jwt() in Supabase per supabase.com/docs/guides/database/postgres/row-level-security) or AWS Cognito ID token; the source of truth for "who is this request for".
  • Authorisation policy - Postgres Row-Level Security per row-level-security-postgres-reference, AWS IAM dynamic policies generated per tenant, application-level authorisation middleware.
  • Resource ABAC tags - tag each tenant resource with tenant-id=<x>, then enforce via IAM condition keys.
  • Network segmentation - per-tenant VPCs / subnets / security groups (silo only).
  • Encryption keys - per-tenant KMS keys (silo / bridge); useful for crypto-shredding on tenant offboarding.
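As one illustration of how these primitives combine, the following sketch mimics the resource-tag check in plain Python. It is not IAM policy syntax and does not call any AWS API; `is_access_allowed` is a hypothetical helper that only shows the comparison a tag-based condition is meant to enforce, using the `tenant-id` tag key from the list above.

```python
def is_access_allowed(principal_tenant_id: str, resource_tags: dict[str, str]) -> bool:
    """Allow access only when the resource's tenant-id tag matches the caller.

    principal_tenant_id is assumed to come from the authenticated identity
    context, never from request input.
    """
    resource_tenant = resource_tags.get("tenant-id")
    # Fail closed: untagged resources and empty principals are denied.
    return bool(principal_tenant_id) and resource_tenant == principal_tenant_id
```

The cases worth testing are the denials: a mismatched tag, a missing tag, and an empty principal should all return `False`.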

Anti-patterns

| Anti-pattern | Why it fails | Fix |
| --- | --- | --- |
| tenant_id filter only in application code | One missed query path = cross-tenant leak | Push the filter to the database (RLS) or row-attribute IAM |
| tenant_id from request header / body | Spoofable; tenant A can claim to be tenant B | Always derive tenant_id from authenticated JWT/session, never from request payload |
| Trust the JWT raw_user_meta_data for tenant claims | User-modifiable per Supabase docs | Use raw_app_meta_data (server-set) or a server-side claim store |
| Single connection pool for all tenants | One slow tenant query blocks all | Per-tenant pools, or quota-aware pools |
| Shared object-storage bucket without prefix isolation | Object enumeration leaks across tenants | Per-tenant prefix + IAM condition on the prefix |
| No isolation tests in CI | Models drift over time | Cross-tenant leak tests in every PR per cross-tenant-data-leak-tests |
| Migration scripts run without tenant context | Schema changes touch all tenants at once; high blast radius | Stamp pattern with progressive rollout |
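The header-spoofing anti-pattern and its fix can be sketched as below. The `Request` type and its `verified_claims` field are assumptions: they stand for whatever the authentication layer produces after it has verified the token's signature, which is not shown here. The `X-Tenant-Id` header name is likewise invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request shape for illustration."""
    headers: dict
    verified_claims: dict  # assumed output of a JWT verification step


def tenant_id_for(request: Request) -> str:
    """Derive the tenant from authenticated claims; headers are ignored."""
    tenant_id = request.verified_claims.get("tenant_id")
    if not isinstance(tenant_id, str) or not tenant_id:
        raise PermissionError("no tenant claim in authenticated context")
    return tenant_id


# Tenant A's authenticated request carries a header claiming to be tenant B.
spoofed = Request(
    headers={"X-Tenant-Id": "tenant-b"},
    verified_claims={"tenant_id": "tenant-a"},
)
```

With this shape, `tenant_id_for(spoofed)` resolves to tenant A regardless of the header, and a request with no tenant claim is refused rather than defaulted.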

Test surface by model

| Model | Required test categories |
| --- | --- |
| Silo / single-tenant | Tenant-to-deployment routing; per-deployment health; deployment automation |
| Pool / fully-shared | Cross-tenant data leak (highest priority); tenant_id propagation; noisy-neighbor mitigation; quota enforcement |
| Bridge / horizontal | Pool tests + database routing per tenant; cross-database query rejection |
| Vertical | Pool + silo tests + tier-migration tests |

The cross-tenant data leak suite is the universal floor: even silo deployments share some surface (account-management APIs, billing, identity providers) where pool-like leaks are possible.
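The composition rules in this section (bridge extends pool, vertical combines pool and silo, and the leak suite applies everywhere) can be encoded as data so an audit can check a suite against its declared model. The category strings follow the text above; the dictionary layout and the `required_tests` helper are an illustrative assumption.

```python
POOL_TESTS = [
    "cross-tenant data leak",
    "tenant_id propagation",
    "noisy-neighbor mitigation",
    "quota enforcement",
]
SILO_TESTS = [
    "tenant-to-deployment routing",
    "per-deployment health",
    "deployment automation",
]

REQUIRED_TESTS = {
    "silo": SILO_TESTS,
    "pool": POOL_TESTS,
    "bridge": POOL_TESTS + [
        "database routing per tenant",
        "cross-database query rejection",
    ],
    "vertical": POOL_TESTS + SILO_TESTS + ["tier migration"],
}

UNIVERSAL_FLOOR = "cross-tenant data leak"


def required_tests(model: str) -> list[str]:
    """Return the test categories for a model, always including the floor."""
    tests = list(REQUIRED_TESTS[model])  # copy so callers cannot mutate the table
    if UNIVERSAL_FLOOR not in tests:
        tests.insert(0, UNIVERSAL_FLOOR)
    return tests
```

Calling `required_tests("silo")` therefore returns the three silo categories plus the cross-tenant leak suite, reflecting the shared management surfaces that silo deployments still have.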

Limitations

  • No model is leak-proof by construction. Silo defends against most cross-tenant leaks but inherits leak risk in any shared management surface (admin UI, billing). RLS protects the database but not application caches.
  • Cost vs isolation is a real trade-off. Microsoft's docs note that "if a single tenant requires a specific infrastructure cost, 100 tenants probably require 100 times that cost" in pure silo.
  • Compliance scope. Some regulators (e.g., FedRAMP High, certain healthcare regimes) effectively mandate silo for certain data classifications. Check counsel-of-record before assuming pool is acceptable.
  • Azure subscription / AWS account limits. Per Microsoft docs: "you're more likely to reach Azure resource scale limits when you have a shared set of infrastructure." Pool models hit account-level quotas faster than silo.

References