multiplayer-state-machine-coverage
Build a coverage matrix for a networked-game state machine that exercises connect / authority-handoff / disconnect / reconnect / host-migration paths across Unity Netcode for GameObjects, Unreal Engine replication, and Mirror Networking. Workflow: enumerate the engine's connection states + ownership states + replicated-property update rules, cross them against latency / loss / out-of-order packet injection, encode each combination as a test fixture, and emit a go / no-go gate. Use before submitting a multiplayer title to platform cert - Microsoft's cert guide lists 'Multiplayer does not work as expected' as one of the most common Hold reasons, and Xbox XR-067 (MPSD session state) is failed by uncovered state-machine paths.
Overview
Networked games are state machines: each player connection, each replicated entity, and each authority handoff transitions through a documented set of states. Cert failures and on-the-wire bugs overwhelmingly happen at transition edges - connect, host migration, disconnect-mid-action, reconnect-after-network-loss - not in the steady-state gameplay loop.
This skill is a build-an-X workflow: produce a state-machine coverage matrix that enumerates every connection + ownership + replication state in your engine, crosses it with the network fault matrix (latency / loss / out-of-order / disconnect), and emits a fixture list + a go / no-go gate.
Composes with:
When to use
Microsoft's Certification step-by-step guide explicitly cites "Multiplayer does not work as expected" as one of the three most common reasons titles are placed on Hold. Holds are calendar-week delays - coverage authored ahead of cert is the cheapest mitigation.
Inputs
Before walking the workflow, gather:
| Input | Where | Why |
|---|---|---|
| Engine + netcode stack | Project - Unity NGO / Unreal replication / Mirror | Determines the state vocabulary you enumerate |
| Topology - host / dedicated / listen | Game design doc | Listen-server has different states than dedicated server (see Unreal section below) |
| Max concurrent players (MaxPlayers) | Backend config / multiplayer service | Caps the fixture matrix |
| Persistence model - does the session resume after host-migration / suspend? | Game design + platform cert requirements | XR-067 requires the title maintain MPSD session state |
| Platform target - Xbox / PSN / Switch / Steam | Cert plan | Drives the XR / TRC / Lotcheck clauses to cover |
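The inputs above can be captured as a small record so the later steps read from one place. This is a minimal sketch, not part of any engine API: the class name `CoverageInputs`, its field names, and the allowed-value sets are assumptions chosen to mirror the table rows.

```python
from dataclasses import dataclass

# Allowed values mirror the Inputs table; the identifiers are illustrative.
ENGINES = {"unity-ngo", "unreal", "mirror"}
TOPOLOGIES = {"host", "dedicated", "listen"}
PLATFORMS = {"xbox", "psn", "switch", "steam"}


@dataclass(frozen=True)
class CoverageInputs:
    engine: str            # determines the state vocabulary
    topology: str          # listen and dedicated servers expose different states
    max_players: int       # caps the fixture matrix
    session_resumes: bool  # does the session survive host migration / suspend?
    platforms: tuple       # drives which cert clauses to cover

    def problems(self):
        """Return a list of human-readable problems; empty means usable."""
        found = []
        if self.engine not in ENGINES:
            found.append(f"unknown engine: {self.engine}")
        if self.topology not in TOPOLOGIES:
            found.append(f"unknown topology: {self.topology}")
        if self.max_players < 2:
            found.append("max_players must be at least 2 for multiplayer")
        if not self.platforms:
            found.append("no platform target given")
        unknown = sorted(set(self.platforms) - PLATFORMS)
        if unknown:
            found.append(f"unknown platforms: {unknown}")
        return found
```

Calling `problems()` before Step 1 surfaces a missing or misspelled input early, before any fixtures are generated from it.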
Workflow
Step 1 - Enumerate connection states
Per engine docs, list the states the engine exposes. The matrix is fixed by the framework - you cannot add or remove states, only choose which to cover.
Unity Netcode for GameObjects (per the v2.11 manual - NGO "is a high-level networking library built for Unity for you to abstract networking logic" with "Mono and IL2CPP" support and host / server / client topologies):
| State | Trigger | Observability |
|---|---|---|
| Disconnected | Initial / after disconnect | NetworkManager.IsConnectedClient == false |
| Connecting | NetworkManager.StartClient() invoked | Between request and approval |
| Connected (Approved) | Server accepts client | OnClientConnectedCallback |
| Connected (Pending Spawn) | Approved but player object not yet spawned | Wait for OnNetworkSpawn |
| Connected (Spawned) | NetworkObject.IsSpawned == true | Gameplay-ready |
| Disconnecting | Shutdown() / link loss | OnClientDisconnectCallback fires next |
| Host | Same process is both server + client | NetworkManager.IsHost |
Unreal Engine replication (per the Networking Overview: "The server, as the host of the game, holds the one, true, authoritative game state."):
| State | Trigger | Observability |
|---|---|---|
| NM_Standalone | Single-player | World->GetNetMode() |
| NM_DedicatedServer | "Separate machine with no local players" | IsRunningDedicatedServer() |
| NM_ListenServer | "Host machine where the server operator also plays locally" | IsRunningListenServer() |
| NM_Client | Connected as remote client | World->IsClient() |
| Login | AGameModeBase::PreLogin → Login → PostLogin | Override PostLogin |
| Travel (seamless / hard) | ServerTravel to new map | PlayerController->bIsClientReplicationPausedForFrame |
| Logout | Logout() callback | Override on GameModeBase |
Mirror Networking (per the Mirror docs on NetworkBehaviour, "a high level Networking library for Unity, optimized for ease of use & probability of success"):
| State | Trigger | Observability |
|---|---|---|
| OnStartServer | "called on server when a game object spawns on the server" | NetworkBehaviour override |
| OnStartClient | "called on clients when the game object spawns on the client" | NetworkBehaviour override |
| OnStartLocalPlayer | Local player only, after OnStartClient | NetworkBehaviour override |
| OnStartAuthority / OnStopAuthority | "Called when ownership changes" | NetworkBehaviour override |
| OnStopServer / OnStopClient | "Cleanup when objects are destroyed" | NetworkBehaviour override |
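The three tables above can be held as plain data so later steps can check coverage mechanically. This is a minimal sketch: the engine keys and the `uncovered_states` helper are illustrative names, and the state labels are copied from the tables, not read from the engines.

```python
# State vocabulary per engine, copied from the Step 1 tables.
CONNECTION_STATES = {
    "unity-ngo": [
        "Disconnected", "Connecting", "Connected (Approved)",
        "Connected (Pending Spawn)", "Connected (Spawned)",
        "Disconnecting", "Host",
    ],
    "unreal": [
        "NM_Standalone", "NM_DedicatedServer", "NM_ListenServer",
        "NM_Client", "Login", "Travel", "Logout",
    ],
    "mirror": [
        "OnStartServer", "OnStartClient", "OnStartLocalPlayer",
        "OnStartAuthority", "OnStopAuthority",
        "OnStopServer", "OnStopClient",
    ],
}


def uncovered_states(engine, covered):
    """Return the engine's states that no fixture claims to cover, in table order."""
    covered = set(covered)
    return [state for state in CONNECTION_STATES[engine] if state not in covered]
```

For example, `uncovered_states("unity-ngo", {"Disconnected", "Connecting"})` lists the five remaining NGO states that still need a fixture.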
Step 2 - Enumerate ownership states
Authority handoff is where most "ghost item" / "ability use after death" bugs live. Per the engine docs:
| Engine | Authority states |
|---|---|
| Unity NGO | OwnerClientId (per NetworkObject); IsOwner, IsServer, IsHost flags |
| Unreal | ROLE_Authority (server), ROLE_AutonomousProxy (owning client), ROLE_SimulatedProxy (other clients), ROLE_None |
| Mirror | isServer, isClient, isLocalPlayer, isOwned per Mirror NetworkBehaviour docs - "isOwned - client has authority over this object" |
Authority transitions to cover:
Step 3 - Enumerate replicated-property transitions
For every replicated property (NGO NetworkVariable<T>, Unreal UPROPERTY(Replicated) / RepNotify, Mirror [SyncVar]), identify:
Each OnRep_ / hook handler is a transition edge that needs at least one fixture exercising it.
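The "at least one fixture per hook" rule can be checked with a set difference. This is a sketch under assumptions: the hook names and fixture names below are hypothetical, and how a fixture declares which hooks it exercises (here a plain dict) is a project convention rather than an engine feature.

```python
def uncovered_hooks(hooks, fixtures):
    """Return hook names that no fixture exercises.

    hooks:    iterable of hook / OnRep_ handler names (transition edges)
    fixtures: dict mapping fixture name -> set of hook names it exercises
    """
    exercised = set().union(*fixtures.values())
    return sorted(set(hooks) - exercised)


# Hypothetical hook names for three replicated properties.
hooks = ["OnHealthChanged", "OnInventoryHashChanged", "OnQuestFlagsChanged"]
fixtures = {
    "Health_Replicates_UnderLoss": {"OnHealthChanged"},
    "Inventory_Resyncs_AfterReconnect": {"OnInventoryHashChanged"},
}
missing = uncovered_hooks(hooks, fixtures)  # hooks still lacking a fixture
```

Any name in `missing` is a transition edge with no fixture, which the gate in Step 7 should treat as uncovered.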
Step 4 - Cross with the network fault matrix
The engine state machines are deterministic on a perfect network. Real networks are not perfect. Cross-product each state from Step 1 with:
| Fault | How to inject |
|---|---|
| Latency 50 / 200 / 500 ms | OS-level traffic shaper (tc qdisc add dev eth0 root netem delay 200ms); Unity Multiplayer Tools' Network Simulator; Mirror's LatencySimulation component |
| Packet loss 1 / 5 / 20 % | tc qdisc … loss 5%; engine-specific simulator |
| Reordering | tc qdisc … delay 50ms reorder 25% (netem reordering requires a delay) |
| Connection drop | Detach NIC / kill UDP socket / engine-specific Disconnect() |
| Host suspend (console only) | Platform-specific suspend → resume |
| Host migration (where supported) | Force-quit host; verify new host election |
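For the OS-level shaper rows, the netem command line can be assembled from the fault parameters so that every fixture records the exact fault it ran under. A minimal sketch, assuming Linux `tc` with the netem qdisc; the function name and parameters are illustrative, and it only builds the argument list without executing it (running `tc` needs root).

```python
def netem_command(dev, delay_ms=None, loss_pct=None, reorder_pct=None):
    """Build a `tc qdisc add ... netem` argument list for one fault combination."""
    if reorder_pct is not None and delay_ms is None:
        # netem only reorders packets when a delay is also configured
        raise ValueError("reorder requires a delay")
    args = ["tc", "qdisc", "add", "dev", dev, "root", "netem"]
    if delay_ms is not None:
        args += ["delay", f"{delay_ms}ms"]
    if loss_pct is not None:
        args += ["loss", f"{loss_pct}%"]
    if reorder_pct is not None:
        args += ["reorder", f"{reorder_pct}%"]
    return args
```

The returned list can be passed to `subprocess.run` on a test host, or joined with spaces and logged alongside the fixture result.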
The full Cartesian product is too big - sample by risk-weighted buckets:
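One way to sketch that sampling in code is shown here. The bucket rule is an assumption made for illustration, not a rule from the engine docs or cert guides: a combination is high-risk when both its state and its fault are flagged, medium when one is, low otherwise. The keep fractions are likewise placeholders to tune per project.

```python
import math
from itertools import product

# Placeholder fractions of each bucket to turn into fixtures.
KEEP_FRACTION = {"high": 1.0, "medium": 0.5, "low": 0.1}


def bucket_combinations(states, faults, risky_states, risky_faults):
    """Split the state x fault product into high / medium / low risk buckets."""
    buckets = {"high": [], "medium": [], "low": []}
    for state, fault in product(states, faults):
        score = (state in risky_states) + (fault in risky_faults)
        buckets[("low", "medium", "high")[score]].append((state, fault))
    return buckets


def sample(buckets, keep=KEEP_FRACTION):
    """Deterministically keep the first ceil(fraction * size) items per bucket."""
    return {
        name: combos[: math.ceil(keep[name] * len(combos))]
        for name, combos in buckets.items()
    }
```

Taking a deterministic prefix instead of a random sample keeps the fixture list stable between runs, so a coverage drop in the report reflects a real change and not a different draw.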
Step 5 - Encode each combination as a fixture
For Unity NGO, the fixture is a UTF [UnityTest] PlayMode test (see unity-test-framework):
```csharp
[UnityTest]
public IEnumerator HostMigration_TransfersAuthority_OnHostDisconnect()
{
    // Arrange: host + 2 clients.
    // StartHost / StartClient / SpawnNpc are project-specific test helpers.
    var host = StartHost();
    yield return new WaitForSeconds(1f);
    var c1 = StartClient(); yield return new WaitForSeconds(0.5f);
    var c2 = StartClient(); yield return new WaitForSeconds(0.5f);

    // Spawn an authority-bearing object owned by the host
    var npc = host.SpawnNpc();
    yield return new WaitForSeconds(0.5f);
    Assert.AreEqual(host.LocalClientId, npc.OwnerClientId);
    var npcId = npc.NetworkObjectId; // capture before the host goes away

    // Act: kill the host
    host.Shutdown();

    // Assert: a surviving client becomes the new host within <= 5 s
    // and the NPC is still spawned under the new authority.
    yield return new WaitForSeconds(5f);
    Assert.IsTrue(c1.IsHost || c2.IsHost);
    // ContainsKey avoids the KeyNotFoundException the indexer would throw
    Assert.IsTrue(NetworkManager.Singleton.SpawnManager.SpawnedObjects.ContainsKey(npcId));
}
```

For Unreal, an unreal-automation-system spec wrapping IAutomationDriverModule doesn't drive netcode directly - instead, drive a multi-process test harness (UE 5.x's "Multi-User Editor" / Multi-Process PIE) and use specs to observe the resulting OnRep_ invocations.
For Mirror, the fixture is a UTF PlayMode test structured like the NGO one above, combined with Mirror's built-in network simulation transports.
Step 6 - Wire to platform-cert clauses
Map every fixture to a specific cert clause it covers. Examples from the Xbox Requirements page:
| Test fixture | Xbox XR covered |
|---|---|
| Client gracefully disconnects on Xbox network loss | XR-074: "Titles must gracefully handle errors with Xbox and partner services connectivity." |
| MPSD session state retains member list across host migration | XR-067: "titles with online multiplayer functionality must maintain session-state information on the Xbox network … through the Xbox Multiplayer Session Directory (MPSD)" |
| Joining via Xbox shell launches into multiplayer session | XR-064: "titles that offer joinable game sessions must enable joinability through the Xbox shell interface" |
| Privilege check before joining MP session | XR-045: XPRIVILEGE_MULTIPLAYER_SESSIONS (ID 254) per the XR-045 privilege table |
| Player communication respects privacy settings | XR-015: CommunicateUsingText / CommunicateUsingVoice privilege checks per the XR-015 permissions table |
| Save roams across console types within a generation | XR-130: "Ensure that saved games work across console types within the generation" |
| Cross-network play visual identification | XR-007: "Games must visually identify Xbox network users when playing with off-network players" |
| Controller disconnect mid-multiplayer | XR-115: re-establish active controller; see XR-115 |
For Sony TRC and Nintendo Lotcheck, the analogous clauses are NDA - cite by stable ID per platform-cert-overview-reference and tag the fixture with the partner-portal clause number.
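The fixture-to-clause mapping is easiest to keep honest when it is data that can be checked in both directions: clauses with no fixture, and fixtures with no clause. A minimal sketch; the fixture identifiers are hypothetical, and the clause IDs are the XR numbers from the table above.

```python
# Hypothetical fixture names mapped to the XR clauses they cover.
FIXTURE_CLAUSES = {
    "NetworkLoss_DisconnectsGracefully": ["XR-074"],
    "HostMigration_KeepsMpsdMemberList": ["XR-067"],
    "ShellJoin_LaunchesIntoSession": ["XR-064"],
    "Join_ChecksMultiplayerPrivilege": ["XR-045"],
    "TextChat_RespectsPrivacy": ["XR-015"],
    "Lobby_SmokeTest": [],  # not mapped to any clause yet
}

REQUIRED_CLAUSES = ["XR-007", "XR-015", "XR-045", "XR-064", "XR-067", "XR-074"]


def clauses_without_fixture(required, fixture_clauses):
    """Clauses in the cert plan that no fixture covers."""
    covered = {c for clauses in fixture_clauses.values() for c in clauses}
    return sorted(set(required) - covered)


def fixtures_without_clause(fixture_clauses):
    """Fixtures that are not tied to any cert clause."""
    return sorted(name for name, clauses in fixture_clauses.items() if not clauses)
```

A clause-level mapping like this is coarse: a clause can be listed as covered while one of its paths (for example a separate voice path under XR-015) has no fixture, so it can help to key on clause plus path where a clause names several privileges.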
Step 7 - Emit the go / no-go gate
Aggregate the matrix into a coverage report:
```text
Multiplayer state-machine coverage - MyGame v1.4.2
====================================================
Connection states enumerated:          7 / 7   ✓
Ownership transitions enumerated:      5 / 5   ✓
Replicated-property edges enumerated: 23 / 23  ✓
Fault-matrix coverage:
  High-risk bucket: 18 / 18 fixtures         ✓
  Medium bucket:     9 / 12 fixtures (75 %)  ⚠
  Low bucket:        3 / 4 fixtures (75 %)
Cert-clause coverage:
  XR-067 MPSD session state            ✓
  XR-074 Service connectivity loss     ✓
  XR-064 Joinable via shell            ✓
  XR-045 Privilege checks              ✓
  XR-015 Comm-privacy                  ⚠ (CommunicateUsingVoice path uncovered)
  XR-115 Controller add/remove mid-MP  ✓
  XR-130 Save roams across SKUs        ✓
VERDICT: NO-GO (XR-015 voice-privacy path uncovered;
         medium-risk bucket below 80 % threshold)
```

The gate refuses to advance to platform-cert submission until every cert-mapped clause is covered, the high-risk bucket is at 100 %, and the medium-risk bucket meets the 80 % threshold cited in the verdict.
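The gate rule can be expressed as a small pure function so the verdict is reproducible from the counts. This is a sketch, assuming the three conditions named in the report: high-risk bucket at 100 %, medium-risk bucket at or above an 80 % threshold, and no uncovered cert clause. The function name and argument shapes are illustrative.

```python
def gate_verdict(high, medium, uncovered_clauses, medium_threshold=0.80):
    """Return ("GO" | "NO-GO", reasons).

    high, medium:      (covered, total) fixture counts per bucket
    uncovered_clauses: clause IDs with at least one uncovered path
    """
    reasons = []
    high_done, high_total = high
    if high_done < high_total:
        reasons.append(f"high-risk bucket at {high_done} / {high_total}")
    med_done, med_total = medium
    med_ratio = med_done / med_total if med_total else 1.0
    if med_ratio < medium_threshold:
        reasons.append(
            f"medium-risk bucket at {med_ratio:.0%}, below {medium_threshold:.0%}"
        )
    for clause in sorted(uncovered_clauses):
        reasons.append(f"{clause} has an uncovered path")
    return ("NO-GO" if reasons else "GO", reasons)
```

Fed the numbers from the report (high 18 / 18, medium 9 / 12, XR-015 uncovered), the function returns NO-GO with two reasons, matching the verdict shown.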
Worked example - Unity NGO host migration
Inputs:
Step 1 - connection states from the NGO v2.11 manual: Disconnected, Connecting, Connected (Approved), Connected (Pending Spawn), Connected (Spawned), Disconnecting, Host.
Step 2 - ownership transitions: spawn → owner assigned; host quits → ownership re-elected; client picks up host-owned item.
Step 3 - replicated properties under coverage: currentHealth (NetworkVariable<float>), inventoryHash (NetworkVariable<int>), questFlags (NetworkVariable<NetworkSerializableQuestState>).
Step 4 - fault matrix selection: high-risk = Host → Connected (other client takes over) under each of {200 ms latency, 5 % loss, host-kill, host-suspend (Xbox)}.
Step 5 - encode each combination as a UTF PlayMode [UnityTest] (see code sample in Step 5 above).
Step 6 - map fixtures to cert clauses (per the XR list):
Step 7 - emit gate. Failing fixture: voice mute is reapplied after host migration only 4 / 5 runs (flake). Verdict: NO-GO, flake on the XR-015 voice path; needs a deterministic re-application path before cert.
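A 4 / 5 result only shows up when the fixture is run more than once, so the gate needs a repeated-run classifier. A minimal sketch; `run_repeated` and `classify` are illustrative names, and the fixture is assumed to be any callable that returns True on pass.

```python
def run_repeated(fixture, runs=5):
    """Run a fixture several times and collect pass/fail booleans."""
    return [bool(fixture()) for _ in range(runs)]


def classify(results):
    """Classify repeated results as 'pass', 'fail', or 'flaky'."""
    if not results:
        raise ValueError("no runs recorded")
    if all(results):
        return "pass"
    if not any(results):
        return "fail"
    return "flaky"  # mixed outcomes: treat as NO-GO until deterministic
```

Treating "flaky" as a failure at the gate matches the verdict above: a fixture that passes most of the time still leaves the cert path unproven.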
Anti-patterns
| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Testing only the happy path | Cert findings concentrate on transition edges | Cover every state × fault combination in the high-risk bucket |
| LAN-only multiplayer testing | Submission fails under WAN latency / loss | Inject latency + loss with tc qdisc or engine simulator |
| No host-migration coverage on titles that claim to support it | XR-067 fails mid-session | At least one fixture per supported migration path |
| Ignoring isOwned / IsOwner authority flags in tests | False positives (test passes because client mirrors authority anyway) | Assert isOwned (Mirror) / IsOwner (NGO) explicitly |
| Replication-property hook coverage by inspection only | OnRep_ doesn't fire if value unchanged - silent contracts | Tests that explicitly mutate the property and assert the hook ran |
| Coverage matrix only on engine states, not cert clauses | Passes internal QA, fails cert | Step 6 mapping is mandatory |
| Trusting "host migration works" without a deterministic election test | Election timing is racy | Bound the election window (e.g., new host elected within 5 s) and assert on it |
| Using [ClientRpc] for all communication | Bandwidth hog; non-reliable RPCs preferred for frequent calls per Networking Overview | Replicated properties for state; RPCs only for events |
| Voice chat covered only with text chat | Per XR-015 permission table, CommunicateUsingText and CommunicateUsingVoice are separate privileges | Test both paths independently |
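The "bound the election window" fix in the table amounts to polling a condition against a deadline instead of sleeping for a fixed time and hoping. A minimal harness-side sketch; `wait_until` is an illustrative helper, and the predicate stands in for whatever query the harness uses to ask whether a new host has been elected.

```python
import time


def wait_until(predicate, timeout_s, poll_s=0.05):
    """Poll predicate until it is true or timeout_s elapses.

    Returns True if the condition was met inside the window, else False.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A fixture would then assert `wait_until(new_host_elected, timeout_s=5.0)`, where `new_host_elected` is a hypothetical harness query, so the test fails precisely when the election exceeds the window and finishes early when it does not.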