sse-load-test
Load-tests SSE endpoints at scale with k6 - measures concurrent-stream capacity, connection churn, and server memory pressure. Covers the HTTP/1.1 6-connection-per-origin browser ceiling vs HTTP/2 multiplexing, a custom k6 SSE client built on ReadableStream, and threshold gates for TTFB and data throughput. Use when validating whether a server can sustain N concurrent EventSource connections without connection starvation or memory growth.
Load-tests SSE endpoints with k6, covering the HTTP/1.1 connection-ceiling problem that server-sent-events-tests flags but does not exercise under load. The sibling skill verifies correctness (format, Last-Event-ID, readyState); this skill drives concurrent virtual users against the same endpoint to find capacity limits, connection churn costs, and memory growth under sustained streams.
When to use
Overview: the HTTP/1.1 connection ceiling
Browsers cap concurrent HTTP/1.1 connections per origin at about 6; Chrome, Firefox, and Safari all ship a default in this range. The WHATWG SSE spec authoring notes acknowledge this directly: "Clients that support HTTP's per-server connection limitation might run into trouble when opening multiple pages from a site if each page has an EventSource to the same domain."
Over HTTP/1.1, each EventSource holds one connection open for the lifetime of the stream. A page that opens several EventSource objects, or a user with several tabs open, can exhaust the pool and stall every other request to the same origin. HTTP/2 relaxes this constraint: RFC 9113 Section 5 specifies that "a single HTTP/2 connection can contain multiple concurrently open streams, with either endpoint interleaving frames from multiple streams," so EventSource connections to an HTTP/2 origin can share one TCP connection. The bound then becomes the number of concurrent streams the endpoints advertise (SETTINGS_MAX_CONCURRENT_STREAMS) rather than the per-origin connection cap.
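k6 uses Go's net/http transport, which negotiates HTTP/2 over TLS when the server offers it and does not enforce a browser-style per-origin limit. VU concurrency is tunable, which makes k6 suitable for finding the server-side ceiling independently of browser caps. Each VU keeps its own connections, so N VUs produce roughly N TCP connections to the server on either protocol version.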
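To make the ceiling concrete, here is a minimal sketch of the slot arithmetic. The helper name `remainingSlots` is hypothetical, the default of 6 is the usual HTTP/1.1 browser cap, and the value 100 is only an illustrative HTTP/2 concurrent-stream limit. It models the bookkeeping, not real browser scheduling.

```javascript
// Hypothetical helper: how many request slots remain for an origin once
// `openStreams` long-lived SSE connections are held open.
// `limit` is an assumption: 6 is the common HTTP/1.1 per-origin browser cap;
// on HTTP/2 the relevant bound is the negotiated max concurrent streams.
function remainingSlots(openStreams, limit = 6) {
  return Math.max(0, limit - openStreams);
}

// Six tabs with one EventSource each over HTTP/1.1: nothing is left for
// ordinary requests to the same origin, so they queue.
console.log(remainingSlots(6)); // 0

// The same six streams over HTTP/2, assuming 100 concurrent streams allowed.
console.log(remainingSlots(6, 100)); // 94
```

This is why a browser-side symptom (stalled requests) and a server-side capacity limit need to be tested separately: k6 only exercises the second.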
Step 1 - Install k6
Follow the k6 installation guide for your OS. Verify:
k6 version
The k6/experimental/streams module used below requires k6 v0.54.0 or later (see experimental/streams release note).
Step 2 - Write a k6 SSE client
k6 does not ship a native SSE protocol module. Build one using ReadableStream from k6/experimental/streams, which "provides a way to define and consume streams of data within your test scripts" and lets you "start processing raw data with Javascript bit by bit, as soon as it's available, without needing to generate a full in-memory representation."
// sse-load.js
import http from 'k6/http';
import { check } from 'k6';
import { Counter, Trend } from 'k6/metrics';

// Target URL; override with `k6 run -e TARGET=...` (see Step 7).
const TARGET = __ENV.TARGET || 'http://localhost:3000/api/events';

// Custom metrics
const sseEvents = new Counter('sse_events_received');
const eventLag = new Trend('sse_event_lag_ms');

export const options = {
  scenarios: {
    // constant-vus holds N virtual users open for the full duration;
    // each VU maps to one persistent SSE connection.
    // See https://grafana.com/docs/k6/latest/using-k6/scenarios/executors/constant-vus/
    sustained_streams: {
      executor: 'constant-vus',
      vus: 50,
      duration: '60s',
    },
  },
  thresholds: {
    // TTFB: first byte of the event stream arrives within 500 ms for 95% of requests
    // Metric reference: https://grafana.com/docs/k6/latest/using-k6/metrics/reference/
    'http_req_waiting': ['p(95)<500'],
    // Connection-slot starvation signal: time blocked waiting for a free TCP
    // slot should be negligible
    'http_req_blocked': ['p(99)<100'],
    // Overall error rate
    'http_req_failed': ['rate<0.01'],
  },
};

export default function () {
  const startMs = Date.now();
  // Long-lived GET. k6/http buffers the whole body and returns only when the
  // server ends the response or the timeout fires, so the endpoint must close
  // the stream within the timeout below (k6 default is 60 s per
  // https://grafana.com/docs/k6/latest/javascript-api/k6-http/params/ ).
  const res = http.get(TARGET, {
    headers: { Accept: 'text/event-stream', 'Cache-Control': 'no-cache' },
    timeout: '70s',
  });
  check(res, {
    'status 200': (r) => r.status === 200,
    'correct content-type': (r) =>
      (r.headers['Content-Type'] || '').includes('text/event-stream'),
  });
  // Parse the buffered body. For a stream-shaped parse, see Step 3.
  const lines = String(res.body || '').split('\n');
  let count = 0;
  for (const line of lines) {
    if (line.startsWith('data:')) {
      // Counts data lines, so a multi-line event is counted more than once.
      count++;
      // Recorded after the response completed: this is time since request
      // start at parse time, not true per-event delivery lag.
      eventLag.add(Date.now() - startMs);
    }
  }
  sseEvents.add(count);
}
http_req_waiting is "time spent waiting for response from remote host (a.k.a. 'time to first byte', or 'TTFB')" per the k6 metrics reference. It is the primary latency signal for streaming endpoints because the client must receive the first byte before any event is delivered.
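The loop in Step 2 counts `data:` lines rather than events. A sketch of a parser that groups fields into events is shown below; the function name `parseSseBody` and the returned object shape are assumptions made for illustration. It follows the event-stream field rules (blank line dispatches, `:` starts a comment, multiple `data` lines join with a newline) and ignores `retry`.

```javascript
// Hypothetical helper: parse a complete text/event-stream body into events.
function parseSseBody(body) {
  const events = [];
  let data = [];
  let eventType = 'message';
  let lastId = '';
  for (const line of body.split(/\r\n|\n|\r/)) {
    if (line === '') {
      // Blank line: dispatch the pending event if it carried any data.
      if (data.length > 0) {
        events.push({ type: eventType, data: data.join('\n'), id: lastId });
      }
      data = [];
      eventType = 'message';
      continue;
    }
    if (line.startsWith(':')) continue; // comment or heartbeat
    const colon = line.indexOf(':');
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? '' : line.slice(colon + 1);
    if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
    if (field === 'data') data.push(value);
    else if (field === 'event') eventType = value;
    else if (field === 'id') lastId = value; // persists across events
  }
  return events;
}

const sample = 'id: 1\nevent: tick\ndata: a\ndata: b\n\n: ping\n\ndata: c\n\n';
console.log(parseSseBody(sample).length); // 2
```

In a k6 script, `sseEvents.add(parseSseBody(String(res.body || '')).length)` would then count events instead of lines.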
Step 3 - Streaming parse with ReadableStream (optional)
For long-lived streams, k6's http module still returns the body only once the response ends, so events cannot be consumed as they arrive without an extension. Wrapping the buffered body in a ReadableStream keeps the parsing code stream-shaped: events are pulled one at a time, parsing stops after N events, and the same consumer loop can later be fed by a streaming transport:
import http from 'k6/http';
import { ReadableStream } from 'k6/experimental/streams';
import { Counter } from 'k6/metrics';

const sseEvents = new Counter('sse_events_received');
const TARGET_EVENTS = 10; // stop after this many events, then move on

export default async function () {
  // Open the SSE connection. The call returns when the server ends the
  // response or the timeout fires; the body is buffered until then.
  const res = http.get('http://localhost:3000/api/events', {
    headers: { Accept: 'text/event-stream' },
    timeout: '120s',
  });

  // Split the buffered body into blank-line-separated blocks and keep only
  // those that carry at least one data field (skips comments and heartbeats).
  const blocks = String(res.body || '')
    .split(/\r?\n\r?\n/)
    .filter((b) => b.split(/\r?\n/).some((l) => l.startsWith('data:')));
  const limit = Math.min(blocks.length, TARGET_EVENTS);
  let next = 0;

  const stream = new ReadableStream({
    pull(controller) {
      // k6's http module does not expose a streaming body reader natively.
      // With a streaming transport (for example an xk6 extension), enqueue
      // each network chunk here instead of slices of the buffered body.
      if (next >= limit) {
        controller.close();
        return;
      }
      controller.enqueue(blocks[next++]);
    },
  });

  const reader = stream.getReader();
  while (true) {
    const { done } = await reader.read();
    if (done) break;
    sseEvents.add(1); // one enqueued block is one event
  }
}
Note: ReadableStream is experimental and "may introduce breaking changes in future releases" per the k6 streams docs. Pin your k6 version in CI.
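If a streaming transport is available, chunks will not align with event boundaries. The following sketch shows one way to carry partial data across chunks; `createSseChunkParser` is a hypothetical name, and the parser returns only the `data` payloads of completed events. It holds back a trailing carriage return so that a CRLF split across two chunks is not mistaken for a blank line.

```javascript
// Hypothetical incremental parser: feed it chunks as they arrive and it
// returns the data payloads of the events completed so far.
function createSseChunkParser() {
  let pending = ''; // text received but not yet terminated by a blank line
  return {
    push(chunk) {
      pending += chunk;
      // Normalise CRLF and lone CR to LF, but hold back a trailing CR
      // because its LF partner may arrive in the next chunk.
      const holdCr = pending.endsWith('\r');
      let text = (holdCr ? pending.slice(0, -1) : pending).replace(/\r\n?/g, '\n');
      const completed = [];
      let boundary;
      while ((boundary = text.indexOf('\n\n')) !== -1) {
        const block = text.slice(0, boundary);
        text = text.slice(boundary + 2);
        const data = block
          .split('\n')
          .filter((line) => line.startsWith('data:'))
          .map((line) => line.slice(5).replace(/^ /, ''));
        if (data.length > 0) completed.push(data.join('\n'));
      }
      pending = text + (holdCr ? '\r' : '');
      return completed;
    },
  };
}

const parser = createSseChunkParser();
console.log(parser.push('data: he')); // []
console.log(parser.push('llo\n\ndata: x\r')); // [ 'hello' ]
console.log(parser.push('\n\r\n')); // [ 'x' ]
```

Inside a ReadableStream consumer loop, each value read from the stream would be passed to `parser.push` and the returned payloads counted.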
Step 4 - Measure concurrent-stream capacity
Ramp VUs upward in stages to find the inflection point where http_req_blocked climbs (connections take longer to obtain, for example because the server's accept queue is full) or data_received per VU drops (back-pressure):
export const options = {
  scenarios: {
    ramp_streams: {
      executor: 'ramping-vus',
      startVUs: 10,
      stages: [
        { duration: '30s', target: 100 },
        { duration: '60s', target: 500 },
        { duration: '60s', target: 1000 },
        { duration: '30s', target: 0 },
      ],
    },
  },
  thresholds: {
    'http_req_waiting': ['p(95)<500'],
    'http_req_blocked': ['p(99)<100'],
    'http_req_failed': ['rate<0.02'],
    'data_received': ['count>0'],
  },
};
vus (current active VUs) and vus_max (maximum possible number of VUs) appear in the k6 summary and Grafana dashboard automatically; no custom metric is needed (k6 metrics reference).
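A ramp that never pauses makes it hard to tell which load level caused a regression. One option, sketched here with a hypothetical helper named `buildStages`, is to generate stages that ramp to each target and then hold it, so every level reaches a steady state before the next ramp begins. The ramp and hold durations are assumptions to tune per system.

```javascript
// Hypothetical helper: build ramping-vus stages that ramp to each target
// and then hold it, ending with a ramp down to zero.
function buildStages(targets, rampSeconds, holdSeconds) {
  const stages = [];
  for (const target of targets) {
    stages.push({ duration: `${rampSeconds}s`, target }); // ramp up
    stages.push({ duration: `${holdSeconds}s`, target }); // hold plateau
  }
  stages.push({ duration: `${rampSeconds}s`, target: 0 }); // ramp down
  return stages;
}

console.log(JSON.stringify(buildStages([100, 500], 30, 60)));
```

The returned array can be assigned to the `stages` field of a ramping-vus scenario.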
Step 5 - Measure connection churn
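Connection churn (clients that connect, receive a few events, then disconnect and reconnect) stresses the server's connection-setup path more than a stable pool does. Model churn with short-lived iterations started at a fixed arrival rate:
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  // Open a fresh connection on every iteration instead of reusing the VU's
  // keep-alive connection; otherwise the setup path is not exercised.
  noVUConnectionReuse: true,
  scenarios: {
    churn: {
      executor: 'constant-arrival-rate',
      rate: 20, // 20 new SSE connections per second
      timeUnit: '1s',
      duration: '60s',
      preAllocatedVUs: 40, // enough only while iterations finish within 2 s
    },
  },
};

export default function () {
  // Each iteration: connect, read until the server ends the stream or the
  // timeout fires, then disconnect. k6/http cannot abort mid-body, so the
  // endpoint should close after a few events for this to model churn;
  // requests cut off by the timeout are reported as failed.
  const res = http.get('http://localhost:3000/api/events', {
    headers: { Accept: 'text/event-stream' },
    timeout: '10s',
  });
  check(res, { 'status 200': (r) => r.status === 200 });
}
http_req_connecting ("time spent establishing TCP connection to the remote host") reveals whether TCP connection setup dominates at high churn rates; TLS handshake time is reported separately as http_req_tls_handshaking (k6 metrics reference).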
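With an arrival-rate executor, the number of VUs busy at once is roughly the arrival rate multiplied by the iteration duration. The sketch below applies that rule of thumb; the helper name `requiredVUs` and the 25% default headroom are assumptions, not k6 settings.

```javascript
// Hypothetical helper: estimate how many VUs a constant-arrival-rate
// scenario needs so that iterations are not dropped for lack of a free VU.
function requiredVUs(ratePerSecond, iterationSeconds, headroom = 1.25) {
  return Math.ceil(ratePerSecond * iterationSeconds * headroom);
}

// 20 connections/s, each lasting 2 s, no headroom.
console.log(requiredVUs(20, 2, 1)); // 40
// Same rate, but each connection lasts the full 10 s timeout.
console.log(requiredVUs(20, 10, 1)); // 200
// With the default 25% headroom.
console.log(requiredVUs(20, 10)); // 250
```

If the churn scenario reports a non-zero dropped_iterations metric, the preallocated VU pool is too small for the observed iteration duration.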
Step 6 - Run and interpret results
k6 run sse-load.js
Key output fields:
| Metric | What it tells you |
|---|---|
| http_req_waiting p(95) | TTFB for the first event; high values suggest server event-loop saturation |
| http_req_blocked p(99) | Time waiting for a free TCP slot; spikes suggest connection exhaustion |
| http_req_connecting avg | Per-connection TCP setup cost; high values under churn point to connection-establishment overhead (TLS time is in http_req_tls_handshaking) |
| data_received total | Aggregate byte throughput; divide by duration and VU count for per-stream rate |
| sse_events_received | Custom counter; divide by the VU count to verify events are flowing to all streams |
| http_req_failed rate | Unexpected closes or non-200 responses under load |
A passing run shows http_req_blocked near 0 (no TCP slot contention), http_req_waiting within TTFB threshold, and sse_events_received growing linearly with VU count.
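The per-stream figures mentioned in the table are simple divisions. As a small sketch, assuming a hypothetical helper `perStreamStats` and the totals from a 50-VU, 60-second run with 14 MB received and 4800 events counted:

```javascript
// Hypothetical helper: derive per-stream figures from run totals.
function perStreamStats({ bytesReceived, eventsReceived, durationSeconds, vus }) {
  return {
    bytesPerSecondPerStream: bytesReceived / durationSeconds / vus,
    eventsPerStream: eventsReceived / vus,
  };
}

const stats = perStreamStats({
  bytesReceived: 14e6, // 14 MB total
  eventsReceived: 4800,
  durationSeconds: 60,
  vus: 50,
});
console.log(Math.round(stats.bytesPerSecondPerStream)); // 4667
console.log(stats.eventsPerStream); // 96
```

A per-stream event count that falls as VUs increase is a sign that some streams are being starved even when the aggregate counter keeps growing.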
Step 7 - HTTP/1.1 vs HTTP/2 comparison
Run the same scenario against the HTTP/1.1 and HTTP/2 endpoints, with the script reading its target URL from __ENV.TARGET. k6 negotiates HTTP/2 only over TLS, so the HTTP/2 target must be an https URL. On HTTP/1.1, http_req_blocked will rise as concurrent VUs approach the server's accept queue depth. In a browser, HTTP/2 lets all streams to an origin share one multiplexed TCP connection (per RFC 9113), which avoids repeated TCP+TLS handshakes. k6 VUs do not share connections, so each VU still opens its own connection on HTTP/2; expect the difference between the two runs to show up mainly in server-side resource use and handshake timings rather than as a collapse in connection count.
# HTTP/1.1 target
k6 run -e TARGET=http://localhost:3000/api/events sse-load.js
# HTTP/2 target (same VU count)
k6 run -e TARGET=https://localhost:3000/api/events sse-load.js
Compare the http_req_connecting, http_req_tls_handshaking, and http_req_blocked summaries.
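Comparing two summaries by eye is error-prone. A minimal sketch of a side-by-side diff follows; the function name `compareRuns` and the flat input shape (metric label mapped to milliseconds) are assumptions, and the numbers are placeholders for illustration, not measurements.

```javascript
// Hypothetical helper: diff two runs given as flat objects of
// metric label -> value in milliseconds.
function compareRuns(http1, http2) {
  return Object.keys(http1)
    .filter((name) => name in http2)
    .map((name) => ({
      metric: name,
      http1: http1[name],
      http2: http2[name],
      delta: http2[name] - http1[name], // negative means the HTTP/2 run was lower
    }));
}

// Placeholder values, not real results.
const rows = compareRuns(
  { 'http_req_blocked p(99)': 40, 'http_req_connecting avg': 5 },
  { 'http_req_blocked p(99)': 10, 'http_req_connecting avg': 5 }
);
console.log(rows);
```

The values would be copied from the end-of-test summary of each run.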
Example output (passing run, 50 VUs, HTTP/2)
scenarios: (100.00%) 1 scenario, 50 max VUs, 1m30s max duration
default: 50 looping VUs for 1m0s (gracefulStop: 30s)
http_req_blocked............: avg=1.2ms p(99)=8ms
http_req_connecting.........: avg=3.1ms p(95)=12ms
http_req_waiting............: avg=42ms p(95)=180ms
http_req_failed.............: 0.00%
data_received...............: 14 MB 230 kB/s
sse_events_received.........: 4800
thresholds:
http_req_waiting p(95)<500 - OK
http_req_blocked p(99)<100 - OK
http_req_failed rate<0.01 - OK
Anti-patterns
| Anti-pattern | Problem | Fix |
|---|---|---|
| Single short-duration iteration per VU | Does not model a persistent connection; misses steady-state memory growth | Use constant-vus with a multi-minute duration |
| No TTFB threshold | First-byte latency regression goes undetected | Gate http_req_waiting p(95) |
| Ignoring http_req_blocked | TCP slot exhaustion masked by passing error rate | Add http_req_blocked p(99) threshold |
| Testing HTTP/1.1 only | Misses multiplexing benefit; may over-provision TCP connections | Run the scenario against both HTTP/1.1 and HTTP/2 (Step 7) |
| Hard-coding VU count without a ramp | Misses the capacity cliff; first overload is discovered in production | Use ramping-vus to find the inflection point (Step 4) |