apankov1

websocket-client-resilience

"Client-side WebSocket resilience patterns: backoff with jitter, circuit breakers, heartbeat hysteresis, command acknowledgment, sequence gap detection, and mobile-aware timeouts."

apankov1 5 Updated 3mo ago

Install

npx skillscat add apankov1/quality-engineering/websocket-client-resilience

Install via the SkillsCat registry.

SKILL.md

WebSocket Client Resilience

Six resilience patterns for WebSocket clients, extracted from real-world mobile network conditions.

Mobile WebSocket connections fail in ways that local development environments don't surface. P99 latency on 4G networks can reach 5-8 seconds, so a 5-second health check timeout reports a live connection as dead whenever the network is merely slow.

When to use: implementing WebSocket client reconnection logic, building real-time features on persistent connections, handling WebSockets in mobile apps, or writing any client that maintains long-lived server connections.

When not to use: Server-side WebSocket handlers, HTTP request/response patterns, Server-Sent Events (SSE).

Rationalizations (Do Not Skip)

| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "Our users are on fast networks" | Mobile users exist. Even desktop WiFi has transient blips. | Test with throttled networks |
| "Simple retry is enough" | Without jitter, all clients retry at once after an outage | Add randomized jitter |
| "One missed heartbeat means disconnected" | Network blips last 1-3 seconds. Single miss = false positive. | Use hysteresis (2+ misses) |
| "We'll add resilience later" | Reconnection logic is foundational. Retrofitting it is much harder. | Build it in from the start |
| "5 seconds is plenty of timeout" | Mobile P99 is 5-8s. That "timeout" is normal latency for mobile. | Use 10s+ for mobile |
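The heartbeat row above can be illustrated with a small miss counter. This is a minimal sketch, not the skill's shouldDisconnect() implementation: the names HeartbeatState, recordHeartbeat, and missedTooMany are hypothetical, and the threshold of 2 is taken from the "2+ misses" guidance in the table.

```typescript
// Hypothetical sketch of heartbeat hysteresis; names are illustrative only.
interface HeartbeatState {
  consecutiveMisses: number; // heartbeats missed in a row since the last success
}

// Assumed threshold, from the "2+ misses" guidance in the table.
const MISS_THRESHOLD = 2;

// Returns the next state: a received heartbeat resets the counter,
// a missed one increments it.
function recordHeartbeat(state: HeartbeatState, received: boolean): HeartbeatState {
  return { consecutiveMisses: received ? 0 : state.consecutiveMisses + 1 };
}

// True only once the miss counter reaches the threshold, so a single
// 1-3 second blip does not tear the connection down.
function missedTooMany(state: HeartbeatState, threshold: number = MISS_THRESHOLD): boolean {
  return state.consecutiveMisses >= threshold;
}
```

With this shape, one missed heartbeat leaves the connection up, a second consecutive miss triggers a disconnect, and any successful heartbeat in between resets the count.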

Included Utilities

// WebSocket resilience pattern implementations (zero dependencies)
import {
  getBackoffDelay,
  circuitBreakerTransition,
  shouldDisconnect,
  CommandAckTracker,
  detectSequenceGap,
  classifyTimeout,
} from './resilience.ts';

Quick Reference

| Pattern | Detect | Fix | Severity |
|---|---|---|---|
| Backoff without jitter | Math.pow(2, attempt) without Math.random() | Add +/- 25% jitter | must-fail |
| No circuit breaker | Reconnect without failure counter | Trip after 5 failures, 60s cooldown | must-fail |
| Single heartbeat miss | setTimeout disconnect without miss counter | Require 2+ missed heartbeats | should-fail |
| No command ack | ws.send() without commandId tracking | Track pending commands, timeout at 30s | nice-to-have |
| No sequence tracking | onmessage without sequence check | Track lastReceivedSequence, detect gaps | nice-to-have |
| Short mobile timeout | Health timeout < 10s | Use 10s+ for all health checks | must-fail |
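The first two rows can be sketched together as a reconnect policy. This is an illustration under stated assumptions, not the code behind getBackoffDelay() or circuitBreakerTransition(): every name below is hypothetical, the 1-second base delay and 30-second cap are assumed values, and only the +/- 25% jitter, the 5-failure trip point, and the 60-second cooldown come from the table.

```typescript
// Hypothetical sketch; function and type names are illustrative only.
interface BackoffOptions {
  baseMs: number;      // delay before the first retry (assumed value)
  maxMs: number;       // cap on the un-jittered delay (assumed value)
  jitterRatio: number; // 0.25 means +/- 25%
}

const DEFAULT_BACKOFF: BackoffOptions = { baseMs: 1000, maxMs: 30000, jitterRatio: 0.25 };

// Exponential backoff with symmetric jitter. `random` is injectable so the
// result can be made deterministic in tests.
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions = DEFAULT_BACKOFF,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt));
  const jitter = (random() * 2 - 1) * opts.jitterRatio; // in [-ratio, +ratio)
  return Math.round(exponential * (1 + jitter));
}

type BreakerState =
  | { kind: 'closed'; failures: number }
  | { kind: 'open'; openedAtMs: number };

const FAILURE_THRESHOLD = 5; // from the table: trip after 5 failures
const COOLDOWN_MS = 60000;   // from the table: 60s cooldown

// A failure while closed counts toward the trip point; a failure while open
// (a failed probe after the cooldown) restarts the cooldown.
function onConnectFailure(state: BreakerState, nowMs: number): BreakerState {
  if (state.kind === 'open') return { kind: 'open', openedAtMs: nowMs };
  const failures = state.failures + 1;
  return failures >= FAILURE_THRESHOLD
    ? { kind: 'open', openedAtMs: nowMs }
    : { kind: 'closed', failures };
}

// Any successful connection closes the breaker and clears the counter.
function onConnectSuccess(): BreakerState {
  return { kind: 'closed', failures: 0 };
}

// Reconnect attempts are allowed while closed, or once the cooldown has elapsed.
function mayAttempt(state: BreakerState, nowMs: number): boolean {
  return state.kind === 'closed' || nowMs - state.openedAtMs >= COOLDOWN_MS;
}
```

With the jitter source pinned to 0.5, backoffDelayMs(3, DEFAULT_BACKOFF, () => 0.5) returns 8000, the un-jittered delay for the fourth attempt. In production the default Math.random spreads reconnecting clients across roughly 6 to 10 seconds for that attempt, rather than having them all retry at the same instant.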

Coverage

| Pattern | Utility | Status |
|---|---|---|
| 1. Backoff with jitter | getBackoffDelay() | Code + tests |
| 2. Circuit breaker | circuitBreakerTransition() | Code + tests |
| 3. Heartbeat hysteresis | shouldDisconnect() | Code + tests |
| 4. Command acknowledgment | CommandAckTracker | Code + tests |
| 5. Sequence gap detection | detectSequenceGap() | Code + tests |
| 6. Mobile-aware timeouts | classifyTimeout() | Code + tests |

All 6 patterns have executable utilities and tests.
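For patterns 4 and 5, a rough sketch of the underlying ideas follows. It does not reproduce CommandAckTracker or detectSequenceGap(), whose signatures are not shown here. PendingCommands and checkSequence are hypothetical names, timestamps are passed in explicitly to keep the logic testable, and the sketch assumes the server stamps each message with an integer sequence number that increases by one.

```typescript
// Hypothetical sketch; class and function names are illustrative only.
class PendingCommands {
  private readonly sentAt = new Map<string, number>();
  private readonly timeoutMs: number;

  // 30s default mirrors the "timeout at 30s" guidance in the Quick Reference.
  constructor(timeoutMs: number = 30000) {
    this.timeoutMs = timeoutMs;
  }

  // Call right before ws.send() with the command's id and the current time.
  track(commandId: string, nowMs: number): void {
    this.sentAt.set(commandId, nowMs);
  }

  // Returns true if the ack matched a command that was still pending.
  acknowledge(commandId: string): boolean {
    return this.sentAt.delete(commandId);
  }

  // Removes and returns the ids whose ack did not arrive within the timeout.
  expire(nowMs: number): string[] {
    const expired: string[] = [];
    this.sentAt.forEach((sent, id) => {
      if (nowMs - sent >= this.timeoutMs) expired.push(id);
    });
    expired.forEach((id) => this.sentAt.delete(id));
    return expired;
  }
}

type SequenceCheck =
  | { kind: 'in-order' }
  | { kind: 'duplicate' }
  | { kind: 'gap'; missingFrom: number; missingTo: number };

// Compares an incoming sequence number with the last one received.
function checkSequence(lastReceived: number, incoming: number): SequenceCheck {
  if (incoming <= lastReceived) return { kind: 'duplicate' };
  if (incoming === lastReceived + 1) return { kind: 'in-order' };
  return { kind: 'gap', missingFrom: lastReceived + 1, missingTo: incoming - 1 };
}
```

A caller would typically run expire() on a timer and surface the returned ids as failed commands. On a 'gap' result it can request the missing range from the server, or fall back to a full state resync.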

Framework Adaptation

These patterns are framework-agnostic. They work with:

  • Browser: Native WebSocket, Socket.IO client
  • React/Vue/Svelte: Wrap in composable/hook
  • React Native / Flutter: Same patterns, different APIs
  • Node.js: ws library for server-to-server WebSocket clients

The core principle: real-world network conditions are more variable than controlled environments. Design for mobile latency, not localhost.
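One way to encode "design for mobile latency" is to classify health-check round-trip times against explicit thresholds instead of a single short cutoff. The sketch below is an assumption-laden illustration, not the skill's classifyTimeout(). classifyLatency and its types are hypothetical, the 5-second "slow" mark is an assumed value, and the 10-second failure threshold follows the "10s+" guidance from the Quick Reference.

```typescript
// Hypothetical sketch; names and the "slow" threshold are assumptions.
type LatencyClass = 'healthy' | 'slow' | 'timed-out';

interface TimeoutThresholds {
  slowMs: number;    // at or above this, the reply is late but the link is alive
  timeoutMs: number; // at or above this, the health check counts as failed
}

const MOBILE_THRESHOLDS: TimeoutThresholds = { slowMs: 5000, timeoutMs: 10000 };

// Classifies how long a health check took to be answered.
function classifyLatency(
  elapsedMs: number,
  thresholds: TimeoutThresholds = MOBILE_THRESHOLDS,
): LatencyClass {
  if (elapsedMs >= thresholds.timeoutMs) return 'timed-out';
  if (elapsedMs >= thresholds.slowMs) return 'slow';
  return 'healthy';
}
```

Under these thresholds a 7-second reply, which falls inside the 5-8 second mobile P99 range, is reported as 'slow' rather than treated as a dead connection.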

See patterns.md for full before/after code examples and detection commands for each pattern.