Incidents

Post-incident reports for service disruptions.


2026-02-27 — Game State Loss (Redis Restart)

Duration: ~06:37 UTC until manual detection
Severity: High
Impact: All in-progress games lost

What happened

Redis restarted during an unattended-upgrades cycle at 06:37 UTC. Since game state lived only in Redis with no durable backup, every active game was wiped. The cleanup cron then marked all affected tables as expired.

Decks, accounts, and match history were unaffected (stored in Postgres).

Root cause

  • Redis had no persistence (no RDB/AOF) — a restart means total data loss
  • No backup layer existed for game state
  • needrestart was configured to auto-restart services including Redis
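
The first root cause can be caught before an incident by inspecting the Redis configuration. The following is a minimal sketch, not the tooling used here: `parse_persistence` and `has_persistence` are hypothetical helper names, and the code only looks at explicit `save` and `appendonly` directives in the text of a redis.conf file. A file with no `save` line at all may still get Redis's built-in default snapshot rules, which this sketch does not model.

```python
from typing import List, Tuple

Rule = Tuple[int, int]  # (seconds, minimum number of changes)


def parse_persistence(conf_text: str) -> Tuple[List[Rule], bool]:
    """Return (rdb_rules, aof_enabled) from explicit redis.conf directives."""
    rules: List[Rule] = []
    aof_enabled = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        key, args = parts[0].lower(), parts[1:]
        if key == "save":
            if args in (['""'], ["''"]):
                rules = []  # an empty-string argument disables RDB snapshots
            else:
                # arguments come in <seconds> <changes> pairs
                for i in range(0, len(args) - 1, 2):
                    rules.append((int(args[i]), int(args[i + 1])))
        elif key == "appendonly":
            aof_enabled = bool(args) and args[0].lower() == "yes"
    return rules, aof_enabled


def has_persistence(conf_text: str) -> bool:
    """True if the config explicitly enables RDB snapshots or AOF."""
    rules, aof_enabled = parse_persistence(conf_text)
    return bool(rules) or aof_enabled
```

A check like this could run in a deploy script or a monitoring job, failing loudly when `has_persistence` returns False for a Redis instance that holds state with no other durable copy.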

Remediation

  • Disabled unattended-upgrades to prevent uncontrolled service restarts
  • Enabled Redis RDB persistence (save 60 1, which snapshots after 60 seconds if at least one key changed), reducing worst-case data loss from "entire game" to roughly "last 60 seconds of actions"
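
To illustrate why `save 60 1` bounds the loss window at roughly 60 seconds, here is a simplified model of how an RDB save rule is evaluated. It is a sketch under stated assumptions, not Redis's actual implementation: `snapshot_due` is a hypothetical name, and the model ignores the time the snapshot itself takes to write and any retry delays after a failed save.

```python
from typing import Iterable, Tuple

Rule = Tuple[int, int]  # (seconds, minimum number of changes)


def snapshot_due(elapsed_s: float, changes: int, rules: Iterable[Rule]) -> bool:
    """True if any rule has both its time and its change threshold met."""
    return any(
        elapsed_s >= seconds and changes >= min_changes
        for seconds, min_changes in rules
    )


rules = [(60, 1)]  # models the directive: save 60 1

print(snapshot_due(30, 5, rules))   # False: fewer than 60 seconds since last save
print(snapshot_due(90, 1, rules))   # True: 60 seconds passed and a key changed
print(snapshot_due(600, 0, rules))  # False: nothing changed, nothing to save
```

Under this model, any write is captured by a snapshot about a minute after the previous save, so a restart loses at most the actions since then. Games that need a tighter bound would need a different mechanism, such as AOF.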