pluginagentmarketplace

monitoring

Game server monitoring with metrics, alerting, and performance tracking for production reliability

Install

npx skillscat add pluginagentmarketplace/custom-plugin-server-side-game-dev/monitoring

Install via the SkillsCat registry.

SKILL.md

Server Monitoring

Monitor game server health with metrics, logs, and alerts.

Key Game Metrics

const prometheus = require('prom-client');

// Player metrics
const activePlayers = new prometheus.Gauge({
  name: 'game_active_players',
  help: 'Currently connected players',
  labelNames: ['region', 'game_mode']
});

const matchesInProgress = new prometheus.Gauge({
  name: 'game_matches_active',
  help: 'Active matches',
  labelNames: ['game_mode']
});

// Performance metrics
const tickDuration = new prometheus.Histogram({
  name: 'game_tick_duration_seconds',
  help: 'Game loop tick duration',
  buckets: [0.001, 0.005, 0.01, 0.016, 0.033]
});

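const networkLatency = new prometheus.Histogram({
  name: 'game_network_latency_ms',
  help: 'Player network latency in milliseconds',
  labelNames: ['region'],
  // Buckets extend past the 200 ms alert threshold, because a quantile
  // estimate cannot exceed the largest finite bucket
  buckets: [10, 25, 50, 75, 100, 150, 200, 300, 500]
});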
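The histograms above only carry data once something observes values into them. A minimal sketch of timing each game-loop tick follows. `timeTick` and `updateWorld` are hypothetical names, not part of this skill. The histogram is passed in as an argument, so the helper works with any object that exposes prom-client's `startTimer()`.

```javascript
// Hypothetical helper: wraps one game-loop tick and records its duration.
// `histogram` is expected to behave like a prom-client Histogram, whose
// startTimer() returns a function that observes the elapsed seconds when called.
function timeTick(histogram, tickFn) {
  const end = histogram.startTimer();
  try {
    return tickFn();
  } finally {
    end(); // record the duration even if the tick throws
  }
}

// Usage sketch (assumes the tickDuration histogram defined above and a
// hypothetical updateWorld function that advances the simulation):
//   setInterval(() => timeTick(tickDuration, updateWorld), 1000 / 60);
```

Recording in a `finally` block keeps slow ticks that end in an exception visible in the histogram, which is often when the timing matters most.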

Alert Rules

groups:
- name: game-alerts
  rules:
  - alert: GameServerDown
    expr: up{job="game-servers"} == 0
    for: 1m
    labels:
      severity: critical

  - alert: HighTickLatency
    expr: histogram_quantile(0.99, sum by (le) (rate(game_tick_duration_seconds_bucket[5m]))) > 0.02
    for: 5m
    labels:
      severity: high

  - alert: LowPlayerCount
    expr: game_active_players < 10
    for: 10m
    labels:
      severity: warning
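The HighTickLatency rule relies on a quantile estimated from histogram buckets, so it helps to see how that estimate is formed. The sketch below approximates the interpolation Prometheus performs. It is an illustration rather than the actual implementation. `estimateQuantile` is a hypothetical name, and the input is assumed to be cumulative bucket counts with at least one finite bucket and `0 < q <= 1`.

```javascript
// Estimate a quantile from cumulative histogram buckets, roughly the way
// histogram_quantile() does: find the bucket containing the target rank,
// then interpolate linearly inside it.
// `buckets` is an array of { le, count } sorted by `le`, ending with
// le = Infinity, where `count` is cumulative (observations <= le).
function estimateQuantile(q, buckets) {
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;

  // First bucket whose cumulative count reaches the rank
  const i = buckets.findIndex((b) => b.count >= rank);

  // Rank falls in the +Inf bucket: the best available answer is the
  // highest finite upper bound.
  if (buckets[i].le === Infinity) return buckets[buckets.length - 2].le;

  const lower = i === 0 ? 0 : buckets[i - 1].le;
  const below = i === 0 ? 0 : buckets[i - 1].count;
  const inBucket = buckets[i].count - below;
  return lower + (buckets[i].le - lower) * ((rank - below) / inBucket);
}

// Example using the tick-duration bucket bounds from this document
const tickBuckets = [
  { le: 0.001, count: 0 },
  { le: 0.005, count: 10 },
  { le: 0.01, count: 50 },
  { le: 0.016, count: 90 },
  { le: 0.033, count: 100 },
  { le: Infinity, count: 100 },
];
const p99 = estimateQuantile(0.99, tickBuckets);
console.log(p99 > 0.02); // true: the alert condition would hold for this sample
```

One consequence of this scheme is that the estimate never exceeds the largest finite bucket bound, so a threshold above that bound can never be crossed.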

Target Thresholds

| Metric | Target | Alert |
|--------|--------|-------|
| Tick rate | 60 Hz | < 55 Hz |
| Latency (P99) | < 100 ms | > 200 ms |
| Memory | < 80% | > 90% |
| CPU | < 70% | > 85% |
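These thresholds can also be applied in code, for example in a health endpoint or a load-test report. The following is a minimal sketch under stated assumptions. The metric keys, the `classify` function, and the intermediate `degraded` state (a value between target and alert) are hypothetical and not defined by this skill.

```javascript
// Thresholds copied from the table above. `direction` says which way is bad.
const THRESHOLDS = {
  tickRateHz: { target: 60, alert: 55, direction: 'below' },
  latencyP99Ms: { target: 100, alert: 200, direction: 'above' },
  memoryPct: { target: 80, alert: 90, direction: 'above' },
  cpuPct: { target: 70, alert: 85, direction: 'above' },
};

// Classify one reading as 'ok', 'degraded' (between target and alert), or 'alert'.
function classify(metric, value) {
  const t = THRESHOLDS[metric];
  if (!t) throw new Error(`Unknown metric: ${metric}`);
  if (t.direction === 'below') {
    if (value < t.alert) return 'alert';
    return value >= t.target ? 'ok' : 'degraded';
  }
  if (value > t.alert) return 'alert';
  return value < t.target ? 'ok' : 'degraded';
}

// Convert a tick rate in Hz to the per-tick time budget in seconds.
function tickBudgetSeconds(hz) {
  return 1 / hz;
}

console.log(classify('tickRateHz', 57)); // 'degraded'
console.log(classify('latencyP99Ms', 250)); // 'alert'
```

Note that `tickBudgetSeconds(55)` is about 0.0182 s, slightly tighter than the 0.02 s used in the HighTickLatency rule, so the two thresholds may be worth aligning.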

Troubleshooting

Common Failure Modes

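| Error | Root Cause | Solution |
|-------|------------|----------|
| Missing metrics | Scrape failure | Check target health via `/api/v1/targets` |
| Alert storms | Thresholds too sensitive | Raise thresholds or lengthen the `for:` duration |
| Dashboard slow | Too many expensive queries | Pre-aggregate with recording rules |
| Gaps in data | Network issues between Prometheus and targets | Add a redundant Prometheus instance |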

Debug Checklist

# Check Prometheus targets
curl localhost:9090/api/v1/targets | jq '.data.activeTargets'

# Check firing alerts
curl localhost:9090/api/v1/alerts | jq '.data.alerts'

# Query metrics
curl 'localhost:9090/api/v1/query?query=game_active_players'
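The first check can be automated when the target list is long. A minimal sketch, assuming the standard response shape of the Prometheus `/api/v1/targets` endpoint. `unhealthyTargets` is a hypothetical helper name.

```javascript
// Extract targets that are not healthy from a /api/v1/targets response body.
// Each active target carries a `health` field ('up', 'down', or 'unknown')
// and a `lastError` string describing the most recent scrape failure.
function unhealthyTargets(body) {
  const targets = (body && body.data && body.data.activeTargets) || [];
  return targets
    .filter((t) => t.health !== 'up')
    .map((t) => ({
      job: t.labels ? t.labels.job : undefined,
      scrapeUrl: t.scrapeUrl,
      health: t.health,
      lastError: t.lastError,
    }));
}

// Usage sketch (Node.js 18+ for the global fetch; same host and port as the
// curl commands above), inside an async function:
//   const res = await fetch('http://localhost:9090/api/v1/targets');
//   console.table(unhealthyTargets(await res.json()));
```

Keeping the filter separate from the HTTP call makes it easy to test against a saved response.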

Unit Test Template

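const prometheus = require('prom-client');
// Assumes the metrics defined above are exported from ./metrics (hypothetical path)
const { tickDuration } = require('./metrics');

// Promise-based delay; `sleep` is not a Node.js built-in
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

describe('Metrics', () => {
  test('records tick duration', async () => {
    const end = tickDuration.startTimer();
    await sleep(10);
    end(); // observes the elapsed time, in seconds, into the histogram

    const metrics = await prometheus.register.metrics();
    expect(metrics).toContain('game_tick_duration_seconds');
  });
});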

Resources

  • assets/ - Dashboard configs
  • references/ - Alerting guides