infrastructure-monitor

Set up monitoring, logging, and alerting for infrastructure and applications. Use when implementing observability, creating dashboards, or configuring alerts.

armanzeroeight 29 7 Updated 8mo ago

GitHub

Install

npx skillscat add armanzeroeight/fastagent-plugins/infrastructure-monitor

Install via the SkillsCat registry.

SKILL.md

Infrastructure Monitor

Set up comprehensive monitoring and observability.

Quick Start

Use Prometheus for metrics, Grafana for dashboards, Loki for logs, set up alerts for critical issues.

Instructions

Metrics with Prometheus

Application instrumentation:

const prometheus = require('prom-client');

const httpRequestDuration = new prometheus.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code']
});

app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    httpRequestDuration.labels(req.method, req.route?.path, res.statusCode).observe(duration);
  });
  next();
});

Prometheus config:

scrape_configs:
  - job_name: 'app'
    static_configs:
      - targets: ['app:3000']
    scrape_interval: 15s

Dashboards with Grafana

Key metrics to monitor:

Request rate (requests/second)
Error rate (errors/total requests)
Response time (p50, p95, p99)
CPU and memory usage
Database query time

Logging with Loki

Structured logging:

const winston = require('winston');

const logger = winston.createLogger({
  format: winston.format.json(),
  transports: [
    new winston.transports.Console()
  ]
});

logger.info('User logged in', { userId: user.id, ip: req.ip });

Alerting

Alert rules:

groups:
  - name: app_alerts
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
        for: 5m
        annotations:
          summary: "High error rate detected"

Best Practices

Monitor golden signals (latency, traffic, errors, saturation)
Set up actionable alerts
Use log aggregation
Implement distributed tracing
Create runbooks for alerts
Regular dashboard reviews

infrastructure-monitor

Install

Infrastructure Monitor

Quick Start

Instructions

Metrics with Prometheus

Dashboards with Grafana

Logging with Loki

Alerting

Best Practices

Categories

Install

Recommended Skills