dmonteroh

database-cost-optimization

"Reduce database infrastructure spend when costs need optimization by analyzing cost drivers, right-sizing compute/storage/replicas, and proposing verified rollback-ready changes without compromising reliability."

dmonteroh 1 Updated 3mo ago

Install

npx skillscat add dmonteroh/curated-agent-skills/database-cost-optimization

Install via the SkillsCat registry.

SKILL.md

database-cost-optimization

Provides guidance to reduce database spend while protecting performance and reliability.

Use this skill when

  • Right-sizing database instances, storage, or connection pools.
  • Reducing backup/retention costs with clear recovery requirements.
  • Evaluating read replicas, HA posture, or IO provisioning costs.
  • Investigating costly queries driving CPU or IO spend.

Do not use this skill when

  • The system is in active incident response.
  • No cost or utilization signals are available and none can be estimated.

Required inputs

  • Database engine and deployment model (managed/self-hosted, region).
  • Current topology (primary/replicas, storage class, backup retention).
  • At least one signal: cost allocation, utilization metrics, or query profile.
  • Reliability requirements (RPO/RTO, HA/SLA, peak windows).

If required inputs are missing, the skill requests them before proceeding.
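The input check above can be sketched as a small validation step. This is a minimal illustration, not part of the skill itself: the `SkillInputs` class, its field names, and the signal keys are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical container for the required inputs listed above.
@dataclass
class SkillInputs:
    engine: str = ""
    deployment_model: str = ""   # e.g. "managed, us-east-1"
    topology: str = ""           # primary/replicas, storage class, backup retention
    reliability: str = ""        # RPO/RTO, HA/SLA, peak windows
    signals: dict = field(default_factory=dict)

# Assumed keys for the three accepted signal types.
SIGNAL_KEYS = ("cost_allocation", "utilization_metrics", "query_profile")

def missing_inputs(inputs: SkillInputs) -> list[str]:
    """Return the names of required inputs that still need to be requested."""
    missing = [
        name
        for name in ("engine", "deployment_model", "topology", "reliability")
        if not getattr(inputs, name).strip()
    ]
    # At least one non-empty signal is enough to proceed.
    if not any(inputs.signals.get(key) for key in SIGNAL_KEYS):
        missing.append("at least one signal (" + ", ".join(SIGNAL_KEYS) + ")")
    return missing
```

An empty return value means the workflow can start; anything else is the list to request from the user before proceeding.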

Workflow

  1. Confirm goals and constraints.
    • Output: target savings range, non-negotiable reliability constraints.
  2. Build a baseline from available signals.
    • Output: baseline table with cost, utilization, storage growth, and peak load.
    • Decision: if baseline data is insufficient to estimate impact, request more data and pause.
  3. Identify primary cost drivers.
    • Output: ranked list of compute, storage, IO, and replica drivers with evidence.
  4. Generate candidate levers by risk tier.
    • Output: low/medium/high-risk candidate actions tied to a driver.
    • Decision: if a lever affects RPO/RTO or peak traffic, mark as high-risk and require rollout gating.
  5. Estimate savings and risk for each lever.
    • Output: savings range, assumptions, and risk classification per change.
  6. Define rollout and verification gates.
    • Output: staged rollout plan, metrics to watch, rollback criteria.
  7. Deliver the final report.
    • Output: recommendations with savings, risks, and verification steps.
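The decision rule in step 4 and the ordering of levers by risk tier can be sketched as follows. The `Lever` class, its fields, and the choice to order by tier and then by midpoint savings are assumptions made for illustration; the workflow only states that levers touching RPO/RTO or peak traffic become high-risk and need rollout gating.

```python
from dataclasses import dataclass

# Hypothetical representation of one candidate lever.
@dataclass
class Lever:
    name: str
    driver: str
    base_risk: str                     # "low", "medium", or "high" before the step 4 rule
    affects_rpo_rto: bool = False
    affects_peak_traffic: bool = False
    savings_low: float = 0.0           # monthly savings range, in dollars
    savings_high: float = 0.0

TIER_ORDER = {"low": 0, "medium": 1, "high": 2}

def risk_tier(lever: Lever) -> str:
    """Apply the step 4 rule: RPO/RTO or peak-traffic impact forces high risk."""
    if lever.affects_rpo_rto or lever.affects_peak_traffic:
        return "high"
    return lever.base_risk

def plan(levers: list[Lever]) -> list[tuple[str, str, bool]]:
    """Return (name, tier, requires_gating), lowest risk first.

    Within a tier, larger midpoint savings come first (an assumed tie-break).
    """
    tiered = [(risk_tier(lever), lever) for lever in levers]
    tiered.sort(
        key=lambda pair: (
            TIER_ORDER[pair[0]],
            -(pair[1].savings_low + pair[1].savings_high) / 2,
        )
    )
    return [(lever.name, tier, tier == "high") for tier, lever in tiered]
```

Listing low-risk levers first is one possible presentation; a report could equally rank by savings and show the tier as a column.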

Common pitfalls

  • Downscaling without validating peak utilization and burst patterns.
  • Reducing retention without mapping legal or recovery requirements.
  • Removing replicas without confirming read traffic and failover needs.
  • Optimizing queries without verifying index/storage impact.
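The first pitfall can be guarded against with a quick headroom estimate before proposing a downsize. The sketch below assumes CPU demand stays constant and utilization scales linearly with vCPU count, which is a simplification; the 70% ceiling is a hypothetical default, not a value from this skill.

```python
def projected_peak_utilization(peak_util_pct: float,
                               current_vcpus: int,
                               target_vcpus: int) -> float:
    """Estimate peak CPU utilization after resizing, assuming linear scaling."""
    if current_vcpus <= 0 or target_vcpus <= 0:
        raise ValueError("vCPU counts must be positive")
    return peak_util_pct * current_vcpus / target_vcpus

def safe_to_downsize(peak_util_pct: float,
                     current_vcpus: int,
                     target_vcpus: int,
                     ceiling_pct: float = 70.0) -> bool:
    """True if the projected peak stays at or below the chosen ceiling."""
    projected = projected_peak_utilization(peak_util_pct, current_vcpus, target_vcpus)
    return projected <= ceiling_pct
```

Feed this the observed peak (including bursts), not an average or a p95 alone, since the pitfall is precisely about load that averages hide.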

Examples

Example output (excerpt)

DB Cost Optimization Report
Baseline: $18.2k/mo, CPU p95 42%, storage +9%/mo
Top drivers: oversized primary, unused read replica, long retention
Recommendation 1: downsize primary (savings $2.5k–$3.2k, medium risk)
Verification: canary 10%, watch p95 latency < 50ms, rollback if > 65ms
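The verification line in the excerpt can be turned into a simple gate on canary latency. This is a sketch under stated assumptions: the nearest-rank p95 is one of several percentile definitions, and the "hold" outcome for values between the two thresholds is an interpretation, since the excerpt only specifies the 50 ms target and the 65 ms rollback trigger.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of latency samples in milliseconds."""
    if not samples_ms:
        raise ValueError("no latency samples")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def gate_decision(p95_ms: float,
                  target_ms: float = 50.0,
                  rollback_ms: float = 65.0) -> str:
    """Map canary p95 latency to an action using the thresholds from the excerpt."""
    if p95_ms > rollback_ms:
        return "rollback"
    if p95_ms < target_ms:
        return "proceed"
    # Assumed behavior for the band between the thresholds: keep watching.
    return "hold"
```

For example, `gate_decision(p95(canary_samples))` would be evaluated at each stage of the rollout before widening the canary beyond 10%.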

Output contract

Produces a report with these sections and a consistent format:

DB Cost Optimization Report

Context:
- Goal:
- Constraints:

Baseline:
- Monthly cost:
- Utilization summary:
- Storage growth:
- Peak window:

Cost Drivers (ranked):
- Driver: evidence

Recommendations:
1) Change:
   Driver:
   Expected savings (range):
   Risk level:
   Verification gates:
   Rollback plan:
   Assumptions:

Open Questions:
- ...

Next Steps:
- ...
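One way to keep the format consistent is to render the report from structured data. The sketch below mirrors the section names of the template above; the `Recommendation` and `Report` classes, their field names, and the "(none)" placeholder for empty lists are hypothetical choices for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    change: str
    driver: str
    savings_range: str
    risk_level: str
    verification_gates: str
    rollback_plan: str
    assumptions: str

@dataclass
class Report:
    goal: str
    constraints: str
    monthly_cost: str
    utilization: str
    storage_growth: str
    peak_window: str
    drivers: list = field(default_factory=list)          # ranked (driver, evidence) pairs
    recommendations: list = field(default_factory=list)  # Recommendation objects
    open_questions: list = field(default_factory=list)
    next_steps: list = field(default_factory=list)

def _bullets(items: list) -> list:
    # Assumed placeholder when a section has no entries.
    return [f"- {item}" for item in items] or ["- (none)"]

def render(report: Report) -> str:
    """Render the report in the section order defined by the output contract."""
    lines = [
        "DB Cost Optimization Report", "",
        "Context:",
        f"- Goal: {report.goal}",
        f"- Constraints: {report.constraints}", "",
        "Baseline:",
        f"- Monthly cost: {report.monthly_cost}",
        f"- Utilization summary: {report.utilization}",
        f"- Storage growth: {report.storage_growth}",
        f"- Peak window: {report.peak_window}", "",
        "Cost Drivers (ranked):",
    ]
    lines += _bullets([f"{driver}: {evidence}" for driver, evidence in report.drivers])
    lines += ["", "Recommendations:"]
    for number, rec in enumerate(report.recommendations, start=1):
        lines += [
            f"{number}) Change: {rec.change}",
            f"   Driver: {rec.driver}",
            f"   Expected savings (range): {rec.savings_range}",
            f"   Risk level: {rec.risk_level}",
            f"   Verification gates: {rec.verification_gates}",
            f"   Rollback plan: {rec.rollback_plan}",
            f"   Assumptions: {rec.assumptions}",
        ]
    lines += ["", "Open Questions:"] + _bullets(report.open_questions)
    lines += ["", "Next Steps:"] + _bullets(report.next_steps)
    return "\n".join(lines)
```

Generating the text from one function means every report carries the same sections in the same order, even when a section is empty.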

References

See references/README.md for detailed checklists and lever guidance.
