backend-architect

Backend architecture expert for API design, database modeling, auth patterns, caching, and system design. Trigger: When designing APIs, choosing databases, planning auth systems, scaling backends, or making architecture decisions.

kozz36 0 Updated 2mo ago

GitHub

Install

npx skillscat add kozz36/backend-architect

Install via the SkillsCat registry.

SKILL.md

When to Use

Designing a new API (REST, GraphQL, gRPC, tRPC)
Choosing a database, ORM, or sync engine for a project
Planning authentication and authorization systems
Scaling a backend or deciding between monolith and microservices
Implementing caching, background jobs, or event-driven patterns
Setting up observability, health checks, or load testing
Designing AI/vector infrastructure or agentic orchestration
Planning local-first or edge-synchronized architectures
Any architectural decision with long-term tradeoffs

1. Framework Selection

Decision Matrix

Scenario	Framework	Reason
Python — general API	FastAPI	Async, OpenAPI auto-gen, Pydantic validation layers
Python — high performance	Litestar	Faster than FastAPI, stricter typing
Python — content/admin heavy	Django + DRF	ORM, admin, batteries included
Node.js — general API	Fastify	Fast, schema-based, great plugin ecosystem
Node.js — edge / serverless	Hono	Ultra-lightweight, multi-runtime
Node.js — Bun runtime	Elysia	Native Bun, end-to-end types
Infrastructure / CLI / binary	Go (Echo/Chi)	Single binary, low memory, extreme concurrency
Cost-critical at massive scale	Rust (Axum)	Max performance, zero GC pauses, deterministic memory safety

Team-size rule

Solo / startup → FastAPI or Fastify (fast iteration)
Team with TypeScript frontend → Fastify + tRPC (shared types)
Team preferring Go → Echo or Chi for REST, gRPC for internal services
Mixed team → FastAPI (Python ubiquity) or Fastify (JS familiarity)

2. Database Architecture

Default Choices

Need	Choice
Primary relational store	PostgreSQL — JSONB, arrays, full-text search, extensions
Embedded / edge / serverless	SQLite — with Litestream (replication), rqlite (Raft), or Turso (distributed libSQL)
Cache / queue / pub-sub / multi-tier memory	Redis
Document store (justified)	MongoDB — only when schema is genuinely variable
Time-series	TimescaleDB (PostgreSQL extension) or InfluxDB
Vector search / semantic retrieval	pgvector (PostgreSQL extension) or Pinecone / Milvus
Local-first sync engine	PowerSync, Turso, ElectricSQL, or Replicache

PostgreSQL as Universal Store

PostgreSQL now serves transactional, relational, analytical, AND semantic workloads:

JSONB: schemaless patterns inside ACID transactions
Full-text search: native tsvector/tsquery
pgvector: first-class vector support (cosine distance, inner product / dot product, HNSW/IVF indexing)
TimescaleDB: hypertables for time-series data

Use pgvector as the default vector strategy for teams already operating massive PostgreSQL estates. Eliminates data synchronization overhead with external vector stores.

SQLite Renaissance (2024–2026)

SQLite is now production-viable for distributed and edge workloads:

Litestream: streams WAL to S3/GCS — free disaster recovery
Turso: libSQL fork, edge-deployed, HTTP API, per-tenant DBs, embedded read replicas at 0ms latency
rqlite: clustered SQLite with Raft — replicated, lightweight
PowerSync: bidirectional delta sync between PostgreSQL central and local SQLite via persistent WebSockets

Use when: single-region, low-concurrency writes, cost-sensitive, per-tenant isolation, or local-first architectures.

Local-First & Edge Synchronization

Sync Engine	Architecture	Best For
PowerSync	PostgreSQL ↔ SQLite via WebSocket delta sync, automatic conflict resolution	Complex apps requiring strict referential integrity on client and server
Turso	libSQL edge deployment with embedded read replicas, branching workflows	Portability and 0ms read latency at the edge
ElectricSQL	Active sync from PostgreSQL to local clients, deterministic conflict resolution	Rapid prototyping and migration of existing Postgres apps to offline-first
Replicache	In-memory transactional store over IndexedDB, explicit sync control, undo/redo	Document editors, collaboration tools (limit ~100MB data)

Vector Databases & RAG Infrastructure

Solution	Deployment	Best For
pgvector	PostgreSQL native extension	Teams already on Postgres; ACID semantic + relational in one store
Pinecone	Cloud-native, serverless SaaS	Enterprise speed-to-market; zero infrastructure maintenance
Milvus	Open-source, Kubernetes-native	Massive-scale similarity search (image search, bioinformatics)
Weaviate	Open-source, schema-flexible with optional strict mode	Multimodal apps combining semantic + lexical search
Chroma	Lightweight, embedded or client-server	Rapid LLM prototyping, LangChain integration

Key algorithms: HNSW (probabilistic multilayer graph navigation), IVF (inverted file index), Product Quantization (memory compression).

Dimensionality tradeoff: 1536-dim (OpenAI text-embedding-3-small) for deep semantic accuracy; 384-dim (all-MiniLM-L6-v2) trades semantic depth for significantly lower latency and memory. Actual speedup depends on index parameters — benchmark with your dataset.

ORM vs Raw SQL

Schema-first, complex queries → Raw SQL (psycopg3, asyncpg) — MUST use parameterized statements: text("... WHERE id = :id").bindparams(id=user_id)
Rapid prototyping, teams      → ORM
Mixed (recommended)           → ORM for CRUD, raw SQL for reports/analytics

ORM Choices

Ecosystem	ORM	Notes
Python	SQLAlchemy 2.0	Mature, async support, unit-of-work, Core expressions
Python (FastAPI)	SQLModel	SQLAlchemy + Pydantic, less boilerplate
Node.js (TypeScript)	Drizzle	Type-safe, SQL-like syntax, no magic, migration lifecycle
Node.js (any)	Prisma 7	Schema-first, migrations, great DX

Patterns

Repository Pattern: abstracts data access; inject via DI for testability
Unit of Work: wraps multiple repos in one transaction (SQLAlchemy Session is UoW)
CQRS Light: separate read models (optimized queries) from write models (ORM entities); no full event sourcing needed for most apps

# Repository pattern with SQLAlchemy 2.0
# WARNING: async sessions should set expire_on_commit=False and return DTOs
# to avoid DetachedInstanceError when accessing lazy-loaded attributes outside session scope
class UserRepository:
    def __init__(self, session: AsyncSession):
        self.session = session

    async def get_by_id(self, user_id: UUID) -> User | None:
        result = await self.session.get(User, user_id)
        return result

    async def get_by_email(self, email: str) -> User | None:
        stmt = select(User).where(User.email == email)
        result = await self.session.execute(stmt)
        return result.scalar_one_or_none()

3. API Design

Protocol Decision Matrix

Scenario	Protocol
Public API, third-party consumers	REST + OpenAPI
Frontend-only, TypeScript stack	tRPC (end-to-end types, no schema drift)
Flexible queries, multiple clients	GraphQL (Apollo / Strawberry)
Internal microservice communication	gRPC (protobuf, streaming, low latency)
Real-time / streaming	WebSocket or SSE

REST Best Practices

Versioning: URL-based wins in practice (/api/v1/users) — header versioning hurts caching and tooling.

Pagination:

Offset (?page=2&limit=20): simple, but expensive on large datasets (OFFSET scans)
Cursor (?after=eyJpZCI6MTAwfQ): scalable, consistent under mutations — prefer for feeds and large tables

Rate Limiting: token bucket or sliding window. Headers: RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset (IETF standard).

Webhook Design:

# Idempotency: include event ID, receiver deduplicates
# HMAC signing: sign payload with shared secret
import hmac, hashlib

def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)

# Retry: exponential backoff (1s, 5s, 30s, 5m, 30m)
# Delivery guarantee: at-least-once — always include event_id for dedup

OpenAPI

FastAPI generates OpenAPI automatically. For other frameworks:

Litestar: built-in OpenAPI
Fastify: @fastify/swagger + @fastify/swagger-ui
Go: swaggo/swag or oapi-codegen

Always include: request/response schemas, error responses (400, 401, 403, 404, 422, 500), auth scheme.

Backend for Frontend (BFF)

The BFF pattern provisions a dedicated backend per client experience (iOS, Web, Android) rather than forcing a single universal API.

BFF Responsibilities:

Aggregating data from dozens of downstream microservices
Translating protocols (e.g., internal gRPC → public JSON)
Hiding internal data schemas from frontend clients (security posture)
Reducing payload size for mobile (bandwidth/battery optimization)

Anti-pattern warning: BFF must remain spartan. Do NOT absorb domain validations or heavy transactional logic — that creates an accidental integration monolith.

Best practice: Write BFF in the same stack as the frontend team (typically TypeScript/Node.js with Fastify/Hono for web clients), host in the same monorepo, and assign ownership to the frontend team. Mobile BFFs (iOS/Android) are usually shared or language-agnostic. Reserve deep microservice design for pure backend engineers.

Protocol shift: Modern BFF increasingly moves from passive HTTP request/response to persistent WebSockets and GraphQL subscriptions for real-time state propagation (inventory, stocks, AI reasoning streams).

4. Auth & Authorization

JWT Patterns

Access token:  short-lived (15 min), stateless, signed
Refresh token: long-lived (7–30 days), stored in DB, rotated on use

Signing algorithm priority: EdDSA (Ed25519) > ES256 (ECDSA P-256) > RS256 (RSA 2048) > HS256 (shared secret — avoid for multi-service)

# FastAPI JWT pattern — HMAC (default, safest for single-service)
from datetime import datetime, timedelta, UTC
import jwt, os, secrets

SECRET_KEY = os.environ["JWT_SECRET_KEY"]  # 32+ bytes for HS256
ALGORITHM = "HS256"

def create_access_token(subject: str) -> str:
    expire = datetime.now(UTC) + timedelta(minutes=15)
    return jwt.encode({"sub": subject, "exp": expire, "aud": "my-api", "iss": "auth-service"}, SECRET_KEY, algorithm=ALGORITHM)

def create_refresh_token(subject: str) -> dict:
    expire = datetime.now(UTC) + timedelta(days=30)
    token = secrets.token_urlsafe(32)
    # Store hash(token) + expiry in DB
    return {"token": token, "expires_at": expire.isoformat()}

EdDSA (Ed25519) for multi-service: Requires PyJWT[crypto] + cryptography. Load PEM key or generate Ed25519PrivateKey object. Sign with jwt.encode(payload, private_key, algorithm="EdDSA").

Refresh token rotation: invalidate old token on use, issue new pair. Track family for breach detection (if rotated token is reused → family was stolen → revoke all).

Auth Protocols

Scenario	Choice
First-party app (SPA/mobile)	OAuth2 + PKCE or JWT (access + refresh)
First-party app (web app)	Sessions (Redis store) or JWT
Third-party OAuth (Google, GitHub)	OAuth2 + PKCE (Authorization Code flow)
Machine-to-machine / S2S	API keys (hashed in DB) or client_credentials OAuth2
Enterprise SSO	SAML or OIDC
Passwordless 2025	Passkeys (WebAuthn) — phishing-resistant, UX win

Sessions vs Tokens

Sessions (server-side):
  + Instant revocation
  + No token size limit
  - Requires shared session store (Redis) for multi-instance
  - Stateful

JWTs (stateless):
  + No shared store needed
  + Works across services
  - Revocation requires denylist (Redis) or short TTL
  - Token size grows with claims

Use sessions for: monolith, traditional web apps, admin panels.
Use JWTs for: microservices, mobile apps, SPAs, multi-tenant APIs.

Authorization

RBAC (Role-Based): roles → permissions. Simple, audit-friendly. Use for most apps.

ABAC (Attribute-Based): policies evaluate attributes (user, resource, environment). Use for complex multi-tenant or compliance-heavy systems.

# FastAPI RBAC dependency — supports multiple roles per user
from fastapi import Depends, HTTPException, status

def require_any_role(*roles: str):
    def dependency(current_user: User = Depends(get_current_user)):
        if not any(r in current_user.roles for r in roles):
            raise HTTPException(status_code=status.HTTP_403_FORBIDDEN)
        return current_user
    return dependency

# Usage
@router.delete("/users/{id}", dependencies=[Depends(require_any_role("admin", "superuser"))])
async def delete_user(id: UUID): ...

Zero-Trust for Autonomous AI

Sandboxed execution environments for agent code
Dynamic credential delegation with least-privilege access
Strictly scoped JWTs for AI-to-API interactions
Audit all agent actions via structured logging

5. Caching

Default Stack

Redis — primary cache, session store, pub/sub, rate limiter, job queue, and multi-tier memory for agent state.

Connection pooling: always use a pool. FastAPI + Redis: redis.asyncio.ConnectionPool with max_connections=20.

Patterns

Pattern	When
Cache-Aside (Lazy)	Read-heavy, tolerate stale data. Check cache → miss → read DB → populate cache
Write-Through	Write to cache and DB together. Consistent but write latency
Write-Behind	Write to cache, async flush to DB. High write throughput, risk of loss
Read-Through	Cache fetches from DB on miss transparently (library handles it)

Invalidation Strategies

TTL-based:        simplest, stale within window — good for reference data
Event-driven:     publish cache-invalidation event on write — consistent, complex
Tag-based:        tag cache keys by entity, flush by tag — good for related data

# Cache-Aside pattern with FastAPI + Redis
# NOTE: UserDTO is a Pydantic model (serialization layer). Never cache SQLAlchemy ORM objects directly.
import json

async def get_user(user_id: UUID, redis: Redis, db: AsyncSession) -> UserDTO:
    cache_key = f"user:{user_id}"
    cached = await redis.get(cache_key)
    if cached:
        return UserDTO.model_validate_json(cached)

    user = await UserRepository(db).get_by_id(user_id)
    if user:
        dto = UserDTO.model_validate(user)
        await redis.setex(cache_key, 300, dto.model_dump_json())  # TTL: 5 min
    return user

6. AI & Agentic Orchestration

Vector Search Implementation (pgvector)

-- Enable extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Table with embeddings
CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding VECTOR(1536),  -- OpenAI text-embedding-3-small default
    metadata JSONB
);

-- HNSW index for approximate nearest neighbor search
CREATE INDEX idx_documents_embedding ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Semantic search query
-- NOTE: <=> returns cosine DISTANCE (0 = identical, 2 = opposite).
-- Cosine similarity of normalized vectors is in [-1, 1].
-- Use 1 - distance only if you know vectors are normalized; otherwise use distance directly.
-- If vectors aren't normalized, use distance directly and ORDER BY ascending.
SELECT id, content, 1 - (embedding <=> query_embedding) AS similarity
FROM documents
WHERE metadata @> '{"category": "technical"}'
ORDER BY embedding <=> query_embedding
LIMIT 10;

Agent Orchestration Frameworks

Framework	Paradigm	Best For
LangGraph	Stateful graph workflows in Python/TypeScript	Auditable, predictable multi-step LLM pipelines with recoverable memory
Microsoft AutoGen	Multi-agent collaborative dialogue (Python/.NET)	Enterprise production: error handling, distributed gRPC, OpenTelemetry
CrewAI	Role-driven agent teams (Python)	Rapid prototyping of hierarchical business workflows
Semantic Kernel	Dynamic function binding (.NET/Python)	Azure-native apps with deep semantic memory and deterministic planners

Redis as Multi-Tier Agent Memory

Redis has evolved from peripheral cache to the central nervous system of agent architectures:

Conversation threading: sub-millisecond context retrieval across distributed agents
State compaction: long-term/short-term memory hierarchy management
Vector search: sub-millisecond knowledge base retrieval impacting reasoning coherence
Volatile state management: real-time mutation across agent swarms

7. Architecture Patterns

Monolith First (consensus 2024–2026)

Teams under 20 engineers: start with a monolith. Microservices add operational complexity that kills small teams.

< 20 engineers    → Monolith (deploy as one unit)
< 5 services      → Modular monolith (internal modules, shared DB)
> 5 services, clear bounded contexts → Microservices (justify each split)
> 50 engineers, org boundaries → Microservices (Conway's Law)

Modular Monolith Structure

src/
├── users/          # bounded context
│   ├── domain/     # entities, value objects
│   ├── app/        # use cases, services
│   ├── infra/      # repository impl, external calls
│   └── api/        # routes, schemas
├── billing/
├── notifications/
└── shared/         # cross-cutting (auth, db, config)

Modules communicate via interfaces (not direct imports across layers). This allows future extraction to services.

Clean / Hexagonal Architecture

Domain (entities, value objects) — no framework dependencies
Application (use cases) — depends on domain + ports (interfaces)
Infrastructure (adapters) — implements ports: DB, HTTP, queue
API (delivery) — FastAPI routes call application use cases

# Port (interface) in application layer
from abc import ABC, abstractmethod

class UserRepository(ABC):
    @abstractmethod
    async def get_by_id(self, user_id: UUID) -> User | None: ...
    @abstractmethod
    async def save(self, user: User) -> None: ...

# Adapter (implementation) in infrastructure layer
class PostgresUserRepository(UserRepository):
    def __init__(self, session: AsyncSession):
        self.session = session

    async def get_by_id(self, user_id: UUID) -> User | None:
        return await self.session.get(UserModel, user_id)

Event-Driven

Tool	Use When
Redis Streams	Simple pub/sub, same infra as cache, at-least-once via consumer groups and XACK
NATS	Lightweight, JetStream for persistence, fast fan-out
RabbitMQ	Complex routing, existing AMQP ecosystem
Kafka	High throughput, event log, replay, analytics pipeline

Rule: start with Redis Streams. Migrate to Kafka when you need replay, audit log, or > 100k msg/s.

Background Jobs & Resilient Workflows

Tool	Ecosystem	Use When
Dramatiq	Python	Preferred over Celery: simpler, reliable, middleware-based
ARQ	Python	Async-native, Redis-backed, lightweight
BullMQ	Node.js	Redis-backed, delayed jobs, rate limiting built-in
Temporal	Any	Long-running workflows, sagas, crash-proof execution
Inngest	Node.js	Serverless-friendly, event-driven functions

Temporal: Represents a qualitative leap for mission-critical systems. It abstracts state management and retry routing, promising "crash-proof execution" — workflows resume via deterministic replay from event history, not by serializing native stack frames. This survives outages from seconds to months.

# Dramatiq example (Python)
import dramatiq
from dramatiq.brokers.redis import RedisBroker

broker = RedisBroker(host="localhost", port=6379, db=0)
dramatiq.set_broker(broker)

@dramatiq.actor(max_retries=3, min_backoff=1000)
def send_welcome_email(user_id: str) -> None:
    user = get_user(user_id)
    email_service.send(user.email, "Welcome!")

8. Testing

Test Pyramid

Unit tests (70%)      — domain logic, pure functions, no I/O
Integration tests (20%) — real DB, real Redis, Testcontainers
E2E / load tests (10%)  — full stack, production-like

Rule: mock external HTTP calls (httpx mock, respx), but use REAL databases in integration tests. Mocking Postgres is lying to yourself.

Testcontainers

# pytest + testcontainers for real PostgreSQL
import pytest, pytest_asyncio
from testcontainers.postgres import PostgresContainer
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy import URL

@pytest.fixture(scope="session")
def postgres():
    with PostgresContainer("postgres:17-alpine") as pg:
        yield pg

@pytest_asyncio.fixture(scope="session")
async def engine(postgres):
    # Build URL safely instead of naive string replacement
    url = URL.create(
        drivername="postgresql+asyncpg",
        username=postgres.username,
        password=postgres.password,
        host=postgres.get_container_host_ip(),
        port=postgres.get_exposed_port(5432),
        database=postgres.dbname,
    )
    engine = create_async_engine(url)
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)
    yield engine
    await engine.dispose()

Factory Pattern

Use factories (not fixtures with hardcoded data) to build test objects:

# factory-boy example
from uuid import uuid4
import factory
from factory.alchemy import SQLAlchemyModelFactory

class UserFactory(SQLAlchemyModelFactory):
    class Meta:
        model = User

    id = factory.LazyFunction(uuid4)
    email = factory.Sequence(lambda n: f"user{n}@example.com")
    name = factory.Faker("name")
    role = "user"

Load Testing & Chaos Engineering

k6 is the default. Scripted in JS/TS, integrates with CI, outputs InfluxDB/Prometheus metrics.

// k6 smoke test
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = { vus: 10, duration: '30s' };

export default function () {
  const res = http.get('http://localhost:8000/api/v1/users');
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}

k6 disciplines: smoke testing, stress testing (breakpoint detection), soak testing (memory leak detection), spike testing, browser emulation (POM), and Kubernetes chaos injection (xk6-disruptor).

9. Platform Engineering & Observability

Infrastructure as Code (IaC)

Layer	Technology	Purpose
Provisioning	Terraform / OpenTofu	Immutable cloud topology, GPU instance management
GitOps / CD	Argo CD / GitLab CI	Git as single source of truth; automatic drift reconciliation
Secrets Governance	Infisical / ESO	Dynamic injection, rotation, audit; zero `.env` files in runtime
Observability	OpenTelemetry + LGTM	Distributed tracing, metrics, logs, visualization

Golden rule: ALL topology is defined, reviewed, regression-tested, and deployed as declarative code.

Observability Stack

Concern	Tool
Tracing + metrics + logs	OpenTelemetry (vendor-neutral, OTEL SDK)
Structured logging Python	structlog
Structured logging Node.js	pino
Metrics scraping	Prometheus
Error tracking	Sentry
Visualization	Grafana
Log aggregation	Loki
Trace backend	Tempo
Metrics backend	Mimir

OpenTelemetry Setup (FastAPI)

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.instrumentation.sqlalchemy import SQLAlchemyInstrumentor

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

FastAPIInstrumentor.instrument_app(app)
SQLAlchemyInstrumentor().instrument(engine=engine)

Structured Logging

import structlog

log = structlog.get_logger()

# Context is carried through the request
log.info("user.created", user_id=str(user.id), email=user.email, role=user.role)
# Output: {"event": "user.created", "user_id": "...", "email": "...", "role": "user", "timestamp": "..."}

Health Check Endpoints

@app.get("/health/live")
async def liveness():
    """Kubernetes liveness probe — is the process alive?"""
    return {"status": "ok"}

@app.get("/health/ready")
async def readiness(db: AsyncSession = Depends(get_db), redis: Redis = Depends(get_redis)):
    """Kubernetes readiness probe — can it serve traffic?"""
    status = {"status": "ok"}
    try:
        await db.execute(text("SELECT 1"))
        status["db"] = "ok"
    except Exception:
        raise HTTPException(status_code=503, detail="database unavailable")
    try:
        await redis.ping()
        status["cache"] = "ok"
    except Exception:
        # Redis is a soft dependency — log but don't fail readiness
        status["cache"] = "degraded"
    return status

10. Decision Framework

Full architecture flowchart:

1. LANGUAGE
   ├── Python team?        → FastAPI (default) / Litestar (perf) / Django (content)
   ├── JS/TS team?         → Fastify (default) / Hono (edge) / Elysia (Bun)
   ├── Infra / binary?     → Go
   └── Max perf / cost?    → Rust (Axum)

2. TEAM SIZE
   ├── < 20 engineers      → Monolith or Modular Monolith
   └── > 20, org boundaries → Microservices (justify each split)

3. API CONSUMERS
   ├── Public / third-party → REST + OpenAPI
   ├── Own TS frontend only → tRPC
   ├── Multiple varied clients → GraphQL
   └── Internal services → gRPC

4. DATABASE
   ├── Default             → PostgreSQL
   ├── Edge / embedded     → SQLite + Litestream or Turso
   ├── Vector / AI search  → pgvector (existing PG) or Pinecone (new)
   ├── Local-first sync    → PowerSync or Turso
   ├── Variable schema     → MongoDB (justify)
   └── Time-series         → TimescaleDB

5. ORM
   ├── Python + FastAPI    → SQLModel (simple) / SQLAlchemy 2.0 (complex)
   ├── Node.js + TS        → Drizzle (SQL-like) / Prisma (schema-first)
   └── Complex queries     → Raw SQL for those queries

6. AUTH
   ├── Own users, SPA/mobile → JWT (access + refresh, EdDSA)
   ├── Own users, web app  → Sessions (Redis store)
   ├── Third-party login   → OAuth2 + PKCE
   ├── S2S / M2M           → API keys (hashed) or client_credentials
   └── Enterprise          → OIDC / SAML

7. BACKGROUND JOBS
   ├── Python              → Dramatiq (default) / ARQ (async)
   ├── Node.js             → BullMQ
   └── Long workflows / sagas → Temporal

8. CACHING
   ├── Default             → Redis (Cache-Aside pattern)
   ├── Invalidation        → TTL (simple) → event-driven (consistent)
   └── Connection pool     → Always; max_connections tuned to DB pool size

9. AI / AGENTS
   ├── Vector store (PG native) → pgvector
   ├── Vector store (managed)   → Pinecone
   ├── Agent workflows            → LangGraph
   ├── Multi-agent collaboration  → AutoGen
   └── Agent memory hierarchy     → Redis multi-tier

10. PLATFORM / OBSERVABILITY
    ├── IaC                 → Terraform / OpenTofu
    ├── GitOps              → Argo CD
    ├── Secrets             → Infisical / ESO
    └── Observability       → OpenTelemetry + structlog/pino + Sentry + /health/*

Commands

# FastAPI project bootstrap
uv init my-api && cd my-api
uv add fastapi uvicorn[standard] sqlalchemy[asyncio] asyncpg pydantic-settings redis structlog

# Run with hot reload
uvicorn app.main:app --reload --port 8000

# Fastify project bootstrap
npm create fastify@latest my-api -- --lang=ts
npm install @fastify/swagger @fastify/swagger-ui @fastify/rate-limit

# Drizzle ORM setup
npm install drizzle-orm pg && npm install -D drizzle-kit @types/pg
npx drizzle-kit generate && npx drizzle-kit migrate

# Testcontainers (Python)
uv add --dev testcontainers pytest-asyncio factory-boy

# k6 load test
k6 run scripts/load-test.js

# OpenTelemetry auto-instrumentation (Python)
uv add opentelemetry-sdk opentelemetry-instrumentation-fastapi opentelemetry-instrumentation-sqlalchemy opentelemetry-exporter-otlp

# pgvector (PostgreSQL)
# In psql: CREATE EXTENSION vector;
# Via Docker: docker pull ankane/pgvector

# Temporal CLI
brew install temporal  # macOS
# or visit https://docs.temporal.io

# Terraform / OpenTofu
brew install terraform  # or opentofu

Resources

FastAPI docs: https://fastapi.tiangolo.com
SQLAlchemy 2.0: https://docs.sqlalchemy.org/en/20/
Drizzle ORM: https://orm.drizzle.team
Testcontainers Python: https://testcontainers-python.readthedocs.io
OpenTelemetry Python: https://opentelemetry-python.readthedocs.io
k6 docs: https://grafana.com/docs/k6/latest/
Temporal: https://docs.temporal.io
pgvector: https://github.com/pgvector/pgvector
PowerSync: https://www.powersync.com
LangGraph: https://langchain-ai.github.io/langgraph/
AutoGen: https://microsoft.github.io/autogen/
Infisical: https://infisical.com
OpenTofu: https://opentofu.org
Argo CD: https://argo-cd.readthedocs.io

backend-architect

Install

When to Use

1. Framework Selection

Decision Matrix

Team-size rule

2. Database Architecture

Default Choices

PostgreSQL as Universal Store

SQLite Renaissance (2024–2026)

Local-First & Edge Synchronization

Vector Databases & RAG Infrastructure

ORM vs Raw SQL

ORM Choices

Patterns

3. API Design

Protocol Decision Matrix

REST Best Practices

OpenAPI

Backend for Frontend (BFF)

4. Auth & Authorization

JWT Patterns

Auth Protocols

Sessions vs Tokens

Authorization

Zero-Trust for Autonomous AI

5. Caching

Default Stack

Patterns

Invalidation Strategies

6. AI & Agentic Orchestration

Vector Search Implementation (pgvector)

Agent Orchestration Frameworks

Redis as Multi-Tier Agent Memory

7. Architecture Patterns

Monolith First (consensus 2024–2026)

Modular Monolith Structure

Clean / Hexagonal Architecture

Event-Driven

Background Jobs & Resilient Workflows

8. Testing

Test Pyramid

Testcontainers

Factory Pattern

Load Testing & Chaos Engineering

9. Platform Engineering & Observability

Infrastructure as Code (IaC)

Observability Stack

OpenTelemetry Setup (FastAPI)

Structured Logging

Health Check Endpoints

10. Decision Framework

Commands

Resources

Categories

Install

Recommended Skills