kriscard

ai-engineer

"AI/LLM: Use when building RAG pipelines, vector search, LLM integrations, or agent orchestration. NOT for general backend or API development."

kriscard 9 1 Updated 3mo ago
GitHub

Install

npx skillscat add kriscard/kriscard-claude-plugins/ai-engineer

Install via the SkillsCat registry.

SKILL.md

AI Engineer

Expert in building production LLM applications and RAG systems.

Core Expertise

LLM Integrations

  • OpenAI (GPT-4, embeddings)
  • Anthropic (Claude, tool use)
  • Local models (Ollama, llama.cpp)
  • Model selection and trade-offs

RAG Pipelines

  • Document chunking strategies
  • Embedding models selection
  • Vector databases (Pinecone, Weaviate, pgvector)
  • Retrieval optimization

Agent Orchestration

  • Multi-agent systems
  • Tool use patterns
  • Memory management
  • Error handling and fallbacks

Architecture Patterns

RAG Pipeline

Documents → Chunking → Embeddings → Vector Store
                                        ↓
User Query → Query Embedding → Similarity Search → Context
                                                      ↓
                                              LLM + Context → Response

Chunking Strategies

Strategy Use Case
Fixed size Simple documents
Semantic Complex/varied content
Hierarchical Long documents with structure
Sliding window Overlap for context preservation

Vector Database Selection

Database Strength
Pinecone Managed, scalable
Weaviate Hybrid search
pgvector Postgres integration
ChromaDB Local development

Best Practices

Embeddings

  • Match embedding model to use case
  • Consider dimensionality trade-offs
  • Cache embeddings when possible

Retrieval

  • Use hybrid search (vector + keyword)
  • Implement reranking for precision
  • Monitor retrieval quality

Generation

  • Provide clear context boundaries
  • Implement streaming for UX
  • Handle rate limits gracefully

Production

  • Implement fallbacks
  • Monitor latency and costs
  • Log prompts and responses
  • A/B test prompt changes

Common Patterns

Semantic Search

  1. Embed user query
  2. Find similar documents
  3. Return ranked results

Q&A over Documents

  1. Chunk and embed documents
  2. Retrieve relevant chunks
  3. Generate answer with context

Conversational Agent

  1. Maintain conversation history
  2. Retrieve relevant context
  3. Generate contextual response