"AI/LLM: Use when building RAG pipelines, vector search, LLM integrations, or agent orchestration. NOT for general backend or API development."
Install
Install via the SkillsCat registry:
npx skillscat add kriscard/kriscard-claude-plugins/ai-engineer
SKILL.md
AI Engineer
Expert in building production LLM applications and RAG systems.
Core Expertise
LLM Integrations
- OpenAI (GPT-4, embeddings)
- Anthropic (Claude, tool use)
- Local models (Ollama, llama.cpp)
- Model selection and trade-offs
RAG Pipelines
- Document chunking strategies
- Embedding model selection
- Vector databases (Pinecone, Weaviate, pgvector)
- Retrieval optimization
Agent Orchestration
- Multi-agent systems
- Tool use patterns
- Memory management
- Error handling and fallbacks
Architecture Patterns
RAG Pipeline
Documents → Chunking → Embeddings → Vector Store
↓
User Query → Query Embedding → Similarity Search → Context
↓
LLM + Context → Response
Chunking Strategies
| Strategy | Use Case |
|---|---|
| Fixed size | Simple documents |
| Semantic | Complex/varied content |
| Hierarchical | Long documents with structure |
| Sliding window | Overlap for context preservation |
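As an illustration of the fixed-size and sliding-window rows, here is a minimal sketch that splits text into character-based chunks with optional overlap. The function name `chunk_text` and its defaults are assumptions made for this example; production pipelines often split on tokens or sentence boundaries instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks.

    Consecutive chunks share `overlap` characters (a sliding window);
    overlap=0 gives plain fixed-size chunking.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks
```

Overlap trades storage for context: each boundary sentence appears in two chunks, so a retrieval hit near a boundary still carries its neighbors.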
Vector Database Selection
| Database | Strength |
|---|---|
| Pinecone | Managed, scalable |
| Weaviate | Hybrid search |
| pgvector | Postgres integration |
| ChromaDB | Local development |
Best Practices
Embeddings
- Match embedding model to use case
- Consider dimensionality trade-offs (larger vectors cost more storage and search time)
- Cache embeddings when possible
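One way to cache embeddings is to key them by a hash of the input text plus the model name, so identical text is only embedded once per model. This is a sketch under assumptions: `EmbeddingCache` is a name invented here, `embed_fn` is a hypothetical placeholder for whichever embedding call you use, and the in-memory dict would likely be replaced by Redis or a database table in production.

```python
import hashlib
from typing import Callable


class EmbeddingCache:
    """In-memory cache in front of an embedding function (illustrative only)."""

    def __init__(self, embed_fn: Callable[[str], list[float]], model_name: str):
        self._embed_fn = embed_fn      # hypothetical: wraps your real embedding call
        self._model_name = model_name  # part of the key so models never share vectors
        self._store: dict[str, list[float]] = {}
        self.misses = 0                # number of times embed_fn was actually called

    def _key(self, text: str) -> str:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return f"{self._model_name}:{digest}"

    def get(self, text: str) -> list[float]:
        key = self._key(text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]
```

Including the model name in the key matters because vectors from different embedding models are not comparable, even for the same text.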
Retrieval
- Use hybrid search (vector + keyword)
- Implement reranking for precision
- Monitor retrieval quality
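Hybrid search needs a way to merge the vector ranking with the keyword ranking. Reciprocal rank fusion (RRF) is one common option, sketched below; it is not the only one, and the source does not prescribe it. The function name and the constant `k = 60` (a conventional default) are assumptions for this example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

RRF uses only rank positions, which sidesteps the problem that cosine similarities and keyword scores such as BM25 live on different scales. A reranker can then be applied to the top of the fused list.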
Generation
- Provide clear context boundaries
- Implement streaming for UX
- Handle rate limits gracefully
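Handling rate limits gracefully usually means retrying with exponential backoff and jitter. The sketch below assumes a hypothetical `RateLimitError`; in real code you would catch your provider SDK's own rate-limit exception instead. The `sleep` parameter exists only so the delay can be replaced in tests.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on RateLimitError with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from many clients hitting the limit at once.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

If the provider returns a retry-after hint, preferring that value over the computed delay is generally the better choice.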
Production
- Implement fallbacks
- Monitor latency and costs
- Log prompts and responses
- A/B test prompt changes
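The fallback, latency, and logging points can be combined in one small wrapper. This is a sketch, not a prescribed design: `generate_with_fallback` and the `(name, callable)` provider shape are assumptions, and each callable is a placeholder for a real model call.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("llm")


def generate_with_fallback(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, response) from the first success."""
    errors = []
    for name, generate in providers:
        started = time.perf_counter()
        try:
            response = generate(prompt)  # hypothetical: wraps a real model call
        except Exception as exc:
            logger.warning("provider=%s failed: %s", name, exc)
            errors.append(f"{name}: {exc}")
            continue
        latency_ms = (time.perf_counter() - started) * 1000
        # Log prompt and response together so failures can be reproduced later.
        logger.info(
            "provider=%s latency_ms=%.1f prompt=%r response=%r",
            name, latency_ms, prompt, response,
        )
        return name, response
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

A typical ordering is a hosted primary model, then a second hosted model, then a local model. Prompts may contain user data, so redaction before logging is worth considering.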
Common Patterns
Semantic Search
- Embed user query
- Find similar documents
- Return ranked results
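The three steps above can be sketched with a brute-force cosine-similarity search over an in-memory index. The vectors are assumed to come from an embedding model that is not shown here, and `search` is a name chosen for this example. A vector database performs the same ranking with an approximate index once the collection grows.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_vec: list[float],
    index: dict[str, list[float]],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```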
Q&A over Documents
- Chunk and embed documents
- Retrieve relevant chunks
- Generate answer with context
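For the final step, a minimal sketch of assembling the prompt with explicit context boundaries. The XML-style tags, the instruction wording, and the character budget are all assumptions for illustration; the chunks are assumed to arrive already ranked by the retriever.

```python
def build_prompt(question: str, chunks: list[str], max_context_chars: int = 4000) -> str:
    """Wrap retrieved chunks in delimiters so the model can tell context from question."""
    parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        if used + len(chunk) > max_context_chars:
            break  # chunks are ranked, so drop the lowest-ranked ones first
        parts.append(f'<chunk id="{i}">\n{chunk}\n</chunk>')
        used += len(chunk)
    context = "\n".join(parts)
    return (
        "Answer the question using only the text inside <context>. "
        "If the answer is not there, say you do not know.\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )
```

Numbering the chunks also makes it possible to ask the model to cite the chunk IDs it relied on.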
Conversational Agent
- Maintain conversation history
- Retrieve relevant context
- Generate contextual response
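Maintaining conversation history eventually runs into the context window, so older turns have to be dropped or summarized. Below is a sketch of the simplest policy: keep the system message and the most recent turns that fit a budget. The message shape (`role`/`content` dicts) follows a common chat convention, and the character budget is a stand-in for a real token count; both are assumptions.

```python
def trim_history(messages: list[dict[str, str]], max_chars: int) -> list[dict[str, str]]:
    """Keep system messages plus the newest turns that fit within max_chars."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    budget = max_chars - sum(len(m["content"]) for m in system)
    kept = []
    for message in reversed(turns):  # walk from newest to oldest
        cost = len(message["content"])
        if cost > budget:
            break  # everything older than this is dropped as well
        kept.append(message)
        budget -= cost
    return system + kept[::-1]  # restore chronological order
```

Dropped turns can instead be summarized or stored in a vector index and retrieved on demand, which is where this pattern meets the retrieval step above.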