LangChain - LLM Applications with Agents & RAG
The most popular framework for building LLM-powered applications.
When to Use
- Building agents with tool calling and reasoning (ReAct pattern)
- Implementing RAG (retrieval-augmented generation) pipelines
- Need to swap LLM providers easily (OpenAI, Anthropic, Google)
- Creating chatbots with conversation memory
- Rapid prototyping of LLM applications
Core Components
| Component |
Purpose |
Key Concept |
| Chat Models |
LLM interface |
Unified API across providers |
| Agents |
Tool use + reasoning |
ReAct pattern |
| Chains |
Sequential operations |
Composable pipelines |
| Memory |
Conversation state |
Buffer, summary, vector |
| Retrievers |
Document lookup |
Vector search, hybrid |
| Tools |
External capabilities |
Functions agents can call |
Agent Patterns
| Pattern |
Description |
Use Case |
| ReAct |
Reason-Act-Observe loop |
General tool use |
| Plan-and-Execute |
Plan first, then execute |
Complex multi-step |
| Self-Ask |
Generate sub-questions |
Research tasks |
| Structured Chat |
JSON tool calling |
API integration |
Tool Definition
| Element |
Purpose |
| Name |
How agent refers to tool |
| Description |
When to use (critical for selection) |
| Parameters |
Input schema |
| Return type |
What agent receives back |
Key concept: Tool descriptions are critical—the LLM uses them to decide which tool to call. Be specific about when and why to use each tool.
RAG Pipeline Stages
| Stage |
Purpose |
Options |
| Load |
Ingest documents |
Web, PDF, GitHub, DBs |
| Split |
Chunk into pieces |
Recursive, semantic |
| Embed |
Convert to vectors |
OpenAI, Cohere, local |
| Store |
Index vectors |
Chroma, FAISS, Pinecone |
| Retrieve |
Find relevant chunks |
Similarity, MMR, hybrid |
| Generate |
Create response |
LLM with context |
Chunking Strategies
| Strategy |
Best For |
Typical Size |
| Recursive |
General text |
500-1000 chars |
| Semantic |
Coherent passages |
Variable |
| Token-based |
LLM context limits |
256-512 tokens |
Retrieval Strategies
| Strategy |
How It Works |
| Similarity |
Nearest neighbors by embedding |
| MMR |
Diversity + relevance balance |
| Hybrid |
Keyword + semantic combined |
| Self-query |
LLM generates metadata filters |
Memory Types
| Type |
Stores |
Best For |
| Buffer |
Full conversation |
Short conversations |
| Window |
Last N messages |
Medium conversations |
| Summary |
LLM-generated summary |
Long conversations |
| Vector |
Embedded messages |
Semantic recall |
| Entity |
Extracted entities |
Track facts about people/things |
Key concept: Buffer memory grows unbounded. Use summary or vector for long conversations to stay within context limits.
Document Loaders
| Source |
Loader Type |
| Web pages |
WebBaseLoader, AsyncChromium |
| PDFs |
PyPDFLoader, UnstructuredPDF |
| Code |
GitHubLoader, DirectoryLoader |
| Databases |
SQLDatabase, Postgres |
| APIs |
Custom loaders |
Vector Stores
| Store |
Type |
Best For |
| Chroma |
Local |
Development, small datasets |
| FAISS |
Local |
Large local datasets |
| Pinecone |
Cloud |
Production, scale |
| Weaviate |
Self-hosted/Cloud |
Hybrid search |
| Qdrant |
Self-hosted/Cloud |
Filtering, metadata |
LangSmith Observability
| Feature |
Benefit |
| Tracing |
See every LLM call, tool use |
| Evaluation |
Test prompts systematically |
| Datasets |
Store test cases |
| Monitoring |
Track production performance |
Key concept: Enable LangSmith tracing early—debugging agents without observability is extremely difficult.
Best Practices
| Practice |
Why |
| Start simple |
create_agent() covers most cases |
| Enable streaming |
Better UX for long responses |
| Use LangSmith |
Essential for debugging |
| Optimize chunk size |
500-1000 chars typically works |
| Cache embeddings |
They're expensive to compute |
| Test retrieval separately |
RAG quality depends on retrieval |
LangChain vs LangGraph
| Aspect |
LangChain |
LangGraph |
| Best for |
Quick agents, RAG |
Complex workflows |
| Code to start |
<10 lines |
~30 lines |
| State management |
Limited |
Native |
| Branching logic |
Basic |
Advanced |
| Human-in-loop |
Manual |
Built-in |
Key concept: Use LangChain for straightforward agents and RAG. Use LangGraph when you need complex state machines, branching, or human checkpoints.
Resources