Guide to using embedding model integrations in LangChain including OpenAI, Azure, and local embeddings
Install
Install via the SkillsCat registry:
npx skillscat add christian-bromann/langchain-skills/langchain-embeddings
langchain-embeddings (JavaScript/TypeScript)
Overview
Embedding models convert text into numerical vector representations that capture semantic meaning. These vectors enable semantic search and similarity comparison, and they are essential for building RAG (Retrieval-Augmented Generation) systems with vector databases.
Key Concepts
- Embeddings: Dense vector representations of text that encode semantic meaning
- Vector Dimensions: Different models produce vectors of different sizes (e.g., 1536 for OpenAI's text-embedding-3-small, 768 for some open-source models)
- Similarity Search: Finding similar texts by comparing vector distances (cosine similarity, euclidean distance)
- Batch Processing: Efficiently embedding multiple texts at once
- Use Cases: Semantic search, document retrieval, clustering, recommendation systems
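To make the similarity-search concept concrete, here is a minimal sketch that compares hand-written three-dimensional vectors. The values are invented stand-ins for real embeddings (which have hundreds or thousands of dimensions), and the function names `cosineSim` and `euclideanDist` are illustrative, not part of LangChain.

```typescript
// Toy 3-dimensional vectors; the values are invented for illustration only.
const cat = [0.9, 0.1, 0.0];
const kitten = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.2, 0.9];

// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Euclidean distance: 0 means identical, larger means further apart.
function euclideanDist(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

// Related texts should score a higher similarity and a smaller distance.
console.log(cosineSim(cat, kitten) > cosineSim(cat, invoice)); // true
console.log(euclideanDist(cat, kitten) < euclideanDist(cat, invoice)); // true
```

Both metrics agree on the ranking here. For real embeddings, which metric to use usually depends on what the vector store is configured for.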
Provider Selection Decision Table
| Provider | Best For | Model Examples | Dimensions | Package | Key Features |
|---|---|---|---|---|---|
| OpenAI | General purpose, high quality | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002 | 1536, 3072 | @langchain/openai | High quality, reliable, flexible dimensions |
| Azure OpenAI | Enterprise, compliance | text-embedding-ada-002 (Azure) | 1536 | @langchain/openai | Enterprise SLAs, data residency |
| Cohere | Multilingual, search optimization | embed-english-v3.0, embed-multilingual-v3.0 | 1024 | @langchain/cohere | Search/clustering modes, multilingual |
| HuggingFace | Open source, customizable | all-MiniLM-L6-v2, BGE models | Varies | @langchain/community | Free, local inference, many models |
| Google | GCP integration | textembedding-gecko | 768 | @langchain/google-genai | GCP ecosystem, multimodal |
| Ollama | Local, privacy | llama2, mistral, nomic-embed-text | Varies | @langchain/ollama | Fully local, no API costs, privacy |
When to Choose Each Provider
Choose OpenAI if:
- You need high-quality embeddings for production
- You want reliable, fast API-based embeddings
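- Cost is reasonable for your use case (at the time of writing, roughly $0.02 per 1M tokens for text-embedding-3-small and $0.13 per 1M tokens for text-embedding-3-large)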
Choose Azure OpenAI if:
- You need enterprise support and SLAs
- Data compliance/residency is critical
- You're already using Azure infrastructure
Choose Cohere if:
- You need multilingual embeddings
- You want optimized embeddings for search vs. clustering
- You need competitive pricing
Choose HuggingFace if:
- You want to use open-source models
- You need specific model characteristics
- You want to run inference locally or on your own infrastructure
Choose Ollama if:
- Privacy is paramount (fully local)
- You want zero API costs after setup
- You have sufficient local compute resources
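The selection criteria above can be read as a simple decision procedure. The sketch below encodes them as one possible rule order. The `EmbeddingNeeds` flags, the `suggestProvider` function, and the priority of the checks are all assumptions made for illustration; real decisions usually weigh cost, quality, and latency together.

```typescript
// Hypothetical requirement flags; names are illustrative, not a LangChain API.
interface EmbeddingNeeds {
  mustRunLocally?: boolean;
  wantsOpenSourceModel?: boolean;
  needsDataResidency?: boolean;
  multilingual?: boolean;
}

type ProviderName = "Ollama" | "HuggingFace" | "Azure OpenAI" | "Cohere" | "OpenAI";

// Checks are ordered from most to least restrictive (an assumed priority).
function suggestProvider(needs: EmbeddingNeeds): ProviderName {
  if (needs.mustRunLocally) return "Ollama"; // privacy, fully local
  if (needs.wantsOpenSourceModel) return "HuggingFace"; // open-source models
  if (needs.needsDataResidency) return "Azure OpenAI"; // compliance, SLAs
  if (needs.multilingual) return "Cohere"; // multilingual embeddings
  return "OpenAI"; // general-purpose default
}

console.log(suggestProvider({ multilingual: true })); // "Cohere"
console.log(suggestProvider({})); // "OpenAI"
```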
Code Examples
OpenAI Embeddings
import { OpenAIEmbeddings } from "@langchain/openai";
// Basic initialization
const embeddings = new OpenAIEmbeddings({
modelName: "text-embedding-3-small",
openAIApiKey: process.env.OPENAI_API_KEY, // Optional if set in env
});
// Embed a single query
const queryEmbedding = await embeddings.embedQuery(
"What is the capital of France?"
);
console.log(`Vector dimensions: ${queryEmbedding.length}`);
console.log(`First few values: ${queryEmbedding.slice(0, 5)}`);
// Embed multiple documents
const documents = [
"Paris is the capital of France.",
"London is the capital of England.",
"Berlin is the capital of Germany.",
];
const docEmbeddings = await embeddings.embedDocuments(documents);
console.log(`Embedded ${docEmbeddings.length} documents`);
// Using newer models with custom dimensions
const smallEmbeddings = new OpenAIEmbeddings({
modelName: "text-embedding-3-small",
dimensions: 512, // Reduce from default 1536 for efficiency
});
Azure OpenAI Embeddings
import { AzureOpenAIEmbeddings } from "@langchain/openai";
const embeddings = new AzureOpenAIEmbeddings({
azureOpenAIApiKey: process.env.AZURE_OPENAI_API_KEY,
azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME,
azureOpenAIApiEmbeddingsDeploymentName: "text-embedding-ada-002",
azureOpenAIApiVersion: "2024-02-01",
});
const embedding = await embeddings.embedQuery("Hello world");
HuggingFace Embeddings (Local)
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
// Run embeddings locally with Transformers.js
const embeddings = new HuggingFaceTransformersEmbeddings({
modelName: "Xenova/all-MiniLM-L6-v2",
});
const embedding = await embeddings.embedQuery("This runs locally!");
Ollama Embeddings (Local)
import { OllamaEmbeddings } from "@langchain/ollama";
// Requires Ollama running locally: ollama pull nomic-embed-text
const embeddings = new OllamaEmbeddings({
model: "nomic-embed-text",
baseUrl: "http://localhost:11434", // Default Ollama URL
});
const embedding = await embeddings.embedQuery("Fully local embeddings");
Cohere Embeddings
import { CohereEmbeddings } from "@langchain/cohere";
const embeddings = new CohereEmbeddings({
apiKey: process.env.COHERE_API_KEY,
model: "embed-english-v3.0",
inputType: "search_query", // or "search_document", "classification", "clustering"
});
const queryEmbedding = await embeddings.embedQuery("Search query");
const docEmbeddings = await embeddings.embedDocuments(["doc1", "doc2"]);
Computing Similarity
import { OpenAIEmbeddings } from "@langchain/openai";
const embeddings = new OpenAIEmbeddings();
// Embed query and documents
const query = "What is machine learning?";
const docs = [
"Machine learning is a branch of AI",
"Paris is the capital of France",
"Neural networks are used in deep learning",
];
const queryVec = await embeddings.embedQuery(query);
const docVecs = await embeddings.embedDocuments(docs);
// Compute cosine similarity
function cosineSimilarity(vecA: number[], vecB: number[]): number {
const dotProduct = vecA.reduce((sum, a, i) => sum + a * vecB[i], 0);
const magnitudeA = Math.sqrt(vecA.reduce((sum, a) => sum + a * a, 0));
const magnitudeB = Math.sqrt(vecB.reduce((sum, b) => sum + b * b, 0));
return dotProduct / (magnitudeA * magnitudeB);
}
// Find most similar document
const similarities = docVecs.map((docVec) =>
cosineSimilarity(queryVec, docVec)
);
console.log("Similarities:", similarities);
const mostSimilarIdx = similarities.indexOf(Math.max(...similarities));
console.log("Most similar doc:", docs[mostSimilarIdx]);
Batch Processing for Efficiency
import { OpenAIEmbeddings } from "@langchain/openai";
const embeddings = new OpenAIEmbeddings({
batchSize: 512, // OpenAI allows up to 2048 in one request
});
// Efficiently embed large document sets
const largeDocSet = Array.from({ length: 1000 }, (_, i) =>
`Document ${i}: Some content here`
);
const docEmbeddings = await embeddings.embedDocuments(largeDocSet);
console.log(`Embedded ${docEmbeddings.length} documents in batches`);
Boundaries
What Agents CAN Do
✅ Initialize embedding models
- Set up OpenAI, Azure, Cohere, HuggingFace, or Ollama embeddings
- Configure API keys and model parameters
✅ Embed text content
- Embed single queries with embedQuery()
- Embed multiple documents with embedDocuments()
- Process large batches efficiently
✅ Use embeddings with vector stores
- Pass embeddings to vector store constructors
- Enable semantic search capabilities
✅ Choose appropriate models
- Select based on quality, cost, latency requirements
- Use local models for privacy concerns
✅ Optimize for use case
- Adjust batch sizes for efficiency
- Use smaller dimensions to reduce costs/storage
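As a rough worked example of the storage effect of smaller dimensions, the sketch below estimates raw vector storage. It assumes each value is stored as a 4-byte float and ignores index and metadata overhead, which vary by vector store; `rawVectorBytes` is a hypothetical helper.

```typescript
// Raw storage for the vectors alone: count x dimensions x bytes per value.
// Assumes 4-byte (float32) values; index and metadata overhead are ignored.
function rawVectorBytes(
  numVectors: number,
  dimensions: number,
  bytesPerValue: number = 4
): number {
  return numVectors * dimensions * bytesPerValue;
}

const fullSize = rawVectorBytes(1000000, 1536); // 6,144,000,000 bytes
const reducedSize = rawVectorBytes(1000000, 512); // 2,048,000,000 bytes

const savedPercent = 100 * (1 - reducedSize / fullSize);
console.log(`Reduction: ${savedPercent.toFixed(1)}%`); // prints "Reduction: 66.7%"
```

Under these assumptions, one million 1536-dimension vectors take about 6.1 GB, and reducing to 512 dimensions brings that to about 2.0 GB.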
What Agents CANNOT Do
❌ Modify embedding dimensions arbitrarily
- Cannot change dimensions beyond what the model supports
- text-embedding-3-* models support custom dimensions; older models such as text-embedding-ada-002 do not
❌ Mix embeddings from different models
- Cannot compare embeddings from different models directly
- Must use same model for all embeddings in a similarity search
❌ Exceed API rate limits
- Cannot bypass provider rate limits
- Must implement rate limiting for large-scale operations
❌ Generate embeddings without proper authentication
- Cannot use cloud providers without valid API keys
- Cannot access models without proper credentials
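Since rate limits cannot be bypassed, large jobs typically wrap embedding calls in a retry with exponential backoff. The following is a minimal sketch of that pattern; `backoffDelayMs` and `withRetry` are hypothetical helpers, and the default delays are arbitrary assumptions. It retries on any error, whereas production code would normally retry only on rate-limit or transient failures.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 30000
): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `task`, retrying on any thrown error up to `maxRetries` times.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries: number = 5
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up once the retry budget is spent.
      if (attempt >= maxRetries) throw err;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Usage might look like `await withRetry(() => embeddings.embedDocuments(batch))`, where `embeddings` is any of the instances shown earlier. LangChain embedding classes also accept `maxRetries` and `maxConcurrency` constructor options, which may be sufficient on their own.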
Gotchas
1. Model Consistency is Critical
// ❌ BAD: Using different models
const embeddings1 = new OpenAIEmbeddings({
modelName: "text-embedding-3-small"
});
const embeddings2 = new OpenAIEmbeddings({
modelName: "text-embedding-ada-002"
});
const queryVec = await embeddings1.embedQuery("query");
const docVec = await embeddings2.embedQuery("document");
// Similarity comparison will be meaningless!
// ✅ GOOD: Use same model for everything
const embeddings = new OpenAIEmbeddings({
modelName: "text-embedding-3-small"
});
const queryVec = await embeddings.embedQuery("query");
const docVec = await embeddings.embedQuery("document");
// Now similarity makes sense
Fix: Always use the same embedding model for all texts you want to compare.
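One way to guard against accidental mixing is to store the model name next to each vector and check it before comparing. This is a sketch under that assumption; the `TaggedVector` shape and `assertSameModel` helper are hypothetical and not something LangChain provides.

```typescript
// Hypothetical record that stores the model name alongside each vector.
interface TaggedVector {
  model: string;
  values: number[];
}

// Throws if two vectors were produced by different embedding models.
function assertSameModel(a: TaggedVector, b: TaggedVector): void {
  if (a.model !== b.model) {
    throw new Error(
      `Cannot compare embeddings from "${a.model}" and "${b.model}"`
    );
  }
}

// Placeholder values; real vectors would come from embedQuery/embedDocuments.
const storedDoc: TaggedVector = {
  model: "text-embedding-3-small",
  values: [0.1, 0.2],
};
const legacyDoc: TaggedVector = {
  model: "text-embedding-ada-002",
  values: [0.3, 0.4],
};

assertSameModel(storedDoc, storedDoc); // passes silently
// assertSameModel(storedDoc, legacyDoc) would throw
```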
2. Batch Size Limits
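// ❌ Potential API error when batchSize exceeds the provider's per-request limit
const oversized = new OpenAIEmbeddings({
batchSize: 5000, // Above OpenAI's limit of 2048 inputs per request
});
const hugeDocs = Array(5000).fill("text");
await oversized.embedDocuments(hugeDocs); // May fail!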
// ✅ Configure appropriate batch size
const embeddings = new OpenAIEmbeddings({
batchSize: 512, // OpenAI limit is 2048, use smaller for safety
});
await embeddings.embedDocuments(hugeDocs); // Handles batching automatically
Fix: Set appropriate batchSize parameter for the provider.
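Conceptually, batching just means splitting the input array into slices of at most `batchSize` items and sending one request per slice. The sketch below illustrates that idea with a hypothetical `toBatches` helper; it is not LangChain's internal implementation.

```typescript
// Splits `items` into consecutive slices of at most `batchSize` elements.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const sampleDocs = Array.from({ length: 5000 }, (_, i) => `doc ${i}`);
const docBatches = toBatches(sampleDocs, 512);

console.log(docBatches.length); // 10 (nine full batches plus one partial)
console.log(docBatches[docBatches.length - 1].length); // 392
```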
3. API Keys in Environment
// ❌ Hardcoded API key
const embeddings = new OpenAIEmbeddings({
openAIApiKey: "sk-...", // Never commit this!
});
// ✅ Use environment variables
const embeddings = new OpenAIEmbeddings({
openAIApiKey: process.env.OPENAI_API_KEY,
});
// ✅ Even better: auto-detection
const embeddings = new OpenAIEmbeddings();
// Reads OPENAI_API_KEY from environment automatically
Fix: Use environment variables for API keys.
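A small fail-fast check at startup can surface a missing key before the first API call. The sketch below uses a hypothetical `requireEnv` helper that takes the environment as a parameter, so it can be exercised without touching the real `process.env`.

```typescript
// Returns the named variable or throws a descriptive error if it is unset or empty.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Stand-in environment for illustration; in an application, pass process.env.
const fakeEnv = { OPENAI_API_KEY: "placeholder-key" };
const apiKey = requireEnv("OPENAI_API_KEY", fakeEnv);
```

In an application this would be called as `requireEnv("OPENAI_API_KEY", process.env)` before constructing the embeddings instance.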
4. Text Length Limits
// ❌ Text too long
const embeddings = new OpenAIEmbeddings();
const veryLongText = "...".repeat(100000);
await embeddings.embedQuery(veryLongText); // Will fail!
// ✅ Chunk long texts first
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
const splitter = new RecursiveCharacterTextSplitter({
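chunkSize: 8000, // Measured in characters by default; OpenAI's input limit is ~8191 tokens
});
const chunks = await splitter.splitText(veryLongText);
const chunkEmbeddings = await embeddings.embedDocuments(chunks);
Fix: Split long texts into chunks before embedding. OpenAI embedding models accept roughly 8k tokens per input; other models often accept fewer, so check the limit of the model you use.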
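To show what chunking does at its simplest, here is a naive fixed-size splitter with overlap. It is a deliberately simplified stand-in, not the algorithm used by RecursiveCharacterTextSplitter (which also tries to break on separators such as paragraphs and sentences); `chunkByChars` is a hypothetical name.

```typescript
// Splits text into windows of `chunkSize` characters; consecutive windows
// share `overlap` characters so context is not cut off abruptly.
function chunkByChars(
  text: string,
  chunkSize: number,
  overlap: number = 0
): string[] {
  if (chunkSize < 1 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("Require chunkSize >= 1 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

console.log(chunkByChars("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```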
5. Local Model Setup
// ❌ Ollama not running
import { OllamaEmbeddings } from "@langchain/ollama";
const embeddings = new OllamaEmbeddings({ model: "nomic-embed-text" });
await embeddings.embedQuery("test"); // Connection error!
// ✅ Ensure Ollama is running and model is pulled
// Terminal:
// ollama pull nomic-embed-text
// ollama serve
const embeddings = new OllamaEmbeddings({ model: "nomic-embed-text" });
await embeddings.embedQuery("test"); // Works!
Fix: For local models, ensure the service is running and model is downloaded.
6. Azure Configuration Complexity
// ❌ Missing required fields
const embeddings = new AzureOpenAIEmbeddings({
azureOpenAIApiKey: process.env.AZURE_OPENAI_API_KEY,
});
// ✅ All required fields
const embeddings = new AzureOpenAIEmbeddings({
azureOpenAIApiKey: process.env.AZURE_OPENAI_API_KEY,
azureOpenAIApiInstanceName: "my-instance",
azureOpenAIApiEmbeddingsDeploymentName: "text-embedding-ada-002",
azureOpenAIApiVersion: "2024-02-01",
});
Fix: Azure requires instance name, deployment name, and API version.
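A quick pre-flight check can report which Azure fields are missing before the constructor is called. The sketch below uses the field names from the example above; `missingAzureFields` is a hypothetical helper, and it assumes all four values are passed explicitly rather than picked up from environment variables.

```typescript
// Field names taken from the Azure example above.
const REQUIRED_AZURE_FIELDS = [
  "azureOpenAIApiKey",
  "azureOpenAIApiInstanceName",
  "azureOpenAIApiEmbeddingsDeploymentName",
  "azureOpenAIApiVersion",
] as const;

type AzureField = (typeof REQUIRED_AZURE_FIELDS)[number];

// Returns the names of required fields that are absent or empty.
function missingAzureFields(
  config: Partial<Record<AzureField, string>>
): string[] {
  return REQUIRED_AZURE_FIELDS.filter((field) => !config[field]);
}

// Placeholder config with only the key set, mirroring the failing example.
const incompleteConfig = { azureOpenAIApiKey: "placeholder-key" };
console.log(missingAzureFields(incompleteConfig));
```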
7. Dimension Mismatch in Vector Stores
// ❌ Vector store expecting 1536 dimensions, model produces 512
const embeddings = new OpenAIEmbeddings({
modelName: "text-embedding-3-small",
dimensions: 512,
});
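// Suppose the target index was created for 1536-dimension vectors
// (MemoryVectorStore is used here only as a stand-in)
const vectorStore = await MemoryVectorStore.fromTexts(
["text1"],
[{}], // Metadata, one object per text
embeddings, // Mismatch!
);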
// ✅ Consistent dimensions
const embeddings = new OpenAIEmbeddings({
modelName: "text-embedding-3-small",
// Don't override dimensions, or ensure vector store matches
});
Fix: Ensure vector store and embeddings use compatible dimensions.
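A cheap way to catch this early is to embed one probe string and compare its length with the dimension the index expects. The sketch below shows only the comparison step; `checkDimensions` is a hypothetical helper, and the expected dimension would come from your vector store's configuration.

```typescript
// Throws if a vector's length differs from the dimension the index expects.
function checkDimensions(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(
      `Embedding has ${vector.length} dimensions, but the index expects ${expected}`
    );
  }
}

// Placeholder vector standing in for the result of embedQuery("probe").
const probeVector: number[] = new Array(512).fill(0);

checkDimensions(probeVector, 512); // passes silently
// checkDimensions(probeVector, 1536) would throw
```

In practice the probe would be `await embeddings.embedQuery("probe")`, run once at startup before any documents are written to the store.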
Links and Resources
Official Documentation
- LangChain JS Embeddings Overview
- OpenAI Embeddings
- Azure OpenAI Embeddings
- HuggingFace Embeddings
- Ollama Embeddings
Package Installation
# OpenAI
npm install @langchain/openai
# Cohere
npm install @langchain/cohere
# Ollama
npm install @langchain/ollama
# Community (HuggingFace, etc.)
npm install @langchain/community