langchain-embeddings

Guide to using embedding model integrations in LangChain including OpenAI, Azure, and local embeddings

christian-bromann 3 1 Updated 5mo ago

Install

npx skillscat add christian-bromann/langchain-skills/langchain-embeddings

Install via the SkillsCat registry.

SKILL.md

langchain-embeddings (JavaScript/TypeScript)

Overview

Embedding models convert text into numerical vector representations that capture semantic meaning. These vectors enable semantic search and similarity comparison, and they are essential for building RAG (Retrieval-Augmented Generation) systems with vector databases.

Key Concepts

Embeddings: Dense vector representations of text that encode semantic meaning
Vector Dimensions: Different models produce vectors of different sizes (e.g., 1536 for OpenAI's text-embedding-3-small, 768 for some open-source models)
Similarity Search: Finding similar texts by comparing vectors (cosine similarity, Euclidean distance)
Batch Processing: Efficiently embedding multiple texts at once
Use Cases: Semantic search, document retrieval, clustering, recommendation systems
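
To make the two comparison measures named above concrete, here is a minimal sketch that computes both on toy vectors. The three-dimensional vectors are hypothetical values chosen for readability, not real model output. For unit-length vectors the two measures are linked by distance² = 2 − 2 × cosine, so they rank neighbors the same way.

```typescript
// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Euclidean distance: 0 means identical vectors, larger means further apart.
function euclideanDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Hypothetical unit-length vectors standing in for real embeddings.
const query = [1, 0, 0];
const close = [0.6, 0.8, 0];
const far = [0, 0, 1];

const cosClose = cosineSimilarity(query, close); // about 0.6
const cosFar = cosineSimilarity(query, far); // 0
const distClose = euclideanDistance(query, close); // about sqrt(0.8)
const distFar = euclideanDistance(query, far); // sqrt(2)
```

Higher cosine similarity and lower Euclidean distance both mean "more similar", so sort descending for the first and ascending for the second.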

Provider Selection Decision Table

Provider	Best For	Model Examples	Dimensions	Package	Key Features
OpenAI	General purpose, high quality	text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002	1536, 3072	`@langchain/openai`	High quality, reliable, flexible dimensions
Azure OpenAI	Enterprise, compliance	text-embedding-ada-002 (Azure)	1536	`@langchain/openai`	Enterprise SLAs, data residency
Cohere	Multilingual, search optimization	embed-english-v3.0, embed-multilingual-v3.0	1024	`@langchain/cohere`	Search/clustering modes, multilingual
HuggingFace	Open source, customizable	all-MiniLM-L6-v2, BGE models	Varies	`@langchain/community`	Free, local inference, many models
Google	GCP integration	textembedding-gecko (Vertex AI)	768	`@langchain/google-vertexai`	GCP ecosystem
Ollama	Local, privacy	nomic-embed-text, mxbai-embed-large, all-minilm	Varies	`@langchain/ollama`	Fully local, no API costs, privacy

When to Choose Each Provider

Choose OpenAI if:

You need high-quality embeddings for production
You want reliable, fast API-based embeddings
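Cost is reasonable for your use case (about $0.02 per 1M tokens for text-embedding-3-small and $0.13 per 1M tokens for text-embedding-3-large at the time of writing)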

Choose Azure OpenAI if:

You need enterprise support and SLAs
Data compliance/residency is critical
You're already using Azure infrastructure

Choose Cohere if:

You need multilingual embeddings
You want optimized embeddings for search vs. clustering
You need competitive pricing

Choose HuggingFace if:

You want to use open-source models
You need specific model characteristics
You want to run inference locally or on your own infrastructure

Choose Ollama if:

Privacy is paramount (fully local)
You want zero API costs after setup
You have sufficient local compute resources
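
When cost drives the choice, a back-of-the-envelope estimate is often enough. The sketch below is illustrative only: `estimateEmbeddingCost` is a hypothetical helper, the corpus size is made up, and the $0.13 per 1M tokens figure is the one quoted for OpenAI above. Check the provider's current pricing before relying on it.

```typescript
// Estimate embedding cost in USD for a corpus.
// pricePerMillionTokens is an input; look up the provider's current price.
function estimateEmbeddingCost(
  docCount: number,
  avgTokensPerDoc: number,
  pricePerMillionTokens: number,
): number {
  const totalTokens = docCount * avgTokensPerDoc;
  return (totalTokens / 1_000_000) * pricePerMillionTokens;
}

// Hypothetical corpus: 10,000 documents averaging 500 tokens each.
// 5,000,000 tokens at $0.13 per 1M tokens comes to about $0.65.
const estimatedCost = estimateEmbeddingCost(10_000, 500, 0.13);
```

Re-embedding after a model change costs the same again, which is worth factoring in when comparing against a local provider.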

Code Examples

OpenAI Embeddings

import { OpenAIEmbeddings } from "@langchain/openai";

// Basic initialization
const embeddings = new OpenAIEmbeddings({
  modelName: "text-embedding-3-small",
  openAIApiKey: process.env.OPENAI_API_KEY, // Optional if set in env
});

// Embed a single query
const queryEmbedding = await embeddings.embedQuery(
  "What is the capital of France?"
);
console.log(`Vector dimensions: ${queryEmbedding.length}`);
console.log(`First few values: ${queryEmbedding.slice(0, 5)}`);

// Embed multiple documents
const documents = [
  "Paris is the capital of France.",
  "London is the capital of England.",
  "Berlin is the capital of Germany.",
];
const docEmbeddings = await embeddings.embedDocuments(documents);
console.log(`Embedded ${docEmbeddings.length} documents`);

// Using newer models with custom dimensions
const smallEmbeddings = new OpenAIEmbeddings({
  modelName: "text-embedding-3-small",
  dimensions: 512, // Reduce from default 1536 for efficiency
});

Azure OpenAI Embeddings

import { AzureOpenAIEmbeddings } from "@langchain/openai";

const embeddings = new AzureOpenAIEmbeddings({
  azureOpenAIApiKey: process.env.AZURE_OPENAI_API_KEY,
  azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME,
  azureOpenAIApiEmbeddingsDeploymentName: "text-embedding-ada-002",
  azureOpenAIApiVersion: "2024-02-01",
});

const embedding = await embeddings.embedQuery("Hello world");

HuggingFace Embeddings (Local)

import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";

// Run embeddings locally with Transformers.js
const embeddings = new HuggingFaceTransformersEmbeddings({
  modelName: "Xenova/all-MiniLM-L6-v2",
});

const embedding = await embeddings.embedQuery("This runs locally!");

Ollama Embeddings (Local)

import { OllamaEmbeddings } from "@langchain/ollama";

// Requires Ollama running locally: ollama pull nomic-embed-text
const embeddings = new OllamaEmbeddings({
  model: "nomic-embed-text",
  baseUrl: "http://localhost:11434", // Default Ollama URL
});

const embedding = await embeddings.embedQuery("Fully local embeddings");

Cohere Embeddings

import { CohereEmbeddings } from "@langchain/cohere";

const embeddings = new CohereEmbeddings({
  apiKey: process.env.COHERE_API_KEY,
  model: "embed-english-v3.0",
  inputType: "search_query", // or "search_document", "classification", "clustering"
});

const queryEmbedding = await embeddings.embedQuery("Search query");
const docEmbeddings = await embeddings.embedDocuments(["doc1", "doc2"]);

Computing Similarity

import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings();

// Embed query and documents
const query = "What is machine learning?";
const docs = [
  "Machine learning is a branch of AI",
  "Paris is the capital of France",
  "Neural networks are used in deep learning",
];

const queryVec = await embeddings.embedQuery(query);
const docVecs = await embeddings.embedDocuments(docs);

// Compute cosine similarity
function cosineSimilarity(vecA: number[], vecB: number[]): number {
  const dotProduct = vecA.reduce((sum, a, i) => sum + a * vecB[i], 0);
  const magnitudeA = Math.sqrt(vecA.reduce((sum, a) => sum + a * a, 0));
  const magnitudeB = Math.sqrt(vecB.reduce((sum, b) => sum + b * b, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

// Find most similar document
const similarities = docVecs.map((docVec) => 
  cosineSimilarity(queryVec, docVec)
);
console.log("Similarities:", similarities);
const mostSimilarIdx = similarities.indexOf(Math.max(...similarities));
console.log("Most similar doc:", docs[mostSimilarIdx]);
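
The example above returns only the single best match. A small extension, sketched here with hypothetical two-dimensional vectors in place of real embeddings, ranks all documents and keeps the top k. `topK` and `ScoredDoc` are illustrative names, not part of LangChain.

```typescript
interface ScoredDoc {
  text: string;
  score: number;
}

function cosineSimilarity(vecA: number[], vecB: number[]): number {
  const dotProduct = vecA.reduce((sum, a, i) => sum + a * vecB[i], 0);
  const magnitudeA = Math.sqrt(vecA.reduce((sum, a) => sum + a * a, 0));
  const magnitudeB = Math.sqrt(vecB.reduce((sum, b) => sum + b * b, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

// Score every document against the query, sort descending, keep the best k.
function topK(
  queryVec: number[],
  docVecs: number[][],
  docs: string[],
  k: number,
): ScoredDoc[] {
  return docs
    .map((text, i) => ({ text, score: cosineSimilarity(queryVec, docVecs[i]) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Hypothetical vectors; real ones would come from embedQuery/embedDocuments.
const toyDocs = [
  "Machine learning is a branch of AI",
  "Paris is the capital of France",
  "Neural networks are used in deep learning",
];
const toyDocVecs = [
  [0.9, 0.1],
  [0, 1],
  [0.7, 0.7],
];
const toyQueryVec = [1, 0];

const best = topK(toyQueryVec, toyDocVecs, toyDocs, 2);
```

For more than a few thousand documents, a vector store with an approximate nearest-neighbor index is usually a better fit than scoring every vector in application code.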

Batch Processing for Efficiency

import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  batchSize: 512, // OpenAI accepts up to 2048 inputs per request
});

// Efficiently embed large document sets
const largeDocSet = Array.from({ length: 1000 }, (_, i) => 
  `Document ${i}: Some content here`
);

const docEmbeddings = await embeddings.embedDocuments(largeDocSet);
console.log(`Embedded ${docEmbeddings.length} documents in batches`);
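
Conceptually, a batch size of 512 means the 1000 documents above go out as two requests. The sketch below shows that splitting on its own. `toBatches` is a hypothetical helper written for illustration; it is not LangChain's internal implementation.

```typescript
// Split an array into consecutive batches of at most batchSize items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const sampleDocs = Array.from(
  { length: 1000 },
  (_, i) => `Document ${i}: Some content here`,
);

const batches = toBatches(sampleDocs, 512);
const batchSizes = batches.map((batch) => batch.length); // [512, 488]
```

Smaller batches mean more requests but less work lost when a single request fails.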

Boundaries

What Agents CAN Do

✅ Initialize embedding models

Set up OpenAI, Azure, Cohere, HuggingFace, or Ollama embeddings
Configure API keys and model parameters

✅ Embed text content

Embed single queries with embedQuery()
Embed multiple documents with embedDocuments()
Process large batches efficiently

✅ Use embeddings with vector stores

Pass embeddings to vector store constructors
Enable semantic search capabilities

✅ Choose appropriate models

Select based on quality, cost, latency requirements
Use local models for privacy concerns

✅ Optimize for use case

Adjust batch sizes for efficiency
Use smaller dimensions to reduce costs/storage

What Agents CANNOT Do

❌ Modify embedding dimensions arbitrarily

Cannot change dimensions beyond what the model supports
text-embedding-3-* models support custom dimensions; older models such as text-embedding-ada-002 do not

❌ Mix embeddings from different models

Cannot compare embeddings from different models directly
Must use same model for all embeddings in a similarity search

❌ Exceed API rate limits

Cannot bypass provider rate limits
Must implement rate limiting for large-scale operations

❌ Generate embeddings without proper authentication

Cannot use cloud providers without valid API keys
Cannot access models without proper credentials
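
Since rate limits cannot be bypassed, large jobs typically wrap each request in a retry with exponential backoff. The following is a minimal sketch under stated assumptions: `withRetry` and `backoffDelayMs` are hypothetical helpers, the delays are arbitrary defaults, and every thrown error is treated as retryable, which a real implementation would narrow to rate-limit responses.

```typescript
// Delay before retry number `attempt` (0-based): baseMs, 2x, 4x, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying on any thrown error up to `maxRetries` times.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 5,
  baseMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
}
```

A call would look like `await withRetry(() => embeddings.embedDocuments(batch))`. LangChain's embedding classes also accept `maxRetries` and `maxConcurrency` options, which may be enough on their own.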

Gotchas

1. Model Consistency is Critical

// ❌ BAD: Using different models
const embeddings1 = new OpenAIEmbeddings({ 
  modelName: "text-embedding-3-small" 
});
const embeddings2 = new OpenAIEmbeddings({ 
  modelName: "text-embedding-ada-002" 
});

const queryVec = await embeddings1.embedQuery("query");
const docVec = await embeddings2.embedQuery("document");
// Similarity comparison will be meaningless!

// ✅ GOOD: Use same model for everything
const embeddings = new OpenAIEmbeddings({ 
  modelName: "text-embedding-3-small" 
});
const queryVec = await embeddings.embedQuery("query");
const docVec = await embeddings.embedQuery("document");
// Now similarity makes sense

Fix: Always use the same embedding model for all texts you want to compare.
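
One way to enforce this in application code is to store the model name next to each vector and refuse to compare across models. This is a sketch only: `TaggedEmbedding` and `compareTagged` are hypothetical names, and the vectors are toy values.

```typescript
// A vector labeled with the model that produced it.
interface TaggedEmbedding {
  model: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const magB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (magA * magB);
}

// Compare two embeddings only if the same model produced both.
function compareTagged(a: TaggedEmbedding, b: TaggedEmbedding): number {
  if (a.model !== b.model) {
    throw new Error(
      `Cannot compare embeddings from different models: ${a.model} vs ${b.model}`,
    );
  }
  return cosineSimilarity(a.vector, b.vector);
}

// Toy vectors standing in for real embeddings.
const queryEmb: TaggedEmbedding = { model: "text-embedding-3-small", vector: [1, 0] };
const docEmb: TaggedEmbedding = { model: "text-embedding-3-small", vector: [1, 0] };
const legacyEmb: TaggedEmbedding = { model: "text-embedding-ada-002", vector: [1, 0] };

const sameModelScore = compareTagged(queryEmb, docEmb); // 1
```

Persisting the model name as metadata in the vector store also makes it obvious when an index needs re-embedding after a model upgrade.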

2. Batch Size Limits

// ❌ Batch size larger than the provider accepts
const oversized = new OpenAIEmbeddings({
  batchSize: 5000, // Exceeds OpenAI's limit of 2048 inputs per request
});
const hugeDocs = Array(5000).fill("text");
await oversized.embedDocuments(hugeDocs); // May fail!

// ✅ Configure appropriate batch size
const embeddings = new OpenAIEmbeddings({
  batchSize: 512, // OpenAI limit is 2048, use smaller for safety
});
await embeddings.embedDocuments(hugeDocs); // Handles batching automatically

Fix: Keep batchSize at or below the provider's per-request input limit; embedDocuments() then splits large arrays into batches for you.

3. API Keys in Environment

// ❌ Hardcoded API key
const embeddings = new OpenAIEmbeddings({
  openAIApiKey: "sk-...", // Never commit this!
});

// ✅ Use environment variables
const embeddings = new OpenAIEmbeddings({
  openAIApiKey: process.env.OPENAI_API_KEY,
});

// ✅ Even better: auto-detection
const embeddings = new OpenAIEmbeddings(); 
// Reads OPENAI_API_KEY from environment automatically

Fix: Use environment variables for API keys.

4. Text Length Limits

// ❌ Text too long
const embeddings = new OpenAIEmbeddings();
const veryLongText = "...".repeat(100000);
await embeddings.embedQuery(veryLongText); // Will fail!

// ✅ Chunk long texts first
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const splitter = new RecursiveCharacterTextSplitter({
  // chunkSize counts characters by default, not tokens, so 8000
  // characters stays well under OpenAI's limit of ~8191 tokens
  chunkSize: 8000,
});
const chunks = await splitter.splitText(veryLongText);
const chunkEmbeddings = await embeddings.embedDocuments(chunks);

Fix: Split long texts into chunks before embedding. OpenAI's embedding models accept roughly 8k tokens per input; many other models accept far fewer, so check the limit for the model you use.
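
To show what chunking does without pulling in a library, here is a bare-bones fixed-size splitter with overlap. It is a simplified stand-in for a real text splitter, which also tries to break on paragraph and sentence boundaries. `splitByChars` and `approxTokens` are hypothetical helpers, and the four-characters-per-token ratio is a rough assumption for English text, not an exact tokenizer.

```typescript
// Split text into chunks of at most chunkSize characters,
// with `overlap` characters shared between neighboring chunks.
function splitByChars(text: string, chunkSize: number, overlap = 0): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this chunk reaches the end, to avoid a redundant tail chunk.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Rough heuristic (assumption): about 4 characters per token.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

const pieces = splitByChars("abcdefghij", 4, 1); // ["abcd", "defg", "ghij"]
const tokensIn8000Chars = approxTokens("a".repeat(8000)); // 2000
```

Under that heuristic an 8000-character chunk is around 2000 tokens, which leaves plenty of headroom below an 8k-token limit.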

5. Local Model Setup

// ❌ Ollama not running
import { OllamaEmbeddings } from "@langchain/ollama";
const embeddings = new OllamaEmbeddings({ model: "nomic-embed-text" });
await embeddings.embedQuery("test"); // Connection error!

// ✅ Ensure Ollama is running and model is pulled
// Terminal:
// ollama serve   (skip if the Ollama app or service is already running)
// ollama pull nomic-embed-text

const embeddings = new OllamaEmbeddings({ model: "nomic-embed-text" });
await embeddings.embedQuery("test"); // Works!

Fix: For local models, ensure the service is running and model is downloaded.

6. Azure Configuration Complexity

// ❌ Missing required fields
const embeddings = new AzureOpenAIEmbeddings({
  azureOpenAIApiKey: process.env.AZURE_OPENAI_API_KEY,
});

// ✅ All required fields
const embeddings = new AzureOpenAIEmbeddings({
  azureOpenAIApiKey: process.env.AZURE_OPENAI_API_KEY,
  azureOpenAIApiInstanceName: "my-instance",
  azureOpenAIApiEmbeddingsDeploymentName: "text-embedding-ada-002",
  azureOpenAIApiVersion: "2024-02-01",
});

Fix: Azure requires instance name, deployment name, and API version.
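
A quick pre-flight check can report every missing field at once instead of failing on the first. The sketch below uses the four field names from the example above. `missingAzureFields` is a hypothetical helper, and it only inspects the object passed in, so it assumes you are not supplying those values through environment variables instead.

```typescript
// The four settings used in the Azure example above.
interface AzureEmbeddingsConfig {
  azureOpenAIApiKey?: string;
  azureOpenAIApiInstanceName?: string;
  azureOpenAIApiEmbeddingsDeploymentName?: string;
  azureOpenAIApiVersion?: string;
}

const REQUIRED_AZURE_FIELDS = [
  "azureOpenAIApiKey",
  "azureOpenAIApiInstanceName",
  "azureOpenAIApiEmbeddingsDeploymentName",
  "azureOpenAIApiVersion",
] as const;

// Return the names of required fields that are absent or empty.
function missingAzureFields(config: AzureEmbeddingsConfig): string[] {
  return REQUIRED_AZURE_FIELDS.filter((field) => {
    const value = config[field];
    return value === undefined || value.trim() === "";
  });
}

// Hypothetical incomplete config: only the key is set.
const missing = missingAzureFields({ azureOpenAIApiKey: "example-key" });
```

Throwing an error that lists `missing` before constructing the embeddings object gives a clearer message than the first failed request would.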

7. Dimension Mismatch in Vector Stores

// ❌ Index built for 1536-dimensional vectors, model produces 512
const embeddings = new OpenAIEmbeddings({
  modelName: "text-embedding-3-small",
  dimensions: 512,
});

// A store whose index was created for 1536 dimensions rejects these vectors.
// (MemoryVectorStore is used here only for brevity; it has no fixed index size.)
const vectorStore = await MemoryVectorStore.fromTexts(
  ["text1"],
  [{}], // Metadata, one object per text
  embeddings, // Mismatch!
);

// ✅ Consistent dimensions
const embeddings = new OpenAIEmbeddings({
  modelName: "text-embedding-3-small",
  // Don't override dimensions, or ensure vector store matches
});

Fix: Ensure vector store and embeddings use compatible dimensions.
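
A cheap safeguard is to check vector lengths against the dimension the index was created with before writing anything. This is a sketch: `assertDimensions` is a hypothetical helper, and the expected size would come from your own index configuration.

```typescript
// Throw if any vector's length differs from the index dimension.
function assertDimensions(vectors: number[][], expected: number): void {
  vectors.forEach((vector, i) => {
    if (vector.length !== expected) {
      throw new Error(
        `Vector ${i} has ${vector.length} dimensions, expected ${expected}`,
      );
    }
  });
}

// Hypothetical index dimension and toy vectors of the wrong size.
const indexDimensions = 1536;
const reducedVectors = [new Array<number>(512).fill(0)];

let dimensionError = "";
try {
  assertDimensions(reducedVectors, indexDimensions);
} catch (err) {
  dimensionError = (err as Error).message;
}
```

Running the check once on the first batch is usually enough, since a given model configuration always produces vectors of the same length.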

Links and Resources

Official Documentation

Provider Documentation

Package Installation

# OpenAI
npm install @langchain/openai

# Cohere
npm install @langchain/cohere

# Ollama
npm install @langchain/ollama

# Community (HuggingFace, etc.)
npm install @langchain/community
