Vector Databases: Complete Guide
Storing and searching meaning, not just data.
NOTE
Prerequisite: This guide assumes you understand what a vector embedding is. If not, read the Vector Embeddings guide first.
What is a Vector Database?
A vector database is a specialized database designed to store, index, and search high-dimensional vectors (arrays of numbers) efficiently. Unlike traditional databases, which match records by exact values, vector databases find records by semantic similarity: how close two vectors are in mathematical space.
The Core Idea
Every piece of data (text, image, audio, video) can be converted into a vector embedding: a list of numbers that captures its meaning or features. Vectors for semantically similar items end up close together in high-dimensional space.
"I love dogs"  → [0.21, 0.87, 0.43, 0.11, ...]
"I adore dogs" → [0.22, 0.85, 0.44, 0.10, ...]  (very close = similar)
"Stock market" → [0.91, 0.03, 0.72, 0.67, ...]  (far away = different)
Key Concepts
1. Embeddings
An embedding is a numerical representation of data in a high-dimensional space, produced by a machine learning model (e.g., OpenAI text-embedding-ada-002, Google text-embedding-gecko).
// Example: Getting an embedding from OpenAI
const { OpenAI } = require("openai");

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Returns the embedding for `text` as an array of floats.
async function getEmbedding(text) {
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return response.data[0].embedding; // array of 1536 floats for this model
}

// CommonJS has no top-level await, so the call is wrapped in an async function.
(async () => {
  const vector = await getEmbedding("What is machine learning?");
  // vector = [0.0023, -0.0107, 0.0412, ...] (1536 numbers)
})();
2. Similarity Search (ANN)
Vector DBs use Approximate Nearest Neighbor (ANN) algorithms to find the closest vectors quickly without scanning every record.
Common distance metrics:
| Metric | Formula | Best For |
|---|---|---|
| Cosine Similarity | cos(θ) between vectors | Text, NLP |
| Euclidean Distance | √(Σ(a-b)²) | Images, spatial |
| Dot Product | a · b | Recommendation |
| Manhattan Distance | Σ \|a-b\| | Sparse data |
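To make the table concrete, here is a minimal sketch of the four metrics as plain JavaScript functions, applied to the first four components of the toy vectors from the Core Idea section. The function names are illustrative and not part of any library; a real vector database computes these internally over the full vectors.

```javascript
// Illustrative helpers (hypothetical names, not from any library).
// All functions assume `a` and `b` are arrays of equal length.

function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosineSimilarity(a, b) {
  const normA = Math.sqrt(dot(a, a));
  const normB = Math.sqrt(dot(b, b));
  // 1 = same direction, 0 = orthogonal, -1 = opposite direction
  return dot(a, b) / (normA * normB);
}

function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

function manhattanDistance(a, b) {
  return a.reduce((sum, x, i) => sum + Math.abs(x - b[i]), 0);
}

// Truncated toy vectors (first four components only).
const loveDogs = [0.21, 0.87, 0.43, 0.11];
const adoreDogs = [0.22, 0.85, 0.44, 0.10];
const stockMarket = [0.91, 0.03, 0.72, 0.67];

// Similar sentences score higher on similarity and lower on distance.
console.log(cosineSimilarity(loveDogs, adoreDogs) > cosineSimilarity(loveDogs, stockMarket)); // true
console.log(euclideanDistance(loveDogs, adoreDogs) < euclideanDistance(loveDogs, stockMarket)); // true
```

Note that cosine similarity and dot product are similarities (higher means closer), while Euclidean and Manhattan are distances (lower means closer).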
TIP
Choosing the wrong metric can ruin your search accuracy. Read the Similarity Metrics Deep Dive for a full comparison.
3. Indexes (FLAT, IVF, HNSW, LSH)
| Index Type | Speed | Accuracy (typical recall, tuning-dependent) | Memory |
|---|---|---|---|
| FLAT | Slow (brute force) | 100% exact | Low |
| IVF | Fast | ~95% | Medium |
| HNSW | Very Fast | ~98% | High |
| LSH | Fast | ~90% | Low |
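The difference between FLAT and IVF in the table can be sketched in a few lines. This is a simplified illustration, not how any particular database implements it: the helper names are hypothetical, and the IVF centroids are passed in by hand, whereas real systems learn them (typically with k-means).

```javascript
// Hypothetical helpers for illustration; not the API of any vector database.

function squaredDistance(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// FLAT: compare the query against every stored vector (exact, O(n) per query).
function flatSearch(items, query, topK) {
  return items
    .map((item) => ({ id: item.id, distance: squaredDistance(item.vector, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, topK);
}

// Position of the centroid closest to `vector`.
function nearestCentroidIndex(centroids, vector) {
  let best = 0;
  for (let i = 1; i < centroids.length; i++) {
    if (squaredDistance(centroids[i], vector) < squaredDistance(centroids[best], vector)) {
      best = i;
    }
  }
  return best;
}

// IVF build step: bucket each vector under its nearest centroid.
// Assumption: centroids are supplied by the caller.
function buildIvf(items, centroids) {
  const buckets = centroids.map(() => []);
  for (const item of items) {
    buckets[nearestCentroidIndex(centroids, item.vector)].push(item);
  }
  return { centroids, buckets };
}

// IVF query step: scan only the `nprobe` buckets whose centroids are closest to the query.
function ivfSearch(index, query, topK, nprobe = 1) {
  const candidates = index.centroids
    .map((centroid, i) => ({ i, distance: squaredDistance(centroid, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, nprobe)
    .flatMap(({ i }) => index.buckets[i]);
  return flatSearch(candidates, query, topK);
}

const items = [
  { id: "a", vector: [0.1, 0.1] },
  { id: "b", vector: [0.2, 0.0] },
  { id: "c", vector: [0.9, 0.9] },
  { id: "d", vector: [1.0, 0.8] },
];
const ivfIndex = buildIvf(items, [[0, 0], [1, 1]]);

console.log(flatSearch(items, [0.1, 0.05], 2).map((r) => r.id)); // [ 'a', 'b' ]
console.log(ivfSearch(ivfIndex, [0.1, 0.05], 2).map((r) => r.id)); // [ 'a', 'b' ]
```

With `nprobe = 1` the IVF search never looks at the second bucket, which is where both the speedup and the potential loss of recall come from; raising `nprobe` trades speed back for accuracy.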
TIP
Want to understand exactly how HNSW layers, M, ef_construction, and ef_search work? Read the ANN & HNSW Index deep dive.
Architecture Diagram
How It Works: Step by Step
Real-World Example: AI Semantic Search
Scenario
Build a semantic search engine for a product catalog. Users type natural-language queries like "comfortable shoes for rainy weather" and get relevant results, even if product descriptions don't contain those exact words.
Step 1: Index products
const { Pinecone } = require("@pinecone-database/pinecone");
const { OpenAI } = require("openai");

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Assumes an index named "products" already exists with dimension 1536.
const index = pinecone.index("products");

const products = [
  { id: "p1", name: "Waterproof Hiking Boot", category: "footwear" },
  { id: "p2", name: "Slim Fit Chino Trousers", category: "clothing" },
  { id: "p3", name: "All-Weather Running Shoe", category: "footwear" },
];

// Embed each product name and upsert it together with its metadata.
async function indexProducts() {
  const vectors = [];
  for (const product of products) {
    const response = await openai.embeddings.create({
      model: "text-embedding-ada-002",
      input: product.name,
    });
    vectors.push({
      id: product.id,
      values: response.data[0].embedding, // 1536-dim vector
      metadata: { name: product.name, category: product.category },
    });
  }
  await index.upsert(vectors);
  console.log("Products indexed!");
}
Step 2: Semantic Search
async function semanticSearch(userQuery, topK = 3) {
  // Convert the query to a vector with the same model used for indexing
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: userQuery,
  });
  const queryVector = response.data[0].embedding;
  // Search the vector DB
  const results = await index.query({
    vector: queryVector,
    topK,
    includeMetadata: true,
  });
  return results.matches.map((match) => ({
    product: match.metadata.name,
    // Similarity score, higher = closer (cosine if the index uses that metric)
    score: match.score.toFixed(4),
  }));
}

// Usage (wrapped in an async function because CommonJS has no top-level await)
(async () => {
  const results = await semanticSearch("comfortable shoes for rainy weather");
  console.log(results);
  // Illustrative output; actual scores depend on the model and index:
  // [
  //   { product: "All-Weather Running Shoe", score: "0.9123" },
  //   { product: "Waterproof Hiking Boot", score: "0.8871" },
  //   { product: "Slim Fit Chino Trousers", score: "0.4203" },
  // ]
})();
RAG: Retrieval-Augmented Generation
The most common production use case: grounding LLM answers in your own data.
TIP
Want to see a complete working implementation and advanced techniques like Hybrid Search and Reranking? Read the RAG Pattern deep dive.
async function ragAnswer(userQuestion) {
  // 1. Embed the question
  const qVec = await getEmbedding(userQuestion);
  // 2. Retrieve context from the vector DB
  //    (assumes each record's metadata has a `text` field holding the source passage)
  const results = await index.query({
    vector: qVec,
    topK: 3,
    includeMetadata: true,
  });
  const context = results.matches.map((m) => m.metadata.text).join("\n\n");
  // 3. Send the question to the LLM along with the retrieved context
  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "Answer using ONLY the provided context." },
      {
        role: "user",
        content: `Context:\n${context}\n\nQuestion: ${userQuestion}`,
      },
    ],
  });
  return chat.choices[0].message.content;
}
When to Use a Vector Database
Use a Vector DB When:
| Use Case | Example |
|---|---|
| Semantic Search | "Find docs similar to this question" |
| AI Chatbot Memory | RAG-based LLM apps |
| Recommendation Engine | "Products similar to this one" |
| Image Similarity | Reverse image search |
| Anomaly Detection | Fraud detection by feature distance |
| Duplicate Detection | Find near-duplicate documents |
| Multimodal Search | Search images with text queries |
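As one concrete instance of the use cases above, near-duplicate detection reduces to a similarity threshold over embeddings. The sketch below uses toy 3-dimensional vectors in place of real embeddings; the helper names and the 0.95 threshold are assumptions for illustration and would need tuning per model and dataset.

```javascript
// Hypothetical helpers; not part of any vector database client.

function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (normA * normB);
}

// Returns pairs of ids whose vectors are at least `threshold` similar.
// Brute force over all pairs for clarity; at scale, each item would instead
// be sent as a nearest-neighbor query to the vector database.
function findNearDuplicates(items, threshold = 0.95) {
  const pairs = [];
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      const score = cosineSimilarity(items[i].vector, items[j].vector);
      if (score >= threshold) {
        pairs.push({ a: items[i].id, b: items[j].id, score });
      }
    }
  }
  return pairs;
}

// Toy vectors standing in for real document embeddings.
const docs = [
  { id: "doc1", vector: [0.9, 0.1, 0.0] },
  { id: "doc2", vector: [0.88, 0.12, 0.01] },
  { id: "doc3", vector: [0.0, 0.2, 0.9] },
];

console.log(findNearDuplicates(docs).map((p) => `${p.a} ~ ${p.b}`)); // [ 'doc1 ~ doc2' ]
```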
Don't Use a Vector DB When:
- You need exact lookups (use SQL/Redis)
- You need complex relational joins (use PostgreSQL)
- You need keyword/boolean text search (use Elasticsearch)
- Data is purely structured (use a relational DB)
Popular Vector Databases Compared
| Database | Type | Best For | Free Tier |
|---|---|---|---|
| Pinecone | Managed Cloud | Production RAG, fast setup | Yes |
| Weaviate | Open Source / Cloud | Multi-modal, GraphQL API | Yes |
| Qdrant | Open Source / Cloud | Rust-based, high perf | Yes |
| ChromaDB | Open Source | Local dev, prototyping | Yes |
| Milvus | Open Source | Billion-scale vectors | Yes |
| pgvector | PostgreSQL Extension | Existing Postgres users | Yes |
| Redis VSS | Redis Extension | Low-latency caching + search | Yes |
Scaling a Vector Database
Sharding Strategy for Vector DBs
Security & Best Practices
Full Working Example: Local RAG with ChromaDB
// npm install chromadb openai
const { ChromaClient } = require("chromadb");
const { OpenAI } = require("openai");

// Connects to a locally running Chroma server (default: http://localhost:8000)
const chroma = new ChromaClient();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Helper: embed text
async function embed(text) {
  const res = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return res.data[0].embedding;
}

// Step 1: Create collection & add documents
async function setup() {
  // getOrCreateCollection keeps the script re-runnable
  const col = await chroma.getOrCreateCollection({ name: "knowledge_base" });
  const docs = [
    "Our return policy allows 30-day returns for unused items.",
    "Shipping is free for orders over $50 within the US.",
    "We support Visa, Mastercard, PayPal, and Apple Pay.",
  ];
  const embeddings = await Promise.all(docs.map(embed));
  // upsert (rather than add) so a second run does not clash on existing ids
  await col.upsert({
    ids: ["doc1", "doc2", "doc3"],
    embeddings,
    documents: docs,
  });
  console.log("Knowledge base ready");
}

// Step 2: Answer questions using RAG
async function ask(question) {
  const col = await chroma.getCollection({ name: "knowledge_base" });
  const qVec = await embed(question);
  const results = await col.query({
    queryEmbeddings: [qVec],
    nResults: 2,
  });
  // results.documents holds one array of matches per query embedding
  const context = results.documents[0].join("\n");
  const chat = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "Answer based only on this context:\n" + context,
      },
      { role: "user", content: question },
    ],
  });
  return chat.choices[0].message.content;
}

// Run
(async () => {
  await setup();
  console.log(await ask("How long can I return something?"));
  // e.g. "You can return unused items within 30 days."
})();
Vector DB vs Traditional DB: Side by Side
| Dimension | Relational DB (SQL) | Elasticsearch | Vector DB |
|---|---|---|---|
| Query Type | Exact / range | Keyword / fuzzy | Semantic similarity |
| Data Type | Structured rows | Text documents | Any (text, image, audio) |
| How it Finds | B-Tree index | Inverted index | ANN index (HNSW/IVF) |
| Example Query | WHERE price < 100 | MATCH "blue shoes" | NEAR [0.21, 0.87, ...] |
| Strength | Joins, transactions | Full-text, filters | AI-powered similarity |
| Weakness | No semantic search | Misses synonyms and paraphrases | No relational joins |
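The three query styles in the table can be contrasted on one toy catalog. This is a plain-JavaScript sketch, not real SQL, Elasticsearch, or vector DB syntax: the 2-dimensional `vector` values and the query vector are made-up stand-ins for real embeddings, and the helper names are hypothetical.

```javascript
// Toy catalog; `vector` values are invented stand-ins for real embeddings.
const catalog = [
  { name: "Blue Suede Shoes", price: 80, vector: [0.9, 0.1] },
  { name: "Navy Sneakers", price: 120, vector: [0.8, 0.4] },
  { name: "Red Scarf", price: 30, vector: [0.1, 0.9] },
];

// Relational style: exact / range predicate (WHERE price < 100).
const cheap = catalog.filter((p) => p.price < 100);

// Keyword style: every query term must appear in the text (MATCH "blue shoes").
function keywordMatch(text, query) {
  const words = text.toLowerCase().split(/\s+/);
  return query.toLowerCase().split(/\s+/).every((term) => words.includes(term));
}
const keywordHits = catalog.filter((p) => keywordMatch(p.name, "blue shoes"));

// Vector style: rank by cosine similarity to a query vector (NEAR [...]).
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (normA * normB);
}
const queryVector = [0.88, 0.15]; // assumed embedding of "blue shoes"
const nearest = [...catalog]
  .sort((x, y) => cosineSimilarity(y.vector, queryVector) - cosineSimilarity(x.vector, queryVector))
  .slice(0, 2);

console.log(cheap.map((p) => p.name)); // [ 'Blue Suede Shoes', 'Red Scarf' ]
console.log(keywordHits.map((p) => p.name)); // [ 'Blue Suede Shoes' ]
console.log(nearest.map((p) => p.name)); // [ 'Blue Suede Shoes', 'Navy Sneakers' ]
```

The keyword query misses "Navy Sneakers" because neither word appears in its name, while the similarity ranking surfaces it because its vector sits close to the query's.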
Checklist Before Moving On
- [ ] I understand what a vector embedding is
- [ ] I know the difference between cosine, euclidean, and dot product similarity
- [ ] I can explain what ANN / HNSW index means
- [ ] I know the RAG pattern and why it's used
- [ ] I can pick the right vector DB for a given scale
- [ ] I understand when NOT to use a vector DB
Further Reading
- Pinecone Learn: best beginner-friendly resource
- Weaviate Docs: open-source deep dives
- LangChain Vector Stores: framework integrations
- pgvector GitHub: for SQL users
Next: Level 4 (Caching)
