🧠 Vector Databases — Complete Guide

Storing and searching meaning, not just data.

NOTE

Prerequisite: This guide assumes you understand what a vector embedding is. If not, read the Vector Embeddings guide → first.

What is a Vector Database?

A vector database is a specialized database designed to store, index, and search high-dimensional vectors (arrays of numbers) efficiently. Unlike traditional databases that match records by exact values, vector databases find records by semantic similarity — how close two vectors are in mathematical space.

The Core Idea

Every piece of data (text, image, audio, video) can be converted into a vector embedding — a list of numbers that captures its meaning or features. Vectors that are semantically similar end up close together in high-dimensional space.

text

"I love dogs"   → [0.21, 0.87, 0.43, 0.11, ...]  ─┐
"I adore dogs"  → [0.22, 0.85, 0.44, 0.10, ...]  ─┘ (very close = similar)
"Stock market"  → [0.91, 0.03, 0.72, 0.67, ...]     (far away = different)
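The closeness in the example above can be checked numerically. The sketch below is a minimal illustration, assuming the vectors are truncated to the four components shown (real embeddings have hundreds or thousands of dimensions). `cosineSimilarity` is a hypothetical helper name, not part of any library.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1 mean "same direction".
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors: only the four components shown above (an assumption for illustration).
const loveDogs = [0.21, 0.87, 0.43, 0.11];
const adoreDogs = [0.22, 0.85, 0.44, 0.10];
const stockMarket = [0.91, 0.03, 0.72, 0.67];

const simDogs = cosineSimilarity(loveDogs, adoreDogs);
const simStocks = cosineSimilarity(loveDogs, stockMarket);
console.log(simDogs > simStocks); // true: the two dog sentences are closer
```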

🧩 Key Concepts

1. Embeddings

An embedding is a numerical representation of data in a high-dimensional space, produced by a machine learning model (e.g., OpenAI text-embedding-ada-002, Google text-embedding-gecko).

javascript

// Example: Getting an embedding from OpenAI
const { OpenAI } = require("openai");
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function getEmbedding(text) {
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return response.data[0].embedding; // Returns array of 1536 floats
}

// Top-level await is not available in CommonJS modules, so wrap the call
(async () => {
  const vector = await getEmbedding("What is machine learning?");
  // vector = [0.0023, -0.0107, 0.0412, ... ] (1536 numbers)
})();

2. Similarity Search (ANN)

Vector DBs use Approximate Nearest Neighbor (ANN) algorithms to find the closest vectors quickly without scanning every record.
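To see what ANN avoids, here is a minimal sketch of the exact alternative: scoring every stored vector against the query and keeping the top k. The names (`exactTopK`, the record shape `{ id, values }`) are hypothetical and chosen for illustration. The cost grows linearly with the number of records, which is why large collections use an approximate index instead.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Exact nearest-neighbor search: compare the query with EVERY record (O(n) per query).
// Uses cosine similarity, so higher scores mean closer vectors.
function exactTopK(records, query, k) {
  const queryNorm = Math.sqrt(dot(query, query));
  return records
    .map((record) => ({
      id: record.id,
      score:
        dot(record.values, query) /
        (Math.sqrt(dot(record.values, record.values)) * queryNorm),
    }))
    .sort((x, y) => y.score - x.score) // best match first
    .slice(0, k);
}

// Tiny made-up dataset (2-D so the geometry is easy to picture).
const records = [
  { id: "a", values: [1, 0] },
  { id: "b", values: [0.9, 0.1] },
  { id: "c", values: [0, 1] },
];

const top2 = exactTopK(records, [1, 0.05], 2);
console.log(top2.map((m) => m.id)); // [ 'a', 'b' ]
```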

Common distance metrics:

Metric	Formula	Best For
Cosine Similarity	cos(θ) between vectors	Text, NLP
Euclidean Distance	√(Σ(a-b)²)	Images, spatial
Dot Product	a · b	Recommendation
Manhattan Distance	Σ|a-b|	Sparse data
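As a rough sketch of the formulas in the table, the Euclidean, dot product, and Manhattan metrics can each be written in a few lines. The function names are hypothetical. Note the direction of each score: for the two distances, smaller means more similar, while for dot product, larger means more similar.

```javascript
// Euclidean distance: sqrt(sum((a_i - b_i)^2)). Smaller = more similar.
function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Dot product: sum(a_i * b_i). Larger = more similar; sensitive to vector length.
function dotProduct(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Manhattan distance: sum(|a_i - b_i|). Smaller = more similar.
function manhattanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
  return sum;
}

const p = [1, 2, 3];
const q = [4, 6, 3];

console.log(euclideanDistance(p, q)); // 5  (sqrt(9 + 16 + 0))
console.log(dotProduct(p, q));        // 25 (4 + 12 + 9)
console.log(manhattanDistance(p, q)); // 7  (3 + 4 + 0)
```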

TIP

Choosing the wrong metric can ruin your search accuracy. Read the Similarity Metrics Deep Dive → for a full comparison.

3. Indexes (HNSW, IVF, FLAT)

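Index Type	Speed	Accuracy (approx. recall)	Memory
FLAT	Slow (brute force)	100% exact	Low
IVF	Fast	~95%	Medium
HNSW	Very Fast	~98%	High
LSH	Fast	~90%	Low

These figures are rough guides: actual speed, recall, and memory use depend on the dataset and on how each index is tuned.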
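To make the speed/accuracy trade-off concrete, here is a toy sketch of the idea behind an IVF index: vectors are grouped into buckets around centroids, and a query scans only the buckets whose centroids are nearest. Everything here is simplified and hypothetical (hand-picked centroids instead of trained ones, the `nprobe` parameter name, squared Euclidean distance, all function names). Real implementations differ considerably.

```javascript
// Squared Euclidean distance (no sqrt needed for ranking).
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return sum;
}

// Indexes of the centroids, sorted from nearest to farthest from `vector`.
function rankCentroids(centroids, vector) {
  return centroids
    .map((centroid, i) => ({ i, d: squaredDistance(centroid, vector) }))
    .sort((x, y) => x.d - y.d)
    .map((entry) => entry.i);
}

// Build: assign every record to the bucket of its nearest centroid.
function buildIvfIndex(centroids, records) {
  const buckets = centroids.map(() => []);
  for (const record of records) {
    buckets[rankCentroids(centroids, record.values)[0]].push(record);
  }
  return { centroids, buckets };
}

// Search: scan only the `nprobe` nearest buckets instead of all records.
function ivfSearch(ivf, query, k, nprobe = 1) {
  const candidates = rankCentroids(ivf.centroids, query)
    .slice(0, nprobe)
    .flatMap((i) => ivf.buckets[i]);
  return candidates
    .map((record) => ({ id: record.id, distance: squaredDistance(record.values, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

// Hand-picked centroids (a real index would learn these, e.g. with k-means).
const ivf = buildIvfIndex(
  [[0, 0], [10, 10]],
  [
    { id: "near-origin-1", values: [1, 1] },
    { id: "near-origin-2", values: [2, 0] },
    { id: "far-1", values: [9, 10] },
    { id: "far-2", values: [11, 9] },
  ]
);

const hits = ivfSearch(ivf, [8, 9], 2, 1);
console.log(hits.map((h) => h.id)); // [ 'far-1', 'far-2' ]
```

Raising `nprobe` scans more buckets, which costs time but recovers neighbors that sit just across a bucket boundary. That boundary effect is one reason the accuracy of such an index is approximate rather than exact.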

TIP

Want to understand exactly how HNSW layers, M, ef_construction, and ef_search work? Read the ANN & HNSW Index deep dive →

🏗️ Architecture Diagram

🔄 How It Works — Step by Step

📦 Real-World Example: AI Semantic Search

Scenario

Build a semantic search engine for a product catalog. Users type natural language queries like "comfortable shoes for rainy weather" and get relevant results — even if product descriptions don't contain those exact words.

Step 1 — Index products

javascript

const { Pinecone } = require("@pinecone-database/pinecone");
const { OpenAI } = require("openai");

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const index = pinecone.index("products");

const products = [
  { id: "p1", name: "Waterproof Hiking Boot", category: "footwear" },
  { id: "p2", name: "Slim Fit Chino Trousers", category: "clothing" },
  { id: "p3", name: "All-Weather Running Shoe", category: "footwear" },
];

async function indexProducts() {
  const vectors = [];

  for (const product of products) {
    const response = await openai.embeddings.create({
      model: "text-embedding-ada-002",
      input: product.name,
    });

    vectors.push({
      id: product.id,
      values: response.data[0].embedding, // 1536-dim vector
      metadata: { name: product.name, category: product.category },
    });
  }

  await index.upsert(vectors);
  console.log("✅ Products indexed!");
}
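The loop above makes one embedding request per product and a single upsert, which is fine for three items. For a larger catalog, one common approach is to work in batches. A minimal sketch of a batching helper follows. `chunk` is a hypothetical helper, and the batch size of 100 is an arbitrary assumption, not a documented limit.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Example: 250 placeholder product IDs split into batches of 100.
const productIds = Array.from({ length: 250 }, (_, i) => `p${i + 1}`);
const batches = chunk(productIds, 100);
console.log(batches.map((b) => b.length)); // [ 100, 100, 50 ]
```

Each batch of prepared vectors could then be sent in turn, for example with `for (const batch of chunk(vectors, 100)) await index.upsert(batch);`.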

Step 2 — Semantic Search

javascript

async function semanticSearch(userQuery, topK = 3) {
  // Convert query to vector
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: userQuery,
  });
  const queryVector = response.data[0].embedding;

  // Search vector DB
  const results = await index.query({
    vector: queryVector,
    topK,
    includeMetadata: true,
  });

  return results.matches.map((match) => ({
    product: match.metadata.name,
    score: match.score.toFixed(4), // similarity score; with a cosine index, higher means closer (range -1 to 1)
  }));
}

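// Usage (wrapped in an async function because top-level await is not available in CommonJS)
(async () => {
  const results = await semanticSearch("comfortable shoes for rainy weather");
  console.log(results);
  // Example output (scores are illustrative):
  // [
  //   { product: "All-Weather Running Shoe", score: "0.9123" },
  //   { product: "Waterproof Hiking Boot",   score: "0.8871" },
  //   { product: "Slim Fit Chino Trousers",  score: "0.4203" },
  // ]
})();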
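In the sample output the trousers still appear, because `topK` always returns the k nearest records, however weak the match. One possible remedy is a minimum-score cutoff applied to the raw matches. The sketch below assumes matches shaped like the `results.matches` entries used above (a numeric `score` and `metadata.name`). `filterByScore` and the 0.75 threshold are hypothetical and would need tuning for each dataset and embedding model.

```javascript
// Keep only matches whose similarity score reaches the cutoff.
function filterByScore(matches, minScore) {
  return matches.filter((match) => match.score >= minScore);
}

// Matches mirroring the illustrative output above (scores as numbers).
const sampleMatches = [
  { metadata: { name: "All-Weather Running Shoe" }, score: 0.9123 },
  { metadata: { name: "Waterproof Hiking Boot" }, score: 0.8871 },
  { metadata: { name: "Slim Fit Chino Trousers" }, score: 0.4203 },
];

const relevant = filterByScore(sampleMatches, 0.75);
console.log(relevant.map((m) => m.metadata.name));
// [ 'All-Weather Running Shoe', 'Waterproof Hiking Boot' ]
```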

🧠 RAG — Retrieval Augmented Generation

The most common production use case: grounding LLM answers in your own data.

TIP

Want to see a complete working implementation and advanced techniques like Hybrid Search and Reranking? Read the RAG Pattern deep dive →

javascript

async function ragAnswer(userQuestion) {
  // 1. Embed question
  const qVec = await getEmbedding(userQuestion);

  // 2. Retrieve context from vector DB
  const results = await index.query({
    vector: qVec,
    topK: 3,
    includeMetadata: true,
  });
  // Assumes each record was upserted with its source text under metadata.text
  const context = results.matches.map((m) => m.metadata.text).join("\n\n");

  // 3. Send to LLM with context
  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "Answer using ONLY the provided context." },
      {
        role: "user",
        content: `Context:\n${context}\n\nQuestion: ${userQuestion}`,
      },
    ],
  });

  return chat.choices[0].message.content;
}
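The function above joins every retrieved chunk into the prompt. In practice the retrieved text may need to fit a size budget. Here is a minimal sketch, assuming matches carry their source text in `metadata.text` as in `ragAnswer`. `buildContext` and the character-based budget are hypothetical simplifications (real systems usually count tokens rather than characters).

```javascript
// Join retrieved chunks in rank order, stopping before the budget is exceeded.
function buildContext(matches, maxChars) {
  const separator = "\n\n";
  const parts = [];
  let used = 0;
  for (const match of matches) {
    const text = match.metadata.text;
    const cost = text.length + (parts.length > 0 ? separator.length : 0);
    if (used + cost > maxChars) break; // keep only the highest-ranked chunks
    parts.push(text);
    used += cost;
  }
  return parts.join(separator);
}

// Made-up matches, already sorted best-first as a vector DB would return them.
const retrieved = [
  { metadata: { text: "Returns are accepted within 30 days." } },
  { metadata: { text: "Shipping is free over $50." } },
  { metadata: { text: "We accept Visa and PayPal." } },
];

const context = buildContext(retrieved, 70);
console.log(context); // first two chunks only; the third would exceed 70 characters
```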

🗺️ When to Use a Vector Database

✅ Use Vector DB When:

Use Case	Example
Semantic Search	"Find docs similar to this question"
AI Chatbot Memory	RAG-based LLM apps
Recommendation Engine	"Products similar to this one"
Image Similarity	Reverse image search
Anomaly Detection	Fraud detection by feature distance
Duplicate Detection	Find near-duplicate documents
Multimodal Search	Search images with text queries

❌ Don't Use Vector DB When:

You need exact lookups (use SQL/Redis)
You need complex relational joins (use PostgreSQL)
You need keyword/boolean text search (use Elasticsearch)
Data is purely structured (use a relational DB)

🏆 Popular Vector Databases Compared

Database	Type	Best For	Free Tier
Pinecone	Managed Cloud	Production RAG, fast setup	✅ Yes
Weaviate	Open Source / Cloud	Multi-modal, GraphQL API	✅ Yes
Qdrant	Open Source / Cloud	Rust-based, high perf	✅ Yes
ChromaDB	Open Source	Local dev, prototyping	✅ Yes
Milvus	Open Source	Billion-scale vectors	✅ Yes
pgvector	PostgreSQL Extension	Existing Postgres users	✅ Yes
Redis VSS	Redis Extension	Low-latency caching + search	✅ Yes

⚡ Scaling a Vector Database

Sharding Strategy for Vector DBs

🔐 Security & Best Practices

💻 Full Working Example — Local RAG with ChromaDB

javascript

// npm install chromadb openai
const { ChromaClient } = require("chromadb");
const { OpenAI } = require("openai");

const chroma = new ChromaClient(); // connects to a local Chroma server (http://localhost:8000 by default)
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// ── Helper: embed text ──────────────────────────────────
async function embed(text) {
  const res = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return res.data[0].embedding;
}

// ── Step 1: Create collection & add documents ───────────
async function setup() {
  // getOrCreateCollection avoids an error when the script is run a second time
  const col = await chroma.getOrCreateCollection({ name: "knowledge_base" });

  const docs = [
    "Our return policy allows 30-day returns for unused items.",
    "Shipping is free for orders over $50 within the US.",
    "We support Visa, Mastercard, PayPal, and Apple Pay.",
  ];

  const embeddings = await Promise.all(docs.map(embed));

  await col.add({
    ids: ["doc1", "doc2", "doc3"],
    embeddings,
    documents: docs,
  });

  console.log("✅ Knowledge base ready");
}

// ── Step 2: Answer questions using RAG ──────────────────
async function ask(question) {
  const col = await chroma.getCollection({ name: "knowledge_base" });
  const qVec = await embed(question);

  const results = await col.query({
    queryEmbeddings: [qVec],
    nResults: 2,
  });

  const context = results.documents[0].join("\n");

  const chat = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "Answer based only on this context:\n" + context,
      },
      { role: "user", content: question },
    ],
  });

  return chat.choices[0].message.content;
}

// ── Run ─────────────────────────────────────────────────
(async () => {
  await setup();
  console.log(await ask("How long can I return something?"));
  // → e.g. "You can return unused items within 30 days." (exact wording varies between runs)
})();

🆚 Vector DB vs Traditional DB — Side by Side

Dimension	Relational DB (SQL)	Elasticsearch	Vector DB
Query Type	Exact / range	Keyword / fuzzy	Semantic similarity
Data Type	Structured rows	Text documents	Any (text, image, audio)
How it Finds	B-Tree index	Inverted index	ANN index (HNSW/IVF)
Example Query	`WHERE price < 100`	`MATCH "blue shoes"`	`NEAR [0.21, 0.87, ...]`
Strength	Joins, transactions	Full-text, filters	AI-powered similarity
Weakness	No semantic search	Needs exact terms	No relational joins
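The three query styles in the table can be contrasted on one tiny in-memory dataset. This is a toy illustration only: the items, their 2-D "embeddings", and the helper names are all invented, and none of it reflects how a real SQL engine, Elasticsearch, or a vector database executes queries.

```javascript
const items = [
  { name: "Blue Running Shoes", price: 80, embedding: [0.9, 0.1] },
  { name: "Navy Trail Sneakers", price: 120, embedding: [0.8, 0.2] },
  { name: "Blue Cotton Shirt", price: 40, embedding: [0.1, 0.9] },
];

// Relational style: exact / range predicate, like WHERE price < 100.
const cheap = items.filter((item) => item.price < 100);

// Keyword style: every query term must appear in the text.
function keywordMatch(item, query) {
  const text = item.name.toLowerCase();
  return query.toLowerCase().split(/\s+/).every((term) => text.includes(term));
}
const keywordHits = items.filter((item) => keywordMatch(item, "blue shoes"));

// Vector style: rank by cosine similarity to a query embedding.
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}
const queryEmbedding = [0.85, 0.15]; // pretend embedding of "blue shoes"
const nearest = [...items]
  .sort((x, y) => cosine(y.embedding, queryEmbedding) - cosine(x.embedding, queryEmbedding))
  .slice(0, 2);

console.log(cheap.map((i) => i.name));       // [ 'Blue Running Shoes', 'Blue Cotton Shirt' ]
console.log(keywordHits.map((i) => i.name)); // [ 'Blue Running Shoes' ]
console.log(nearest.map((i) => i.name));     // both shoe products, not the shirt
```

The vector query surfaces "Navy Trail Sneakers" even though it shares no keyword with "blue shoes", while the keyword query misses it.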

✅ Checklist Before Moving On

[ ] I understand what a vector embedding is
[ ] I know the difference between cosine, euclidean, and dot product similarity
[ ] I can explain what ANN / HNSW index means
[ ] I know the RAG pattern and why it's used
[ ] I can pick the right vector DB for a given scale
[ ] I understand when NOT to use a vector DB

📚 Further Reading

Pinecone Learn — Best beginner-friendly resource
Weaviate Docs — Open-source deep dives
LangChain Vector Stores — Framework integrations
pgvector GitHub — SQL users

➡️ Next: Level 4 — Caching
