🧠 Vector Databases: Complete Guide

Storing and searching meaning, not just data.

NOTE

Prerequisite: This guide assumes you understand what a vector embedding is. If not, read the Vector Embeddings guide first.


What is a Vector Database?

A vector database is a specialized database designed to store, index, and search high-dimensional vectors (arrays of numbers) efficiently. Unlike traditional databases, which match records by exact values, vector databases find records by semantic similarity: how close two vectors are in the embedding space.

The Core Idea

Every piece of data (text, image, audio, video) can be converted into a vector embedding: a list of numbers that captures its meaning or features. Vectors that are semantically similar end up close together in high-dimensional space.

text
"I love dogs"   → [0.21, 0.87, 0.43, 0.11, ...]  ─┐
"I adore dogs"  → [0.22, 0.85, 0.44, 0.10, ...]  ─┘ (very close = similar)
"Stock market"  → [0.91, 0.03, 0.72, 0.67, ...]     (far away = different)
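To make "close together" concrete, the sketch below computes cosine similarity over the four numbers shown for each sentence above. Real embeddings have hundreds or thousands of dimensions, so treating these truncated values as complete vectors is purely an assumption for illustration, and `cosineSimilarity` is a helper name introduced here.

```javascript
// Toy vectors: only the four components shown above (real embeddings are much longer).
const loveDogs = [0.21, 0.87, 0.43, 0.11];
const adoreDogs = [0.22, 0.85, 0.44, 0.10];
const stockMarket = [0.91, 0.03, 0.72, 0.67];

// Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1 mean "pointing the same way".
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const dogsVsDogs = cosineSimilarity(loveDogs, adoreDogs);
const dogsVsStocks = cosineSimilarity(loveDogs, stockMarket);

console.log(dogsVsDogs.toFixed(3), dogsVsStocks.toFixed(3));
console.log(dogsVsDogs > dogsVsStocks); // true: the two dog sentences are closer
```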

🧩 Key Concepts ​

1. Embeddings ​

An embedding is a numerical representation of data in a high-dimensional space, produced by a machine learning model (e.g., OpenAI text-embedding-ada-002, Google text-embedding-gecko).

javascript
// Example: Getting an embedding from OpenAI
const { OpenAI } = require("openai");
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function getEmbedding(text) {
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return response.data[0].embedding; // Returns array of 1536 floats
}

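// CommonJS has no top-level await, so the call is wrapped in an async function
(async () => {
  const vector = await getEmbedding("What is machine learning?");
  // vector = [0.0023, -0.0107, 0.0412, ... ] (1536 numbers)
})();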

2. Similarity Search (ANN) ​

Vector DBs use Approximate Nearest Neighbor (ANN) algorithms to find the closest vectors quickly without scanning every record.

Common distance metrics:

| Metric | Formula | Best For |
| --- | --- | --- |
| Cosine Similarity | cos(θ) between vectors | Text, NLP |
| Euclidean Distance | √(Σ(a-b)²) | Images, spatial |
| Dot Product | a · b | Recommendation |
| Manhattan Distance | Σ\|a-b\| | Sparse data |
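The non-cosine metrics in the table are short to write out. The sketch below uses hypothetical helper names (`dotProduct`, `euclideanDistance`, `manhattanDistance`, `normalize`) and also shows why the choice matters less once vectors are scaled to unit length: in that case the dot product equals cosine similarity, and squared Euclidean distance equals 2 - 2*cos(θ), so all three rank neighbors identically.

```javascript
function dotProduct(a, b) {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));
}

function manhattanDistance(a, b) {
  return a.reduce((sum, ai, i) => sum + Math.abs(ai - b[i]), 0);
}

// Scale a vector to unit length.
function normalize(v) {
  const norm = Math.sqrt(dotProduct(v, v));
  return v.map((x) => x / norm);
}

const a = [1, 2, 3];
const b = [4, 5, 6];

console.log(dotProduct(a, b)); // 32
console.log(euclideanDistance(a, b)); // Math.sqrt(27), about 5.196
console.log(manhattanDistance(a, b)); // 9

// On unit vectors: dot product = cos(θ), and squared Euclidean distance = 2 - 2 * cos(θ).
const ua = normalize(a);
const ub = normalize(b);
const cosine = dotProduct(ua, ub);
const squaredDist = euclideanDistance(ua, ub) ** 2;
console.log(Math.abs(squaredDist - (2 - 2 * cosine)) < 1e-12); // true
```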

TIP

Choosing the wrong metric can ruin your search accuracy. Read the Similarity Metrics Deep Dive for a full comparison.

3. Indexes (HNSW, IVF, FLAT) ​

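| Index Type | Speed | Accuracy | Memory |
| --- | --- | --- | --- |
| FLAT | Slow (brute force) | 100% exact | Low |
| IVF | Fast | ~95% | Medium |
| HNSW | Very Fast | ~98% | High |
| LSH | Fast | ~90% | Low |

The accuracy figures are rough recall values; actual results depend on index parameters and the dataset.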
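To show where the speed/accuracy trade-off comes from, here is a minimal sketch contrasting FLAT (scan everything) with a toy IVF index (scan only the buckets nearest the query). The centroids are hand-picked for illustration, whereas a real IVF index learns them from the data (typically with k-means). `buildIvf`, `ivfSearch`, and `nprobe` are names assumed here rather than any particular library's API.

```javascript
function squaredDistance(a, b) {
  return a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);
}

// Rank candidate indices by distance to the query and keep the k closest.
function topK(vectors, candidateIds, query, k) {
  return candidateIds
    .map((id) => ({ id, dist: squaredDistance(vectors[id], query) }))
    .sort((x, y) => x.dist - y.dist)
    .slice(0, k);
}

// FLAT: exact search, compares the query against every stored vector.
function flatSearch(vectors, query, k) {
  return topK(vectors, vectors.map((_, id) => id), query, k);
}

// Toy IVF build step: put each vector in the bucket of its nearest centroid.
function buildIvf(vectors, centroids) {
  const buckets = centroids.map(() => []);
  vectors.forEach((v, id) => {
    const [nearest] = topK(centroids, centroids.map((_, c) => c), v, 1);
    buckets[nearest.id].push(id);
  });
  return { centroids, buckets };
}

// Toy IVF query: scan only the `nprobe` buckets whose centroids are closest.
function ivfSearch(ivf, vectors, query, k, nprobe) {
  const centroidIds = ivf.centroids.map((_, c) => c);
  const probed = topK(ivf.centroids, centroidIds, query, nprobe);
  const candidates = probed.flatMap((c) => ivf.buckets[c.id]);
  return { scanned: candidates.length, matches: topK(vectors, candidates, query, k) };
}

const vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]];
const ivf = buildIvf(vectors, [[0.3, 0.3], [10.3, 10.3]]); // hand-picked centroids

const exact = flatSearch(vectors, [9, 9], 1);
const approx = ivfSearch(ivf, vectors, [9, 9], 1, 1);
console.log(exact[0].id, approx.matches[0].id); // 3 3
console.log(approx.scanned); // 3 (FLAT compared against all 6)
```

With `nprobe` equal to the number of buckets, the toy IVF degenerates into an exact scan; lowering it trades recall for speed.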

TIP

Want to understand exactly how HNSW layers, M, ef_construction, and ef_search work? Read the ANN & HNSW Index deep dive.


🏗️ Architecture Diagram


🔄 How It Works: Step by Step


Scenario ​

Build a semantic search engine for a product catalog. Users type natural language queries like "comfortable shoes for rainy weather" and get relevant results, even if product descriptions don't contain those exact words.

Step 1: Index products

javascript
const { Pinecone } = require("@pinecone-database/pinecone");
const { OpenAI } = require("openai");

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const index = pinecone.index("products");

const products = [
  { id: "p1", name: "Waterproof Hiking Boot", category: "footwear" },
  { id: "p2", name: "Slim Fit Chino Trousers", category: "clothing" },
  { id: "p3", name: "All-Weather Running Shoe", category: "footwear" },
];

async function indexProducts() {
  const vectors = [];

  for (const product of products) {
    const response = await openai.embeddings.create({
      model: "text-embedding-ada-002",
      input: product.name,
    });

    vectors.push({
      id: product.id,
      values: response.data[0].embedding, // 1536-dim vector
      metadata: { name: product.name, category: product.category },
    });
  }

  await index.upsert(vectors);
  console.log("✅ Products indexed!");
}
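
The loop above sends one embedding request per product and upserts everything in a single call, which is fine for three items. For larger catalogs, a common approach is to work in batches. The sketch below is one way to structure that, assuming hypothetical `embedBatch` and `upsertBatch` callbacks that you would wire to your embedding model and vector DB client (the OpenAI embeddings endpoint accepts an array as `input`). The batch size of 100 is an arbitrary placeholder, so check your provider's limits.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// embedBatch(texts) -> Promise<number[][]> and upsertBatch(records) -> Promise<void>
// are hypothetical callbacks supplied by the caller.
async function indexInBatches(products, embedBatch, upsertBatch, batchSize = 100) {
  let indexed = 0;
  for (const batch of chunk(products, batchSize)) {
    const embeddings = await embedBatch(batch.map((p) => p.name));
    if (embeddings.length !== batch.length) {
      throw new Error("Embedding count does not match batch size");
    }
    const records = batch.map((p, i) => ({
      id: p.id,
      values: embeddings[i],
      metadata: { name: p.name, category: p.category },
    }));
    await upsertBatch(records);
    indexed += records.length;
  }
  return indexed;
}

// Example wiring with the clients from Step 1 (not executed here):
//   const embedBatch = async (texts) =>
//     (await openai.embeddings.create({ model: "text-embedding-ada-002", input: texts }))
//       .data.map((d) => d.embedding);
//   await indexInBatches(products, embedBatch, (records) => index.upsert(records));
```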

Step 2: Search products

javascript
async function semanticSearch(userQuery, topK = 3) {
  // Convert query to vector
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: userQuery,
  });
  const queryVector = response.data[0].embedding;

  // Search vector DB
  const results = await index.query({
    vector: queryVector,
    topK,
    includeMetadata: true,
  });

  return results.matches.map((match) => ({
    product: match.metadata.name,
    score: match.score.toFixed(4), // similarity score, higher = closer (cosine ranges from -1 to 1)
  }));
}

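// Usage (wrapped in an async function: CommonJS has no top-level await)
(async () => {
  const results = await semanticSearch("comfortable shoes for rainy weather");
  console.log(results);
})();
// Example output (scores are illustrative):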
// [
//   { product: "All-Weather Running Shoe", score: "0.9123" },
//   { product: "Waterproof Hiking Boot",   score: "0.8871" },
//   { product: "Slim Fit Chino Trousers",  score: "0.4203" },
// ]
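
Note that the trousers still appear in the output above: `topK` always returns the k nearest vectors, however far away they are. One simple mitigation is a minimum-score cutoff applied to the matches. The sketch below assumes match objects shaped like the ones in this guide (`score`, `metadata.name`). `filterByScore` and the 0.75 threshold are illustrative choices, and a sensible cutoff depends on your embedding model and data.

```javascript
// Keep only matches whose similarity score reaches `minScore`.
function filterByScore(matches, minScore) {
  return matches.filter((match) => match.score >= minScore);
}

// Matches shaped like the illustrative results above.
const matches = [
  { id: "p3", score: 0.9123, metadata: { name: "All-Weather Running Shoe" } },
  { id: "p1", score: 0.8871, metadata: { name: "Waterproof Hiking Boot" } },
  { id: "p2", score: 0.4203, metadata: { name: "Slim Fit Chino Trousers" } },
];

const relevant = filterByScore(matches, 0.75);
console.log(relevant.map((m) => m.metadata.name));
// [ 'All-Weather Running Shoe', 'Waterproof Hiking Boot' ]
```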

🧠 RAG: Retrieval-Augmented Generation

The most common production use case: grounding LLM answers in your own data.

TIP

Want to see a complete working implementation and advanced techniques like Hybrid Search and Reranking? Read the RAG Pattern deep dive.

javascript
async function ragAnswer(userQuestion) {
  // 1. Embed question
  const qVec = await getEmbedding(userQuestion);

  // 2. Retrieve context from vector DB
  const results = await index.query({
    vector: qVec,
    topK: 3,
    includeMetadata: true,
  });
  // Assumes each record was upserted with its source passage stored in metadata.text
  const context = results.matches.map((m) => m.metadata.text).join("\n\n");

  // 3. Send to LLM with context
  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "Answer using ONLY the provided context." },
      {
        role: "user",
        content: `Context:\n${context}\n\nQuestion: ${userQuestion}`,
      },
    ],
  });

  return chat.choices[0].message.content;
}
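
Two details the function above glosses over are what to do when retrieval returns nothing and how to keep the context within a size budget. The helper below is a minimal sketch of one way to handle both. `buildRagMessages`, `maxChars`, and the null fallback are assumptions for illustration, and a character budget is only a rough stand-in for a token limit.

```javascript
// Build chat messages from retrieved passages, or return null if there is no usable context.
function buildRagMessages(question, passages, maxChars = 4000) {
  const selected = [];
  let used = 0;
  for (const passage of passages) {
    const text = (passage || "").trim();
    if (!text) continue; // skip empty or missing passages
    if (used + text.length > maxChars) break; // stop once the budget would be exceeded
    selected.push(`[${selected.length + 1}] ${text}`);
    used += text.length;
  }
  if (selected.length === 0) return null; // caller can reply "I don't know" instead

  return [
    { role: "system", content: "Answer using ONLY the provided context." },
    {
      role: "user",
      content: `Context:\n${selected.join("\n\n")}\n\nQuestion: ${question}`,
    },
  ];
}

const messages = buildRagMessages("What is the return window?", [
  "Returns are accepted within 30 days.",
  "",
  "Shipping is free over $50.",
]);
console.log(messages[1].content);
```

Numbering the passages also lets the model cite which one it used, if the system prompt asks for that.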

🗺️ When to Use a Vector Database

✅ Use a Vector DB When:

| Use Case | Example |
| --- | --- |
| Semantic Search | "Find docs similar to this question" |
| AI Chatbot Memory | RAG-based LLM apps |
| Recommendation Engine | "Products similar to this one" |
| Image Similarity | Reverse image search |
| Anomaly Detection | Fraud detection by feature distance |
| Duplicate Detection | Find near-duplicate documents |
| Multimodal Search | Search images with text queries |

❌ Don't Use a Vector DB When:

  • You need exact lookups (use SQL/Redis)
  • You need complex relational joins (use PostgreSQL)
  • You need keyword/boolean text search (use Elasticsearch)
  • Data is purely structured (use a relational DB)

Popular vector database options:

| Database | Type | Best For | Free Tier |
| --- | --- | --- | --- |
| Pinecone | Managed Cloud | Production RAG, fast setup | ✅ Yes |
| Weaviate | Open Source / Cloud | Multi-modal, GraphQL API | ✅ Yes |
| Qdrant | Open Source / Cloud | Rust-based, high perf | ✅ Yes |
| ChromaDB | Open Source | Local dev, prototyping | ✅ Yes |
| Milvus | Open Source | Billion-scale vectors | ✅ Yes |
| pgvector | PostgreSQL Extension | Existing Postgres users | ✅ Yes |
| Redis VSS | Redis Extension | Low-latency caching + search | ✅ Yes |

⚡ Scaling a Vector Database

Sharding Strategy for Vector DBs ​
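
One common way to shard a vector index, sketched here as an assumption rather than a description of any specific product, is to route each record to a shard by hashing its ID and to answer queries by scatter-gather: ask every shard for its local top-k, then merge the partial results into a global top-k. `shardFor`, `mergeTopK`, and the simple string hash are hypothetical names and choices made for this example.

```javascript
// Deterministically map a record ID to one of `shardCount` shards.
function shardFor(id, shardCount) {
  let hash = 0;
  for (const ch of id) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return hash % shardCount;
}

// Merge per-shard result lists (each with a `score`, higher = closer) into a global top-k.
function mergeTopK(shardResults, k) {
  return shardResults
    .flat()
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Write path: the same ID always lands on the same shard.
console.log(shardFor("p1", 4) === shardFor("p1", 4)); // true

// Read path: every shard returns its own best matches, then the lists are merged.
const shardResults = [
  [{ id: "p3", score: 0.91 }, { id: "p9", score: 0.52 }], // shard 0
  [{ id: "p1", score: 0.89 }], // shard 1
  [], // shard 2 had no matches
];
console.log(mergeTopK(shardResults, 2).map((m) => m.id)); // [ 'p3', 'p1' ]
```

Each shard has to return its full local top-k for the merged result to be correct, so query cost grows with the number of shards that must be consulted.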


🔐 Security & Best Practices


💻 Full Working Example: Local RAG with ChromaDB

javascript
// npm install chromadb openai
const { ChromaClient } = require("chromadb");
const { OpenAI } = require("openai");

const chroma = new ChromaClient(); // connects to a locally running Chroma server (http://localhost:8000 by default)
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// ── Helper: embed text ──────────────────────────────────
async function embed(text) {
  const res = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return res.data[0].embedding;
}

// ── Step 1: Create collection & add documents ───────────
async function setup() {
  const col = await chroma.createCollection({ name: "knowledge_base" });

  const docs = [
    "Our return policy allows 30-day returns for unused items.",
    "Shipping is free for orders over $50 within the US.",
    "We support Visa, Mastercard, PayPal, and Apple Pay.",
  ];

  const embeddings = await Promise.all(docs.map(embed));

  await col.add({
    ids: ["doc1", "doc2", "doc3"],
    embeddings,
    documents: docs,
  });

  console.log("✅ Knowledge base ready");
}

// ── Step 2: Answer questions using RAG ──────────────────
async function ask(question) {
  const col = await chroma.getCollection({ name: "knowledge_base" });
  const qVec = await embed(question);

  const results = await col.query({
    queryEmbeddings: [qVec],
    nResults: 2,
  });

  const context = results.documents[0].join("\n");

  const chat = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "Answer based only on this context:\n" + context,
      },
      { role: "user", content: question },
    ],
  });

  return chat.choices[0].message.content;
}

// ── Run ─────────────────────────────────────────────────
(async () => {
  await setup();
  console.log(await ask("How long can I return something?"));
  // e.g. "You can return unused items within 30 days." (exact wording varies between runs)
})();

🆚 Vector DB vs Traditional DB: Side by Side

| Dimension | Relational DB (SQL) | Elasticsearch | Vector DB |
| --- | --- | --- | --- |
| Query Type | Exact / range | Keyword / fuzzy | Semantic similarity |
| Data Type | Structured rows | Text documents | Any (text, image, audio) |
| How it Finds | B-Tree index | Inverted index | ANN index (HNSW/IVF) |
| Example Query | WHERE price < 100 | MATCH "blue shoes" | NEAR [0.21, 0.87, ...] |
| Strength | Joins, transactions | Full-text, filters | AI-powered similarity |
| Weakness | No semantic search | Needs exact terms | No relational joins |

✅ Checklist Before Moving On

  • [ ] I understand what a vector embedding is
  • [ ] I know the difference between cosine, euclidean, and dot product similarity
  • [ ] I can explain what ANN / HNSW index means
  • [ ] I know the RAG pattern and why it's used
  • [ ] I can pick the right vector DB for a given scale
  • [ ] I understand when NOT to use a vector DB

📚 Further Reading


➡️ Next: Level 4 - Caching

Released under the ISC License.