🔢 Vector Embeddings — Complete Guide

Turning meaning into math — the foundation of modern AI systems.

What is a Vector Embedding?

A vector embedding is a numerical representation of real-world data (text, image, audio, video, code) as a fixed-size array of floating-point numbers.

The key insight: similar things produce similar numbers.

text

Word: "King"   → [0.81, 0.22, 0.67, 0.14, ...]  ─┐
Word: "Queen"  → [0.79, 0.25, 0.65, 0.19, ...]  ─┘ nearby in space
Word: "Apple"  → [0.12, 0.91, 0.03, 0.88, ...]    far away

The model doesn't just store the word — it encodes its meaning, relationships, and context into a point in high-dimensional space.

🧠 The Intuition — From Words to Numbers

Step 1 — One-Hot Encoding (naive, broken)

The old way: represent each word as a sparse binary vector.

text

Vocabulary: [cat, dog, king, queen, apple]

"cat"   = [1, 0, 0, 0, 0]
"dog"   = [0, 1, 0, 0, 0]
"king"  = [0, 0, 1, 0, 0]
"queen" = [0, 0, 0, 1, 0]

Problems:

No relationship encoded — "king" and "queen" look as different as "cat" and "apple"
Scales to millions of dimensions (one per word) — unusable
No semantic similarity captured
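
The first problem can be checked directly. The sketch below builds one-hot vectors for the vocabulary above and shows that every pair of distinct words has a cosine similarity of 0, so "king" is no closer to "queen" than to "apple". The helper names `oneHot` and `cosine` are illustrative, not part of any library.

```javascript
// Vocabulary from the example above.
const vocabulary = ["cat", "dog", "king", "queen", "apple"];

// Build a one-hot vector: 1 at the word's index, 0 everywhere else.
function oneHot(word) {
  const index = vocabulary.indexOf(word);
  if (index === -1) throw new Error(`Unknown word: ${word}`);
  return vocabulary.map((_, i) => (i === index ? 1 : 0));
}

// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

console.log(cosine(oneHot("king"), oneHot("queen"))); // 0
console.log(cosine(oneHot("king"), oneHot("apple"))); // 0
console.log(cosine(oneHot("king"), oneHot("king")));  // 1
```

Every distinct pair is orthogonal, which is why one-hot vectors carry no notion of "nearby".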

Step 2 — Dense Embeddings (modern, powerful)

A neural network learns to compress meaning into a small dense vector:

text

"king"  → [0.81, 0.22, 0.67, 0.14, 0.55, ...]  (300-1536 numbers)
"queen" → [0.79, 0.25, 0.65, 0.19, 0.53, ...]  (very close!)
"apple" → [0.12, 0.91, 0.03, 0.88, 0.11, ...]  (far away)

The famous analogy holds approximately in word-level models such as Word2Vec: King − Man + Woman ≈ Queen
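
The arithmetic behind the analogy can be sketched with hand-made vectors. The two dimensions below are labeled "royalty" and "maleness" purely for illustration; real embedding dimensions are learned and carry no readable labels, and the values in `toyVectors` are assumptions, not output from any model.

```javascript
// Hand-made 2-D vectors: [royalty, maleness]. Illustrative values only.
const toyVectors = {
  king:  [0.9, 0.9],
  queen: [0.9, 0.1],
  man:   [0.1, 0.9],
  woman: [0.1, 0.1],
  apple: [0.0, 0.5],
};

// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Return the word whose vector is closest to (a - b + c),
// skipping the three input words themselves.
function analogy(a, b, c) {
  const target = toyVectors[a].map(
    (x, i) => x - toyVectors[b][i] + toyVectors[c][i]
  );
  let best = null;
  for (const [word, vec] of Object.entries(toyVectors)) {
    if ([a, b, c].includes(word)) continue;
    const score = cosine(target, vec);
    if (best === null || score > best.score) best = { word, score };
  }
  return best.word;
}

console.log(analogy("king", "man", "woman")); // queen
```

Subtracting "man" removes the maleness component, adding "woman" leaves royalty intact, and the nearest remaining vector is "queen".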

📐 Geometry of Embeddings

In reality, embeddings live in hundreds to thousands of dimensions — the geometry is the same, just much richer.

⚙️ How Embeddings Are Created

Popular Embedding Models

Model	Provider	Dimensions	Best For
`text-embedding-ada-002`	OpenAI	1536	General text, RAG
`text-embedding-3-small`	OpenAI	1536	Cost-efficient text
`text-embedding-3-large`	OpenAI	3072	High-accuracy text
`text-embedding-gecko`	Google	768	Multilingual text
`BERT` / `RoBERTa`	HuggingFace	768	Open-source NLP
`CLIP`	OpenAI	512	Text + Image (multimodal)
`Whisper` embeddings	OpenAI	1280	Audio → vector
`all-MiniLM-L6-v2`	SBERT	384	Fast, local, sentence similarity

📏 Similarity Metrics — How to Measure "Closeness"

Two vectors can be compared using different distance formulas depending on the use case.

1. Cosine Similarity

Measures the angle between two vectors. Ignores magnitude, cares about direction.

Range: -1 (opposite direction) → 0 (orthogonal, unrelated) → 1 (same direction)

javascript

function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, ai) => sum + ai * ai, 0));
  const magB = Math.sqrt(b.reduce((sum, bi) => sum + bi * bi, 0));
  return dot / (magA * magB);
}

const v1 = [0.21, 0.87, 0.43];
const v2 = [0.22, 0.85, 0.44];
const v3 = [0.91, 0.03, 0.72];

console.log(cosineSimilarity(v1, v2).toFixed(4)); // 0.9998 (very similar!)
console.log(cosineSimilarity(v1, v3).toFixed(4)); // 0.4571 (different)

Best for: Text similarity, semantic search, RAG

2. Euclidean Distance

Measures the straight-line distance between two points in space.

Lower = more similar

javascript

function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + Math.pow(ai - b[i], 2), 0));
}

console.log(euclideanDistance(v1, v2).toFixed(4)); // 0.0245 (very close!)
console.log(euclideanDistance(v1, v3).toFixed(4)); // 1.1312 (far apart)

Best for: Image embeddings, spatial data, clustering

3. Dot Product

The sum of the element-wise products of two vectors. Unlike cosine similarity, it grows with vector magnitude as well as alignment.

javascript

function dotProduct(a, b) {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

Best for: Recommendation systems and fast retrieval. When vectors are normalized to unit length, it produces the same ranking as cosine similarity at lower cost.
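
The three metrics are closely related once vectors are scaled to unit length. The sketch below checks two identities numerically: the dot product of unit vectors equals the cosine similarity of the originals, and their squared Euclidean distance equals 2 − 2 × cosine. The helper names are illustrative.

```javascript
const dot = (a, b) => a.reduce((sum, ai, i) => sum + ai * b[i], 0);
const norm = (v) => Math.sqrt(dot(v, v));

// Scale a vector to unit length (L2 normalization).
const normalize = (v) => {
  const n = norm(v);
  return v.map((x) => x / n);
};

const cosine = (a, b) => dot(a, b) / (norm(a) * norm(b));
const euclidean = (a, b) =>
  Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));

const a = [0.21, 0.87, 0.43];
const b = [0.91, 0.03, 0.72];
const ua = normalize(a);
const ub = normalize(b);

// For unit vectors, the dot product equals cosine similarity ...
console.log(Math.abs(dot(ua, ub) - cosine(a, b)) < 1e-12); // true

// ... and squared Euclidean distance equals 2 - 2 * cosine.
console.log(
  Math.abs(euclidean(ua, ub) ** 2 - (2 - 2 * cosine(a, b))) < 1e-12
); // true
```

In practice this means that with normalized embeddings all three metrics rank neighbours identically, so the cheapest one (dot product) is usually enough.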

Comparison

🔬 Anatomy of a Vector Embedding

Each number in the vector represents an abstract learned feature. Dimensions don't have human-readable labels — the neural network learns them during training.

Dimension 42 might encode "royalty", dimension 107 might encode "gender", but we can't know for sure — the model decides.

💻 Generating Embeddings — Code Examples

Text Embedding (OpenAI)

javascript

// npm install openai
// Run as an ES module (.mjs or "type": "module") so top-level await works.
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function embedText(text) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return res.data[0].embedding; // float[] with 1536 values
}

const v1 = await embedText("I love machine learning");
const v2 = await embedText("I enjoy deep learning");
const v3 = await embedText("The stock market crashed today");

// v1 and v2 will be close, v3 will be far from both

Batch Embedding (Multiple Texts at Once)

javascript

async function embedBatch(texts) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts, // Pass array directly — more efficient!
  });
  return res.data.map((d) => d.embedding);
}

const sentences = [
  "How do I reset my password?",
  "I forgot my login credentials",
  "What is the refund policy?",
  "Can I get my money back?",
];

const vectors = await embedBatch(sentences);
// vectors[0] ≈ vectors[1]  (both about password/login)
// vectors[2] ≈ vectors[3]  (both about refunds)
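
Providers cap how many inputs a single request may carry, so large corpora are usually sent in several batches. The following is a minimal sketch of that idea; `toBatches` and `embedInBatches` are hypothetical helpers, and the default batch size of 100 is an assumption to be replaced with the provider's documented limit.

```javascript
// Split an array into consecutive batches of at most `size` items.
function toBatches(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Embed any number of texts by calling `embedBatchFn` once per batch.
// `embedBatchFn` is any async function mapping string[] -> number[][],
// such as the embedBatch function above.
async function embedInBatches(texts, embedBatchFn, batchSize = 100) {
  const vectors = [];
  for (const batch of toBatches(texts, batchSize)) {
    vectors.push(...(await embedBatchFn(batch)));
  }
  return vectors; // same order as `texts`
}

console.log(JSON.stringify(toBatches(["a", "b", "c", "d", "e"], 2)));
// [["a","b"],["c","d"],["e"]]
```

Passing the embedding function as a parameter keeps the helper independent of any one provider and makes it easy to test with a stub.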

Local Embedding (No API — Free)

javascript

// npm install @xenova/transformers
import { pipeline } from "@xenova/transformers";

const embedder = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2"
);

async function embedLocal(text) {
  const output = await embedder(text, { pooling: "mean", normalize: true });
  return Array.from(output.data); // 384-dimensional vector
}

const vec = await embedLocal("Hello world");
console.log(vec.length); // 384

🗺️ When to Use Embeddings

✅ Use Embeddings When:

Scenario	Why Embeddings Help
Semantic search	Find results by meaning, not just keywords
Chatbot / RAG	Retrieve relevant context for LLM answers
Recommendation	Suggest similar products / articles / songs
Clustering	Group documents by topic automatically
Zero-shot classification	Classify text without labeled training data
Duplicate detection	Find near-identical content across large corpus
Cross-language search	Match Spanish query with English docs
Code search	Find function by describing what it does

❌ Don't Use Embeddings When:

You only need exact keyword matching (use full-text search)
You need structured query filters (price < 100, status = active)
You have very limited compute — embeddings add latency
Your dataset is tiny (< 1000 items) — simpler methods work fine

🔄 Full Pipeline — From Raw Data to Search
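
A minimal in-memory sketch of the usual flow: embed each document once, keep the vectors, embed the query, then rank stored vectors by cosine similarity. `toyEmbed`, `buildIndex`, and `search` are hypothetical names, and the keyword-count embedder is only a stand-in so the example runs without a model or API key. A real pipeline would call an embedding model at that step (and await it), and would normally store vectors in a vector database rather than an array.

```javascript
// Stand-in embedder: counts a few hand-picked keywords.
// Hypothetical and NOT semantic; replace with a real embedding model.
const KEYWORDS = ["password", "login", "refund", "money", "shipping"];
function toyEmbed(text) {
  const words = text.toLowerCase().match(/[a-z]+/g) ?? [];
  return KEYWORDS.map((k) => words.filter((w) => w === k).length);
}

function cosine(a, b) {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom; // guard against all-zero vectors
}

// Step 1: embed every document once and keep the vectors (the "index").
function buildIndex(docs, embedFn) {
  return docs.map((text, id) => ({ id, text, vector: embedFn(text) }));
}

// Step 2: embed the query, score every stored vector, return the top k.
function search(index, query, embedFn, k = 2) {
  const queryVector = embedFn(query);
  return index
    .map(({ id, text, vector }) => ({ id, text, score: cosine(queryVector, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const index = buildIndex(
  [
    "How do I reset my password?",
    "What is the refund policy?",
    "Where is my shipping confirmation?",
  ],
  toyEmbed
);

const [top] = search(index, "I want a refund, give me my money back", toyEmbed, 1);
console.log(top.text); // What is the refund policy?
```

The linear scan in `search` is fine for small collections; larger ones typically replace it with an approximate nearest-neighbour index.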

📦 Real-World Example: Duplicate Ticket Detector

Problem

A support team receives thousands of tickets daily. Many are duplicates. Manual review is impossible.

Solution — Embedding-Based Deduplication

javascript

// Run as an ES module (.mjs or "type": "module") so top-level await works.
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// ── Embed a support ticket ──────────────────────────────
async function embed(text) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return res.data[0].embedding;
}

// ── Cosine similarity helper ────────────────────────────
function similarity(a, b) {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const mag = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
}

// ── Check if a new ticket is duplicate ─────────────────
async function isDuplicate(newTicket, existingTickets, threshold = 0.92) {
  const newVec = await embed(newTicket.text);

  for (const existing of existingTickets) {
    const score = similarity(newVec, existing.vector);
    if (score >= threshold) {
      return {
        isDuplicate: true,
        matchId: existing.id,
        score: score.toFixed(4),
      };
    }
  }
  return { isDuplicate: false };
}

// ── Usage ───────────────────────────────────────────────
const existingTickets = [
  {
    id: "T-001",
    text: "I cannot login to my account",
    vector: await embed("I cannot login to my account"),
  },
];

const newTicket = { text: "Unable to sign in to my profile" };
const result = await isDuplicate(newTicket, existingTickets);

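console.log(result);
// Illustrative output when the score clears the threshold:
// { isDuplicate: true, matchId: "T-001", score: "0.9541" }
// ✅ Flagged as duplicate: same intent, different words.
// Actual scores vary by model, so calibrate the threshold on real ticket pairs.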
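
The loop above stops at the first ticket that clears the threshold, which is not necessarily the closest one. One possible variant, sketched below, scans all stored tickets and keeps the highest score. `findBestDuplicate` is a hypothetical helper, and the 3-D vectors are toy values standing in for real embeddings.

```javascript
function cosine(a, b) {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Scan every stored ticket and return the highest-scoring match,
// or null when nothing reaches the threshold.
function findBestDuplicate(newVector, existingTickets, threshold = 0.92) {
  let best = null;
  for (const ticket of existingTickets) {
    const score = cosine(newVector, ticket.vector);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { matchId: ticket.id, score };
    }
  }
  return best;
}

// Toy vectors standing in for real embeddings.
const stored = [
  { id: "T-001", vector: [0.9, 0.1, 0.0] },
  { id: "T-002", vector: [1.0, 0.0, 0.0] },
  { id: "T-003", vector: [0.0, 1.0, 0.0] },
];

// Both T-001 and T-002 clear the threshold; the closer one wins.
console.log(findBestDuplicate([1.0, 0.02, 0.0], stored).matchId); // T-002
console.log(findBestDuplicate([0.0, 0.0, 1.0], stored));          // null
```

Returning the best match costs a full scan, but it avoids linking a new ticket to a merely adequate match when a closer one exists.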

🧮 Embedding Dimensions — Trade-offs

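Dimensions	Model Example	Speed	Accuracy	Use Case
384	all-MiniLM-L6-v2	⚡⚡⚡	⭐⭐	Real-time, edge, mobile
768	BERT, text-embedding-gecko	⚡⚡	⭐⭐⭐	General NLP
1536	text-embedding-ada-002, text-embedding-3-small	⚡	⭐⭐⭐⭐	Production RAG
3072	text-embedding-3-large	🐢	⭐⭐⭐⭐⭐	High-stakes similarity

The ratings are rough guides: accuracy depends on the model and its training data, not on dimension count alone.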
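
Part of the speed trade-off is plain arithmetic: more dimensions mean more bytes to store and compare. The sketch below estimates raw vector storage, assuming each value is stored as a 4-byte float32 (an assumption; some stores use other precisions) and ignoring index and metadata overhead. `rawStorageBytes` is a hypothetical helper.

```javascript
// Raw storage for `count` vectors of `dimensions` values each.
// Assumes float32 (4 bytes per value) unless told otherwise.
function rawStorageBytes(count, dimensions, bytesPerValue = 4) {
  return count * dimensions * bytesPerValue;
}

// Format bytes as decimal gigabytes (1 GB = 1e9 bytes).
const toGB = (bytes) => (bytes / 1e9).toFixed(2);

for (const dims of [384, 768, 1536, 3072]) {
  const gb = toGB(rawStorageBytes(1_000_000, dims));
  console.log(`${dims} dims, 1M vectors: ${gb} GB`);
}
// 384 dims, 1M vectors: 1.54 GB
// 768 dims, 1M vectors: 3.07 GB
// 1536 dims, 1M vectors: 6.14 GB
// 3072 dims, 1M vectors: 12.29 GB
```

Storage grows linearly with dimension count, and so does the cost of each similarity computation.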

🌍 Multimodal Embeddings — Same Space, Different Data

One of the most powerful ideas: embed text and images into the same vector space (e.g., OpenAI CLIP).

Use cases:

Search images by typing a description
Find images most similar to another image
Auto-tagging / captioning images

📊 Embedding Quality Checklist

🔁 Chunking Strategy for Long Documents

javascript

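// Long documents must be split before embedding.
// Sizes are counted in words here, not in model tokens.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (overlap >= chunkSize) {
    // Otherwise the loop step would be zero or negative and never finish.
    throw new RangeError("overlap must be smaller than chunkSize");
  }
  const words = text.trim().split(/\s+/); // tolerate newlines and repeated spaces
  const chunks = [];

  for (let i = 0; i < words.length; i += chunkSize - overlap) {
    const end = Math.min(i + chunkSize, words.length); // clamp the final chunk
    chunks.push({ text: words.slice(i, end).join(" "), start: i, end });
    if (end === words.length) break;
  }
  return chunks;
}

// Embed each chunk separately (uses embed() from the duplicate-ticket example)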
async function embedDocument(fullText) {
  const chunks = chunkText(fullText, 500, 50);
  const results = [];

  for (const chunk of chunks) {
    const vector = await embed(chunk.text);
    results.push({ ...chunk, vector });
  }

  return results; // array of { text, vector, start, end }
}
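
To see what the overlap does, it helps to run the sliding window on a short input. The sketch below is a stripped-down word chunker (the name `chunkWords` and the tiny sizes are illustrative) that returns only the chunk text, so the shared boundary words are easy to spot.

```javascript
// Word-based sliding window: each chunk starts `chunkSize - overlap` words
// after the previous one, so neighbouring chunks share `overlap` words.
function chunkWords(text, chunkSize, overlap) {
  if (overlap >= chunkSize) {
    throw new RangeError("overlap must be smaller than chunkSize");
  }
  const words = text.trim().split(/\s+/);
  const chunks = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    const end = Math.min(start + chunkSize, words.length);
    chunks.push(words.slice(start, end).join(" "));
    if (end === words.length) break; // last window reached the end
  }
  return chunks;
}

const sample = "one two three four five six seven eight nine ten";
chunkWords(sample, 4, 1).forEach((chunk) => console.log(chunk));
// one two three four
// four five six seven
// seven eight nine ten
```

The repeated words ("four", "seven") are the overlap: a sentence that straddles a boundary still appears whole in at least one chunk.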

🆚 Embedding Models Compared

Criterion	`text-embedding-3-small`	`all-MiniLM-L6-v2`	`text-embedding-gecko`
Provider	OpenAI (API)	HuggingFace (local)	Google (API)
Dimensions	1536	384	768
Cost	Paid	Free	Paid
Latency (rough, varies with network and hardware)	~200ms	~5ms (local)	~150ms
Multilingual	✅ Yes	⚠️ Partial	✅ Yes
Best For	Production RAG	Local / Edge	Google Cloud stack

✅ Checklist Before Moving On

[ ] I can explain what a vector embedding is in plain English
[ ] I understand why embeddings capture meaning, not just words
[ ] I know the difference between cosine, euclidean, and dot product
[ ] I can generate embeddings using OpenAI API in JavaScript
[ ] I understand the chunking strategy for long documents
[ ] I know which embedding model to pick for different scenarios
[ ] I understand multimodal embeddings (CLIP)

📚 Further Reading

OpenAI Embeddings Guide — Official API docs
Sentence Transformers — Best open-source embedding library
The Illustrated Word2Vec — Visual intuition
CLIP Paper (OpenAI) — Multimodal embeddings

➡️ Next: ANN & HNSW Index → — then Vector Databases →
