🔢 Vector Embeddings: Complete Guide

Turning meaning into math: the foundation of modern AI systems.


What is a Vector Embedding?

A vector embedding is a numerical representation of real-world data (text, image, audio, video, code) as a fixed-size array of floating-point numbers.

The key insight: similar things produce similar numbers.

text
Word: "King"   → [0.81, 0.22, 0.67, 0.14, ...]  ─┐
Word: "Queen"  → [0.79, 0.25, 0.65, 0.19, ...]  ─┘ nearby in space
Word: "Apple"  → [0.12, 0.91, 0.03, 0.88, ...]    far away

The model doesn't just store the word; it encodes its meaning, relationships, and context as a point in high-dimensional space.


🧠 The Intuition: From Words to Numbers

Step 1: One-Hot Encoding (naive, broken)

The old way: represent each word as a sparse binary vector.

text
Vocabulary: [cat, dog, king, queen, apple]

"cat"   = [1, 0, 0, 0, 0]
"dog"   = [0, 1, 0, 0, 0]
"king"  = [0, 0, 1, 0, 0]
"queen" = [0, 0, 0, 1, 0]
"apple" = [0, 0, 0, 0, 1]

Problems:

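  • No relationship encoded: "king" and "queen" look as different as "cat" and "apple"
  • One dimension per vocabulary word, so real vocabularies need vectors with hundreds of thousands or millions of dimensions, which is impractical
  • No semantic similarity captured: every pair of distinct words is exactly equally far apart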
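
The problems listed above can be checked directly. The sketch below is a minimal illustration using the five-word vocabulary from this section; the helper names `oneHot` and `dot` are hypothetical. It shows that the dot product between any two different one-hot vectors is 0, so "king" is no closer to "queen" than to "apple".

```javascript
// Tiny vocabulary taken from the example above
const vocabulary = ["cat", "dog", "king", "queen", "apple"];

// Build the one-hot vector for a word: 1 at the word's index, 0 elsewhere
function oneHot(word) {
  const index = vocabulary.indexOf(word);
  if (index === -1) throw new Error(`Unknown word: ${word}`);
  return vocabulary.map((_, i) => (i === index ? 1 : 0));
}

// Sum of element-wise products
function dot(a, b) {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

console.log(dot(oneHot("king"), oneHot("queen"))); // 0
console.log(dot(oneHot("king"), oneHot("apple"))); // 0
console.log(dot(oneHot("king"), oneHot("king")));  // 1
```

Because every cross-word score is 0, no ranking by similarity is possible with this representation.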

Step 2: Dense Embeddings (modern, powerful)

A neural network learns to compress meaning into a small dense vector:

text
"king"  → [0.81, 0.22, 0.67, 0.14, 0.55, ...]  (hundreds to thousands of numbers)
"queen" → [0.79, 0.25, 0.65, 0.19, 0.53, ...]  (very close!)
"apple" → [0.12, 0.91, 0.03, 0.88, 0.11, ...]  (far away)

The famous word2vec analogy holds approximately: King − Man + Woman ≈ Queen
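
The analogy is plain vector arithmetic followed by a nearest-neighbour lookup. The sketch below uses hand-picked 3-dimensional toy vectors, not the output of any real model, and the axis meanings in the comment are assumptions made only to keep the example readable. The names `toy`, `cosine`, and `nearest` are hypothetical.

```javascript
// Toy vectors, hand-picked for illustration (NOT from a real model).
// Assumed rough meaning of each axis: [royalty, maleness, fruitiness]
const toy = {
  king:  [0.9, 0.9, 0.0],
  queen: [0.9, 0.1, 0.0],
  man:   [0.1, 0.9, 0.0],
  woman: [0.1, 0.1, 0.0],
  apple: [0.0, 0.5, 0.9],
};

function cosine(a, b) {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const mag = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
}

// king - man + woman, computed element by element
const target = toy.king.map((k, i) => k - toy.man[i] + toy.woman[i]);

// Most similar word to `vec`, skipping the words used to build it
function nearest(vec, exclude) {
  return Object.keys(toy)
    .filter((word) => !exclude.includes(word))
    .map((word) => ({ word, score: cosine(vec, toy[word]) }))
    .sort((x, y) => y.score - x.score)[0].word;
}

console.log(nearest(target, ["king", "man", "woman"])); // queen
```

Excluding the input words is the usual convention for this test, since the result vector often stays closest to one of them.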


Geometry of Embeddings

Embeddings are easiest to picture as points in 2D or 3D space, with related items clustered together. In reality, embeddings live in hundreds to thousands of dimensions; the geometry is the same, just much richer.


⚙️ How Embeddings Are Created

| Model | Provider | Dimensions | Best For |
|---|---|---|---|
| text-embedding-ada-002 | OpenAI | 1536 | General text, RAG |
| text-embedding-3-small | OpenAI | 1536 | Cost-efficient text |
| text-embedding-3-large | OpenAI | 3072 | High-accuracy text |
| text-embedding-gecko | Google | 768 | Multilingual text |
| BERT / RoBERTa | HuggingFace | 768 | Open-source NLP |
| CLIP | OpenAI | 512 | Text + Image (multimodal) |
| Whisper embeddings | OpenAI | 1280 | Audio → vector |
| all-MiniLM-L6-v2 | SBERT | 384 | Fast, local, sentence similarity |

Similarity Metrics: How to Measure "Closeness"

Two vectors can be compared using different distance formulas depending on the use case.

1. Cosine Similarity

Measures the angle between two vectors. Ignores magnitude, cares about direction.

Range: -1 (opposite direction) → 0 (unrelated) → 1 (same direction)

javascript
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, ai) => sum + ai * ai, 0));
  const magB = Math.sqrt(b.reduce((sum, bi) => sum + bi * bi, 0));
  return dot / (magA * magB);
}

const v1 = [0.21, 0.87, 0.43];
const v2 = [0.22, 0.85, 0.44];
const v3 = [0.91, 0.03, 0.72];

console.log(cosineSimilarity(v1, v2).toFixed(4)); // 0.9998 (very similar!)
console.log(cosineSimilarity(v1, v3).toFixed(4)); // 0.4571 (much less similar)

Best for: Text similarity, semantic search, RAG
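
The three landmark values of the range can be reproduced with hand-picked 2D vectors. This is a small sketch; the function name `cosine` is hypothetical and the vectors are chosen only so the results come out exact.

```javascript
function cosine(a, b) {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const mag = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (mag(a) * mag(b));
}

console.log(cosine([1, 0], [3, 0]));  // 1  (same direction, different length)
console.log(cosine([1, 0], [0, 2]));  // 0  (perpendicular)
console.log(cosine([1, 0], [-2, 0])); // -1 (opposite direction)
```

The first line is the "ignores magnitude" property in action: tripling the length of a vector does not change the score. Note that a zero vector has no direction, so this function returns NaN for it.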

2. Euclidean Distance

Measures the straight-line distance between two points in space.

Lower = more similar

javascript
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + Math.pow(ai - b[i], 2), 0));
}

console.log(euclideanDistance(v1, v2).toFixed(4)); // 0.0245 (very close!)
console.log(euclideanDistance(v1, v3).toFixed(4)); // 1.1312 (far apart)

Best for: Image embeddings, spatial data, clustering

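3. Dot Product

The sum of the element-wise products of two vectors. Unlike cosine similarity, it is sensitive to magnitude: longer vectors produce larger scores.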

javascript
function dotProduct(a, b) {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

Best for: Recommendation systems and fast ranking. When vectors are normalized to unit length, the dot product equals cosine similarity.
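
The role of normalization can be seen in a short sketch. It assumes nothing beyond the three metrics defined in this section; the helper names (`normalize`, `dot`, `cosine`, `euclidean`) are hypothetical. For unit-length vectors the dot product matches the cosine, and the squared Euclidean distance equals 2 − 2 × cosine, so all three metrics produce the same ranking.

```javascript
// Scale a vector to length 1
function normalize(v) {
  const mag = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map((x) => x / mag);
}

function dot(a, b) {
  return a.reduce((s, ai, i) => s + ai * b[i], 0);
}

function cosine(a, b) {
  return dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
}

function euclidean(a, b) {
  return Math.sqrt(a.reduce((s, ai, i) => s + (ai - b[i]) ** 2, 0));
}

const a = [3, 4];
const b = [4, 3];
const ua = normalize(a); // [0.6, 0.8]
const ub = normalize(b); // [0.8, 0.6]

console.log(dot(a, b));    // 24 (raw score grows with vector length)
console.log(cosine(a, b)); // 0.96
console.log(dot(ua, ub));  // approximately 0.96, same as the cosine

// For unit vectors: distance squared = 2 - 2 * cosine (both approximately 0.08)
console.log(euclidean(ua, ub) ** 2, 2 - 2 * dot(ua, ub));
```

This is why many vector stores recommend normalizing once at write time and then using the cheaper dot product at query time.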

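Comparison

| Metric | Range | More similar when | Sensitive to magnitude? | Best for |
|---|---|---|---|---|
| Cosine similarity | -1 to 1 | Higher | No | Text similarity, semantic search, RAG |
| Euclidean distance | 0 to ∞ | Lower | Yes | Image embeddings, spatial data, clustering |
| Dot product | Unbounded | Higher | Yes | Recommendation systems |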


🔬 Anatomy of a Vector Embedding

Each number in the vector represents an abstract learned feature. Dimensions don't have human-readable labels; the neural network learns them during training.

Hypothetically, dimension 42 might relate to "royalty" and dimension 107 to "gender", but we can't know for sure. In practice a concept is usually spread across many dimensions rather than stored in a single one.


💻 Generating Embeddings: Code Examples

Text Embedding (OpenAI)

javascript
// npm install openai
// Run as an ES module (.mjs or "type": "module") so top-level await works
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Embed a single string and return its vector
async function embedText(text) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return res.data[0].embedding; // number[] with 1536 values
}

const v1 = await embedText("I love machine learning");
const v2 = await embedText("I enjoy deep learning");
const v3 = await embedText("The stock market crashed today");

// v1 and v2 will be close; v3 will be far from both

Batch Embedding (Multiple Texts at Once)

javascript
async function embedBatch(texts) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts, // Pass the array directly: one request for all texts
  });
  return res.data.map((d) => d.embedding);
}

const sentences = [
  "How do I reset my password?",
  "I forgot my login credentials",
  "What is the refund policy?",
  "Can I get my money back?",
];

const vectors = await embedBatch(sentences);
// vectors[0] ≈ vectors[1]  (both about password/login)
// vectors[2] ≈ vectors[3]  (both about refunds)
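
For larger inputs, a single request may exceed the provider's per-request limits, so it can help to split the texts into batches first. The sketch below is one possible approach, not part of the original example: `toBatches` and `embedAll` are hypothetical helpers, the batch size of 100 is an arbitrary assumption (check your provider's documented limits), and `embedFn` stands for any async function that maps an array of strings to an array of vectors, such as `embedBatch` above.

```javascript
// Split an array into consecutive slices of at most `batchSize` items
function toBatches(items, batchSize = 100) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Embed all texts batch by batch, preserving input order.
// embedFn: async (string[]) => number[][]  (assumption: e.g. embedBatch)
async function embedAll(texts, embedFn, batchSize = 100) {
  const vectors = [];
  for (const batch of toBatches(texts, batchSize)) {
    vectors.push(...(await embedFn(batch)));
  }
  return vectors;
}

// Demo with a fake embedder so the example runs without an API key
const fakeEmbed = async (batch) => batch.map((text) => [text.length]);
embedAll(["a", "bb", "ccc"], fakeEmbed, 2).then((vectors) => {
  console.log(vectors.length); // 3
});
```

Batches are sent one after another here to keep the sketch simple; sending them concurrently is faster but makes rate limits easier to hit.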

Local Embedding (No API, Free)

javascript
// npm install @xenova/transformers
import { pipeline } from "@xenova/transformers";

const embedder = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2"
);

async function embedLocal(text) {
  const output = await embedder(text, { pooling: "mean", normalize: true });
  return Array.from(output.data); // 384-dimensional vector
}

const vec = await embedLocal("Hello world");
console.log(vec.length); // 384

🗺️ When to Use Embeddings

✅ Use Embeddings When:

| Scenario | Why Embeddings Help |
|---|---|
| Semantic search | Find results by meaning, not just keywords |
| Chatbot / RAG | Retrieve relevant context for LLM answers |
| Recommendation | Suggest similar products / articles / songs |
| Clustering | Group documents by topic automatically |
| Zero-shot classification | Classify text without labeled training data |
| Duplicate detection | Find near-identical content across a large corpus |
| Cross-language search | Match a Spanish query with English docs |
| Code search | Find a function by describing what it does |
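
Most of the scenarios above reduce to the same operation: embed a query, score it against stored vectors, and keep the best matches. The sketch below shows that core step for semantic search. The document ids and 3-dimensional vectors are made up for illustration (real embeddings would come from a model), and `cosine` and `topK` are hypothetical helper names.

```javascript
// Toy document vectors standing in for real embeddings (values are made up)
const docs = [
  { id: "reset-password", vector: [0.9, 0.1, 0.0] },
  { id: "refund-policy", vector: [0.1, 0.9, 0.1] },
  { id: "shipping-times", vector: [0.0, 0.2, 0.9] },
];

function cosine(a, b) {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const mag = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
}

// Return the k documents most similar to the query vector, best first
function topK(queryVector, documents, k = 2) {
  return documents
    .map((doc) => ({ id: doc.id, score: cosine(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Pretend this is the embedding of "I forgot my login"
const query = [0.8, 0.2, 0.1];
console.log(topK(query, docs).map((r) => r.id));
// [ 'reset-password', 'refund-policy' ]
```

This brute-force scan compares the query with every document, which is fine for small collections; larger ones typically use an approximate nearest-neighbour index instead.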

❌ Don't Use Embeddings When:

  • You only need exact keyword matching (use full-text search)
  • You need structured query filters (price < 100, status = active)
  • You have very limited compute, since embeddings add latency
  • Your dataset is tiny (< 1000 items), where simpler methods work fine


📦 Real-World Example: Duplicate Ticket Detector

Problem

A support team receives thousands of tickets daily. Many are duplicates. Manual review is impractical at that volume.

Solution: Embedding-Based Deduplication

javascript
import OpenAI from "openai"; // ES module, so top-level await works below

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// ── Embed a support ticket ──────────────────────────
async function embed(text) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return res.data[0].embedding;
}

// ── Cosine similarity helper ────────────────────────
function similarity(a, b) {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const mag = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
}

// ── Check if a new ticket is a duplicate ────────────
// Returns the first stored ticket whose similarity reaches the threshold
async function isDuplicate(newTicket, existingTickets, threshold = 0.92) {
  const newVec = await embed(newTicket.text);

  for (const existing of existingTickets) {
    const score = similarity(newVec, existing.vector);
    if (score >= threshold) {
      return {
        isDuplicate: true,
        matchId: existing.id,
        score: score.toFixed(4), // string with 4 decimals, for display
      };
    }
  }
  return { isDuplicate: false };
}

// ── Usage ───────────────────────────────────────────
const existingTickets = [
  {
    id: "T-001",
    text: "I cannot login to my account",
    vector: await embed("I cannot login to my account"),
  },
];

const newTicket = { text: "Unable to sign in to my profile" };
const result = await isDuplicate(newTicket, existingTickets);

console.log(result);
// Illustrative output; real scores depend on the model, so tune the threshold on your own data:
// { isDuplicate: true, matchId: "T-001", score: "0.9541" }
// ✅ Flagged as duplicate: same intent, different words!
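
The detector above stops at the first ticket that clears the threshold, which is not necessarily the closest one. A possible variation, sketched below under the assumption that ticket vectors are already computed, scans every stored ticket and keeps the highest score. `findBestMatch` and `cosine` are hypothetical names, and the 2D vectors are toy values standing in for real embeddings.

```javascript
function cosine(a, b) {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const mag = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
}

// Scan all stored tickets and keep the highest-scoring one above the threshold
function findBestMatch(newVec, storedTickets, threshold = 0.92) {
  let best = null;
  for (const ticket of storedTickets) {
    const score = cosine(newVec, ticket.vector);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { matchId: ticket.id, score };
    }
  }
  return best === null ? { isDuplicate: false } : { isDuplicate: true, ...best };
}

// Toy vectors standing in for real embeddings
const tickets = [
  { id: "T-001", vector: [0.9, 0.3] },
  { id: "T-002", vector: [1.0, 0.1] },
];

console.log(findBestMatch([1.0, 0.1], tickets).matchId); // T-002
console.log(findBestMatch([0.0, 1.0], tickets));         // { isDuplicate: false }
```

In the first call both tickets clear the threshold, but T-002 is returned because it scores higher; the first-match version would have returned T-001.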

🧮 Embedding Dimensions: Trade-offs

| Dimensions | Model Example | Speed | Accuracy | Use Case |
|---|---|---|---|---|
| 384 | MiniLM-L6-v2 | ⚡⚡⚡ | ⭐⭐ | Real-time, edge, mobile |
| 768 | BERT, gecko | ⚡⚡ | ⭐⭐⭐ | General NLP |
| 1536 | ada-002, embed-3-small | ⚡ | ⭐⭐⭐⭐ | Production RAG |
| 3072 | embed-3-large | 🐢 | ⭐⭐⭐⭐⭐ | High-stakes similarity |
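
Part of this trade-off is plain arithmetic: more dimensions mean more bytes to store and more multiplications per comparison. The sketch below estimates raw storage only, assuming each value is stored as a 32-bit float (4 bytes) and ignoring any index or metadata overhead. `storageBytes` is a hypothetical helper.

```javascript
// Raw bytes needed for `count` vectors of `dimensions` values each.
// Assumption: 4 bytes per value (float32), no index or metadata overhead.
function storageBytes(count, dimensions, bytesPerValue = 4) {
  return count * dimensions * bytesPerValue;
}

const oneMillion = 1_000_000;
for (const dims of [384, 768, 1536, 3072]) {
  const gib = storageBytes(oneMillion, dims) / 1024 ** 3;
  console.log(`${dims} dims: ${gib.toFixed(2)} GiB per million vectors`);
}
// 384 dims: 1.43 GiB per million vectors
// 768 dims: 2.86 GiB per million vectors
// 1536 dims: 5.72 GiB per million vectors
// 3072 dims: 11.44 GiB per million vectors
```

Storage and per-comparison cost both scale linearly with the dimension count, so going from 384 to 3072 dimensions costs roughly 8 times more of each.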

Multimodal Embeddings: Same Space, Different Data

One of the most powerful ideas: embed text and images into the same vector space (e.g., OpenAI CLIP).

Use cases:

  • Search images by typing a description
  • Find images most similar to another image
  • Auto-tagging / captioning images

📊 Embedding Quality Checklist


Chunking Strategy for Long Documents

javascript
// Long documents must be split before embedding.
// chunkSize and overlap are counted in words, not model tokens.
function chunkText(text, chunkSize = 500, overlap = 50) {
  // Guard: a step of zero or less would loop forever
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const words = text.split(/\s+/).filter(Boolean); // split on any whitespace
  const chunks = [];

  for (let i = 0; i < words.length; i += chunkSize - overlap) {
    const end = Math.min(i + chunkSize, words.length); // clamp the last chunk
    chunks.push({ text: words.slice(i, end).join(" "), start: i, end });
    if (end === words.length) break; // last chunk reached
  }
  return chunks;
}

// Embed each chunk separately (uses embed() from the ticket example above)
async function embedDocument(fullText) {
  const chunks = chunkText(fullText, 500, 50);
  const results = [];

  for (const chunk of chunks) {
    const vector = await embed(chunk.text);
    results.push({ ...chunk, vector });
  }

  return results; // array of { text, vector, start, end }
}
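
To see what the overlap does, it helps to run the chunker on a tiny input. The sketch below uses a compact, self-contained variant named `chunkWords` (a hypothetical helper that returns only the chunk text) with a chunk size of 4 words and an overlap of 1, so each chunk starts with the last word of the previous one.

```javascript
// Compact word-based chunker, so this example runs on its own
function chunkWords(text, chunkSize, overlap) {
  const step = chunkSize - overlap;
  if (step <= 0) throw new Error("overlap must be smaller than chunkSize");
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += step) {
    chunks.push(words.slice(i, i + chunkSize).join(" "));
    if (i + chunkSize >= words.length) break; // last chunk reached
  }
  return chunks;
}

const sample = "w1 w2 w3 w4 w5 w6 w7 w8 w9 w10";
console.log(chunkWords(sample, 4, 1));
// [ 'w1 w2 w3 w4', 'w4 w5 w6 w7', 'w7 w8 w9 w10' ]
```

The repeated words (w4, w7) are the overlap: a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of embedding a little extra text.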

🆚 Embedding Models Compared

| Criterion | text-embedding-3-small | all-MiniLM-L6-v2 | text-embedding-gecko |
|---|---|---|---|
| Provider | OpenAI (API) | HuggingFace (local) | Google (API) |
| Dimensions | 1536 | 384 | 768 |
| Cost | Paid | Free | Paid |
| Latency | ~200ms | ~5ms (local) | ~150ms |
| Multilingual | ✅ Yes | ⚠️ Partial | ✅ Yes |
| Best For | Production RAG | Local / Edge | Google Cloud stack |

✅ Checklist Before Moving On

  • [ ] I can explain what a vector embedding is in plain English
  • [ ] I understand why embeddings capture meaning, not just words
  • [ ] I know the difference between cosine similarity, Euclidean distance, and dot product
  • [ ] I can generate embeddings using the OpenAI API in JavaScript
  • [ ] I understand the chunking strategy for long documents
  • [ ] I know which embedding model to pick for different scenarios
  • [ ] I understand multimodal embeddings (CLIP)


➡️ Next: ANN & HNSW Index, then Vector Databases

Released under the ISC License.