🔍 Vector Search Deep Dive
Understanding how systems find the "closest" match across different search paradigms.
When we talk about "Vector Search", we're usually referring to finding similar items in a high-dimensional space. However, not all vector searches are the same. Modern search architectures use multiple types of search depending on what needs to be found (exact keywords vs. overall meaning).
1. Dense Vector Search (Semantic Search)
This is the most common type of "Vector Search" associated with modern AI. It uses dense embeddings (arrays with hundreds or thousands of non-zero numbers, e.g., OpenAI embeddings) to capture deep semantic meaning.
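How it works: The entire query is converted into a single vector by the same embedding model that encoded the documents. The database then returns the stored vectors that are closest to it, typically those pointing in the most similar direction (highest cosine similarity). Matches are rarely identical vectors; they are ranked by how close they are.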
Example:
- Query: "Smartphone for taking great pictures"
- Match: "Google Pixel 8 Pro - Excellent Camera" (Matches despite having no shared keywords!)
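The "closest direction" idea can be sketched with plain cosine similarity. This is a minimal illustration, not how any particular database implements it: the 4-dimensional vectors below are made-up stand-ins for real embeddings (which come from a model and have hundreds or thousands of dimensions), and the frying-pan document is a hypothetical distractor.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings; a real system would get these from an embedding model.
query = [0.9, 0.1, 0.8, 0.0]  # stands in for "Smartphone for taking great pictures"
documents = {
    "Google Pixel 8 Pro - Excellent Camera": [0.8, 0.2, 0.9, 0.1],
    "Stainless steel frying pan": [0.0, 0.9, 0.1, 0.8],
}

# Rank documents by similarity to the query, most similar first.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query, documents[title]),
    reverse=True,
)
best_match = ranked[0]
print(best_match)
```

Note that the ranking never looks at the words themselves, only at the vectors, which is why a match with no shared keywords is possible.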
2. Sparse Vector Search (Keyword Search)
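Long before LLMs, search engines (like Elasticsearch) ranked results with sparse vectors, and they still do. A sparse vector has one dimension for every term in the vocabulary (often hundreds of thousands or more), but almost all of them are 0 except for the terms that actually appear in the text. The non-zero values are weights produced by a scoring scheme such as TF-IDF or BM25.
How it works: It matches exact terms and weights them by how often they appear in a document and how rare they are across the collection. It does NOT understand meaning.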
Example:
- Query: "Error 502 Bad Gateway"
- Match: A server log file containing the exact string "Error 502 Bad Gateway". (A dense search might mistakenly return "Error 404 Not Found" because it's semantically a similar concept: "web error").
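As a rough sketch of the idea, a sparse vector can be stored as a dictionary that holds only the non-zero terms. The code below uses a simplified, smoothed TF-IDF weighting and an unnormalized dot product; it is not BM25 and not what Elasticsearch does internally. The three-line corpus and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (real engines use richer analyzers)."""
    return text.lower().split()

def build_idf(corpus):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + df)) + 1.0 for term, df in doc_freq.items()}

def sparse_vector(text, idf):
    """Map term -> TF-IDF weight; terms that do not occur are simply absent (implicitly 0)."""
    counts = Counter(tokenize(text))
    return {term: tf * idf[term] for term, tf in counts.items() if term in idf}

def sparse_dot(a, b):
    """Dot product that only touches terms present in both vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[term] for term, weight in a.items() if term in b)

corpus = [
    "error 502 bad gateway",
    "error 404 not found",
    "server started on port 8080",
]
idf = build_idf(corpus)
doc_vectors = [sparse_vector(doc, idf) for doc in corpus]
query_vector = sparse_vector("Error 502 Bad Gateway", idf)

scores = [sparse_dot(query_vector, vec) for vec in doc_vectors]
best_index = max(range(len(corpus)), key=scores.__getitem__)
print(corpus[best_index])
```

The "Error 404" line still gets a small score because it shares the term "error", but it cannot outrank the line that matches every query term.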
3. Hybrid Search (Dense + Sparse)
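Widely treated as the default for modern production search. Dense search is bad at exact matches (like serial numbers, names, or specific error codes). Sparse search is bad at synonyms and concepts. Hybrid Search runs both and merges the two result lists into a single ranking.
How it works: It runs a Dense Search and a Sparse Search for the same query (often in parallel), then fuses the results. A common fusion method is Reciprocal Rank Fusion (RRF), which combines the rank positions of each document rather than the raw scores, so the two searches do not need comparable score scales. Weighted combinations of normalized scores are another option.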
Example:
- Query: "How to fix Error 502 in Nginx"
- Sparse (Keyword) finds exact matches for "Error 502" and "Nginx".
- Dense (Semantic) understands the intent is "troubleshooting a web server proxy issue".
- Hybrid combines them, so a troubleshooting guide that ranks well in both lists rises to the top.
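The fusion step can be sketched in a few lines. In RRF each document receives 1 / (k + rank) from every list it appears in, and the contributions are summed. The constant k = 60 is a commonly used default rather than a requirement, and the document IDs below are hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. Ties are broken alphabetically by ID.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results for "How to fix Error 502 in Nginx".
sparse_results = ["nginx-502-guide", "error-502-log", "nginx-install"]
dense_results = ["proxy-troubleshooting", "nginx-502-guide", "error-404-faq"]

fused = reciprocal_rank_fusion([sparse_results, dense_results])
print(fused)
```

The guide that appears in both lists wins even though the dense search did not rank it first, because it collects a contribution from each list.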
4. Multi-Modal Vector Search
Vectors aren't just for text. Images and text can be embedded into one shared vector space using models like CLIP (Contrastive Language-Image Pretraining), and similar models exist for audio and video.
How it works: The AI model is trained so that a photo of a dog and the text sentence "A picture of a dog" land close together in the vector space, while unrelated image-text pairs land far apart. The vectors are near each other, not identical.
Example Types:
- Text-to-Image: Type "red dress", find photos of red dresses.
- Image-to-Image: Upload a photo of a chair, find visually similar chairs in the catalog.
The Algorithms: How it actually searches
Once you have your vectors, how does the database quickly search through a billion of them? It uses one of two methods:
1. k-Nearest Neighbors (kNN) - Exact Search
- How it works: Computes the distance or similarity (e.g., cosine similarity) between the query vector and every single vector in the database, then keeps the k best.
- Pros: Exact. You will always find the true nearest neighbors under the chosen metric.
- Cons: Every query scans the whole dataset, so cost grows linearly with the number of vectors: O(N) comparisons per query. This becomes slow and expensive at scale.
- When to use: Small datasets (as a rough rule of thumb, < 100k vectors).
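A brute-force search is short enough to sketch directly. This is an illustrative version using cosine distance (1 minus cosine similarity) and toy 2-dimensional vectors; the function names are assumptions, and a real system would use vectorized math rather than Python loops.

```python
import heapq
import math

def cosine_distance(a, b):
    """1 - cosine similarity, so smaller means closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def knn_exact(query, vectors, k):
    """Return the k closest (distance, index) pairs by scanning every vector: O(N) per query."""
    distances = ((cosine_distance(query, vec), i) for i, vec in enumerate(vectors))
    return heapq.nsmallest(k, distances)

# Toy 2-dimensional vectors for illustration.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
neighbors = knn_exact([1.0, 0.05], vectors, k=2)
nearest_indices = [index for _, index in neighbors]
print(nearest_indices)
```

Because every vector is compared, the result cannot miss a closer neighbor; the cost is that the loop body runs once per stored vector on every query.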
2. Approximate Nearest Neighbors (ANN) - Fast Search
- How it works: Builds an index ahead of time, for example a layered graph in HNSW (Hierarchical Navigable Small World), and at query time visits only the small subset of vectors that are "likely" to be close.
- Pros: Very fast. Graph indexes like HNSW scale roughly logarithmically with dataset size in practice, which lets well-tuned systems search very large collections in milliseconds.
- Cons: Returns an approximate result. It might miss the absolute closest vector (e.g., 98% recall instead of 100%), and the index costs extra memory and build time.
- When to use: Production systems, large datasets, real-time search.
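The "search only a likely subset" idea can be shown with a much simpler index than HNSW. The sketch below is explicitly not HNSW: it is a toy bucket (inverted-file style) index that assigns each vector to its nearest of a few randomly chosen centers and probes only the closest buckets at query time. The class name, parameters, and random data are all assumptions for illustration.

```python
import math
import random

class BucketIndex:
    """Toy IVF-style index: group vectors by nearest center, search only a few groups."""

    def __init__(self, vectors, n_buckets, seed=0):
        rng = random.Random(seed)
        self.vectors = vectors
        # Use random data points as centers (a real index would typically run k-means).
        self.centers = rng.sample(vectors, n_buckets)
        self.buckets = [[] for _ in range(n_buckets)]
        for i, vec in enumerate(vectors):
            nearest = min(range(n_buckets), key=lambda c: math.dist(vec, self.centers[c]))
            self.buckets[nearest].append(i)

    def search(self, query, k, n_probe):
        """Return up to k indices, looking only inside the n_probe closest buckets."""
        order = sorted(range(len(self.centers)), key=lambda c: math.dist(query, self.centers[c]))
        candidates = [i for c in order[:n_probe] for i in self.buckets[c]]
        candidates.sort(key=lambda i: math.dist(query, self.vectors[i]))
        return candidates[:k]

rng = random.Random(42)
data = [[rng.random(), rng.random()] for _ in range(1000)]
index = BucketIndex(data, n_buckets=16)

query = [0.5, 0.5]
approx = index.search(query, k=5, n_probe=2)   # scans only 2 of 16 buckets
exact = index.search(query, k=5, n_probe=16)   # probing every bucket is an exact scan
recall = len(set(approx) & set(exact)) / len(exact)
print(f"recall@5 = {recall:.2f}")
```

Raising `n_probe` trades speed for recall, which is the same kind of knob that graph-based indexes expose through their own search parameters.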
Summary Cheat Sheet
| Search Type | What it Looks For | Best For | Weakness | Example Tech |
|---|---|---|---|---|
| Dense Search | Meaning & Context | NLP, Chatbots, Intent-based search | Bad at exact names, IDs, acronyms | OpenAI, Pinecone, FAISS |
| Sparse Search | Exact Keywords | Searching logs, specific part numbers | Misses typos, synonyms, intent | BM25, Elasticsearch, Solr |
| Hybrid Search | Both | E-commerce, Enterprise Search | More complex to set up/tune | Weaviate, Pinecone, Qdrant |
| Multi-Modal | Visual/Audio Meaning | Reverse image search, visual discovery (Pinterest-style) | Requires specialized models (e.g., CLIP) | Qdrant, Weaviate |
