What are vector databases?
Vector databases store embeddings and enable similarity search. They power semantic search, recommendations, and retrieval-augmented generation.
How do you search by meaning instead of keywords?
Traditional search matches keywords. Search for "car" and you get documents containing "car," but you miss documents about "automobile" or "vehicle" even though they're relevant.
Semantic search matches meaning. It finds documents that are conceptually similar, regardless of exact wording. The secret: convert text to embeddings (vectors of numbers) and search for nearby vectors.
This requires storing and searching millions of vectors efficiently. That's what vector databases do.
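As a toy illustration (the 2D vectors below are made up; real embeddings have hundreds or thousands of dimensions and come from a trained model), words with related meanings sit near each other in vector space:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 2D "embeddings", for illustration only.
embeddings = {
    "car":        [0.9, 0.1],
    "automobile": [0.85, 0.15],
    "banana":     [0.1, 0.9],
}

print(cosine_similarity(embeddings["car"], embeddings["automobile"]))  # ~0.998
print(cosine_similarity(embeddings["car"], embeddings["banana"]))      # ~0.22
```

"car" and "automobile" score near 1.0 despite sharing no letters, which is exactly what keyword matching cannot do.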
How vector search works
- Index time: Convert your documents to embeddings. Store them in the database with their vectors.
- Query time: Convert the search query to an embedding using the same model.
- Search: Find vectors in the database closest to the query vector.
- Return: Return the documents associated with those vectors.
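The four steps above can be sketched end to end. The `embed` function here is a toy bag-of-words stand-in for a real embedding model, and the "database" is a plain list; both are illustrative assumptions:

```python
import math
from collections import Counter

VOCAB = ["engine", "wheel", "fruit", "peel", "road"]

def embed(text):
    """Toy stand-in for an embedding model: word counts over a tiny
    vocabulary. A real system would call a neural embedding model here."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Index time: embed each document and store (vector, document) pairs.
documents = [
    "engine and wheel on the road",
    "peel the fruit",
]
index = [(embed(doc), doc) for doc in documents]

# Query time: embed the query with the same model, then return the
# documents whose stored vectors are closest to the query vector.
def search(query, k=1):
    q = embed(query)
    ranked = sorted(index, key=lambda pair: euclidean(pair[0], q))
    return [doc for _, doc in ranked[:k]]

print(search("wheel and engine"))  # ["engine and wheel on the road"]
```

The crucial detail is that index time and query time use the *same* `embed` function; mixing embedding models produces vectors that aren't comparable.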
"Closest" is typically measured by cosine similarity or Euclidean distance. Vectors pointing in similar directions (cosine) or near each other (Euclidean) represent similar meanings.
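The difference between the two metrics shows up when vectors point the same way but differ in magnitude (the vectors below are arbitrary examples):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

v = [1.0, 2.0, 3.0]
w = [2.0, 4.0, 6.0]  # same direction as v, twice the magnitude

print(cosine_similarity(v, w))   # 1.0: direction identical, magnitude ignored
print(euclidean_distance(v, w))  # ~3.74: magnitude difference still counts
```

In practice many systems normalize embeddings to unit length, at which point ranking by cosine similarity and ranking by Euclidean distance give the same order.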
Why not just compare every vector?
If you have a million documents, comparing a query to all million vectors is slow. Vector databases use clever indexing to make this fast.
Approximate nearest neighbor (ANN) algorithms trade perfect accuracy for speed. They find vectors that are probably closest, very quickly. For most applications, approximate is fine.
Popular algorithms include:
- HNSW (Hierarchical Navigable Small World): Graph-based, very fast
- IVF (Inverted File Index): Clusters vectors, searches relevant clusters
- PQ (Product Quantization): Compresses vectors for memory efficiency
These techniques enable searching billions of vectors in milliseconds.
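A minimal sketch of the IVF idea, with the caveat that real implementations learn centroids with k-means and tune how many buckets to probe; here the centroids and data are fixed toy values:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVFIndex:
    """Minimal inverted-file (IVF) index: vectors are bucketed by their
    nearest centroid, and a query scans only the closest bucket(s)."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = {i: [] for i in range(len(centroids))}

    def _nearest_centroid(self, vec):
        return min(range(len(self.centroids)),
                   key=lambda i: euclidean(self.centroids[i], vec))

    def add(self, vec, payload):
        self.buckets[self._nearest_centroid(vec)].append((vec, payload))

    def search(self, query, k=1, nprobe=1):
        # Scan only the nprobe buckets nearest the query instead of every
        # stored vector: this is the speed-for-accuracy trade-off.
        probe = sorted(range(len(self.centroids)),
                       key=lambda i: euclidean(self.centroids[i], query))[:nprobe]
        candidates = [item for i in probe for item in self.buckets[i]]
        candidates.sort(key=lambda item: euclidean(item[0], query))
        return [payload for _, payload in candidates[:k]]

# Fixed centroids for illustration; real IVF learns them from the data.
index = ToyIVFIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add([0.5, 0.2], "doc A")
index.add([9.8, 10.1], "doc B")
print(index.search([10.0, 9.9]))  # ["doc B"]
```

The approximation is visible in the structure: a vector sitting near a bucket boundary can be missed when `nprobe` is small, which is why production systems expose it as a tunable recall knob.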
Powering RAG and AI applications
Vector databases are the backbone of Retrieval-Augmented Generation (RAG): a pattern where LLMs retrieve relevant information before generating responses. When you ask an AI about your documents, a vector database finds the relevant chunks to include in the prompt.
This is what enables AI to answer questions about content it was never trained on: your company docs, your codebase, your personal notes. The vector database provides the retrieval; the LLM provides the reasoning.
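The retrieval half of that pattern can be sketched as follows. The `embed` and `retrieve` functions and the in-memory chunk list are illustrative stand-ins; a real system would query a vector database and call an embedding model:

```python
def embed(text):
    # Hypothetical toy embedding: counts of a couple of marker words,
    # standing in for a real embedding model.
    words = text.lower().split()
    return [words.count("refund"), words.count("shipping")]

def retrieve(chunks, query, k=1):
    """Return the k chunks whose toy vectors best match the query vector."""
    q = embed(query)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sorted(chunks, key=lambda c: dot(embed(c), q), reverse=True)[:k]

chunks = [
    "Refund policy: refund requests are accepted within 30 days.",
    "Shipping info: shipping takes 3 to 5 business days.",
]

question = "Can I get a refund"
context = "\n".join(retrieve(chunks, question, k=1))

# The retrieved chunk is spliced into the prompt sent to the LLM.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The LLM never needed the refund policy in its training data; retrieval put it in the prompt at query time.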
→ How RAG fits into AI applications
Beyond RAG: other applications
Vector databases enable many applications:
- Recommendations: Find similar products, articles, users
- Deduplication: Identify near-duplicate content
- Clustering: Group similar items automatically
- Anomaly detection: Find outliers far from typical vectors
- Image search: Embed images, search by visual similarity
- Code search: Find semantically similar code snippets
Any domain where "similarity" matters can benefit from vector search.
Limitations
Vector search isn't perfect:
- Embedding quality matters: Garbage embeddings mean garbage search
- Context loss: Chunking documents loses some context
- Semantic vs lexical: Sometimes exact keyword match is what you need
- Cold start: New content needs embedding before it's searchable
- Maintenance: Embeddings may need regenerating when models update
Vector databases are powerful but not a complete replacement for traditional search. Often, hybrid approaches combining keyword and semantic search work best.
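One simple way to sketch such a hybrid, assuming precomputed embeddings and an arbitrary illustrative blend weight (production systems tune this, or use rank-fusion schemes instead of a weighted sum):

```python
import math

def keyword_score(query, doc):
    """Fraction of query terms that appear verbatim in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def hybrid_score(query, doc, q_vec, d_vec, alpha=0.5):
    # alpha blends the lexical and semantic signals; 0.5 is an
    # arbitrary illustrative weight, tuned per application in practice.
    return alpha * keyword_score(query, doc) + (1 - alpha) * cosine(q_vec, d_vec)

# Made-up document vectors for illustration.
docs = [
    ("error code 500 on checkout", [0.9, 0.1]),
    ("payment fails at checkout",  [0.95, 0.05]),
]
query, q_vec = "checkout error 500", [1.0, 0.0]

ranked = sorted(docs, key=lambda d: hybrid_score(query, d[0], q_vec, d[1]),
                reverse=True)
print(ranked[0][0])  # "error code 500 on checkout"
```

Here both documents are semantically close to the query, but the exact-match term "500" lets the keyword component break the tie, which is precisely the case where pure semantic search falls short.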