What are vector databases?
Vector databases store embeddings and enable similarity search. They power semantic search, recommendations, and retrieval-augmented generation.
How do you search by meaning instead of keywords?
Traditional search matches keywords. Search for "car" and you get documents containing "car," but you miss documents about "automobile" or "vehicle" even though they're relevant.
Semantic search matches meaning. It finds documents that are conceptually similar, regardless of exact wording. The secret: convert text to embeddings (vectors of numbers) and search for nearby vectors.
This requires storing and searching millions of vectors efficiently. That's what vector databases do.
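As a toy illustration (the 2D vectors below are made up; real embeddings have hundreds or thousands of dimensions and come from a trained model), words with related meanings sit near each other in vector space:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 2D "embeddings", for illustration only.
embeddings = {
    "car":        [0.9, 0.1],
    "automobile": [0.85, 0.15],
    "banana":     [0.1, 0.9],
}

print(cosine_similarity(embeddings["car"], embeddings["automobile"]))  # ~0.998
print(cosine_similarity(embeddings["car"], embeddings["banana"]))      # ~0.22
```

"car" and "automobile" score near 1.0 despite sharing no letters, which is exactly what keyword matching cannot do.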
How vector search works
- Index time: Convert your documents to embeddings. Store them in the database with their vectors.
- Query time: Convert the search query to an embedding using the same model.
- Search: Find vectors in the database closest to the query vector.
- Return: Return the documents associated with those vectors.
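The four steps above can be sketched end to end. The `embed` function here is a toy bag-of-words stand-in for a real embedding model, and the "database" is a plain list; both are illustrative assumptions:

```python
import math
from collections import Counter

VOCAB = ["engine", "wheel", "fruit", "peel", "road"]

def embed(text):
    """Toy stand-in for an embedding model: word counts over a tiny
    vocabulary. A real system would call a neural embedding model here."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Index time: embed each document and store (vector, document) pairs.
documents = [
    "engine and wheel on the road",
    "peel the fruit",
]
index = [(embed(doc), doc) for doc in documents]

# Query time: embed the query with the same model, then return the
# documents whose stored vectors are closest to the query vector.
def search(query, k=1):
    q = embed(query)
    ranked = sorted(index, key=lambda pair: euclidean(pair[0], q))
    return [doc for _, doc in ranked[:k]]

print(search("wheel and engine"))  # ["engine and wheel on the road"]
```

The crucial detail is that index time and query time use the *same* `embed` function; mixing embedding models produces vectors that aren't comparable.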
"Closest" is typically measured by cosine similarity or Euclidean distance. Vectors pointing in similar directions (cosine) or near each other (Euclidean) represent similar meanings.
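The difference between the two metrics shows up when vectors point the same way but differ in magnitude (the vectors below are arbitrary examples):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

v = [1.0, 2.0, 3.0]
w = [2.0, 4.0, 6.0]  # same direction as v, twice the magnitude

print(cosine_similarity(v, w))   # 1.0: direction identical, magnitude ignored
print(euclidean_distance(v, w))  # ~3.74: magnitude difference still counts
```

In practice many systems normalize embeddings to unit length, at which point ranking by cosine similarity and ranking by Euclidean distance give the same order.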
Why not just compare every vector?
If you have a million documents, comparing a query to all million vectors is slow. Vector databases use clever indexing to make this fast.
Approximate nearest neighbor (ANN) algorithms trade perfect accuracy for speed. They find vectors that are probably closest, very quickly. For most applications, approximate is fine.
Popular algorithms include:
- HNSW (Hierarchical Navigable Small World): Graph-based, very fast
- IVF (Inverted File Index): Clusters vectors, searches relevant clusters
- PQ (Product Quantization): Compresses vectors for memory efficiency
These techniques enable searching billions of vectors in milliseconds.
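A minimal sketch of the IVF idea, with the caveat that real implementations learn centroids with k-means and tune how many buckets to probe; here the centroids and data are fixed toy values:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVFIndex:
    """Minimal inverted-file (IVF) index: vectors are bucketed by their
    nearest centroid, and a query scans only the closest bucket(s)."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = {i: [] for i in range(len(centroids))}

    def _nearest_centroid(self, vec):
        return min(range(len(self.centroids)),
                   key=lambda i: euclidean(self.centroids[i], vec))

    def add(self, vec, payload):
        self.buckets[self._nearest_centroid(vec)].append((vec, payload))

    def search(self, query, k=1, nprobe=1):
        # Scan only the nprobe buckets nearest the query instead of every
        # stored vector: this is the speed-for-accuracy trade-off.
        probe = sorted(range(len(self.centroids)),
                       key=lambda i: euclidean(self.centroids[i], query))[:nprobe]
        candidates = [item for i in probe for item in self.buckets[i]]
        candidates.sort(key=lambda item: euclidean(item[0], query))
        return [payload for _, payload in candidates[:k]]

# Fixed centroids for illustration; real IVF learns them from the data.
index = ToyIVFIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add([0.5, 0.2], "doc A")
index.add([9.8, 10.1], "doc B")
print(index.search([10.0, 9.9]))  # ["doc B"]
```

The approximation is visible in the structure: a vector sitting near a bucket boundary can be missed when `nprobe` is small, which is why production systems expose it as a tunable recall knob.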
Powering RAG and AI applications
Vector databases are the backbone of Retrieval-Augmented Generation (RAG): a pattern where LLMs retrieve relevant information before generating responses. When you ask an AI about your documents, a vector database finds the relevant chunks to include in the prompt.
This is what enables AI to answer questions about content it was never trained on: your company docs, your codebase, your personal notes. The vector database provides the retrieval; the LLM provides the reasoning.
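The retrieval half of that pattern can be sketched as follows. The `embed` and `retrieve` functions and the in-memory chunk list are illustrative stand-ins; a real system would query a vector database and call an embedding model:

```python
def embed(text):
    # Hypothetical toy embedding: counts of a couple of marker words,
    # standing in for a real embedding model.
    words = text.lower().split()
    return [words.count("refund"), words.count("shipping")]

def retrieve(chunks, query, k=1):
    """Return the k chunks whose toy vectors best match the query vector."""
    q = embed(query)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sorted(chunks, key=lambda c: dot(embed(c), q), reverse=True)[:k]

chunks = [
    "Refund policy: refund requests are accepted within 30 days.",
    "Shipping info: shipping takes 3 to 5 business days.",
]

question = "Can I get a refund"
context = "\n".join(retrieve(chunks, question, k=1))

# The retrieved chunk is spliced into the prompt sent to the LLM.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The LLM never needed the refund policy in its training data; retrieval put it in the prompt at query time.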
→ How RAG fits into AI applications
Beyond RAG: other applications
Vector databases enable many applications:
- Recommendations: Find similar products, articles, users
- Deduplication: Identify near-duplicate content
- Clustering: Group similar items automatically
- Anomaly detection: Find outliers far from typical vectors
- Image search: Embed images, search by visual similarity
- Code search: Find semantically similar code snippets
Any domain where "similarity" matters can benefit from vector search.
Limitations
Vector search isn't perfect:
- Embedding quality matters: Garbage embeddings mean garbage search
- Context loss: Chunking documents loses some context
- Semantic vs lexical: Sometimes exact keyword match is what you need
- Cold start: New content needs embedding before it's searchable
- Maintenance: Embeddings may need regenerating when models update
Vector databases are powerful but not a complete replacement for traditional search. Often, hybrid approaches combining keyword and semantic search work best.
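One simple way to sketch such a hybrid, assuming precomputed embeddings and an arbitrary illustrative blend weight (production systems tune this, or use rank-fusion schemes instead of a weighted sum):

```python
import math

def keyword_score(query, doc):
    """Fraction of query terms that appear verbatim in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def hybrid_score(query, doc, q_vec, d_vec, alpha=0.5):
    # alpha blends the lexical and semantic signals; 0.5 is an
    # arbitrary illustrative weight, tuned per application in practice.
    return alpha * keyword_score(query, doc) + (1 - alpha) * cosine(q_vec, d_vec)

# Made-up document vectors for illustration.
docs = [
    ("error code 500 on checkout", [0.9, 0.1]),
    ("payment fails at checkout",  [0.95, 0.05]),
]
query, q_vec = "checkout error 500", [1.0, 0.0]

ranked = sorted(docs, key=lambda d: hybrid_score(query, d[0], q_vec, d[1]),
                reverse=True)
print(ranked[0][0])  # "error code 500 on checkout"
```

Here both documents are semantically close to the query, but the exact-match term "500" lets the keyword component break the tie, which is precisely the case where pure semantic search falls short.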