AI Stack Layer · 4 of 8

Vector Databases

Storage and search for embeddings — the high-dimensional vectors that capture the meaning of text, images, and audio. The memory of every RAG system.

Embeddings · ANN search · Semantic similarity · RAG backbone · Layer 4
Quick Facts

At a Glance

Basic Concepts

  • Embedding: a vector (e.g. 1024 floats) representing the meaning of some content.
  • Similarity: how close two vectors are, measured with cosine similarity, dot product, or Euclidean distance.
  • ANN (Approximate Nearest Neighbor): fast similarity search across millions of vectors.
  • Index: a data structure (HNSW, IVF, ScaNN) that makes search sub-linear.
  • Hybrid search combines vector similarity with keyword (BM25) matching for the best of both.
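A toy sketch of the similarity metrics above, using tiny hand-made 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity: the angle between two vectors, ignoring magnitude."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_dist(a, b):
    """L2 distance: smaller means more similar."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))

# Toy "embeddings" — illustrative values only.
cat    = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.0]
car    = [0.0, 0.1, 0.9]

print(cosine_sim(cat, kitten))  # near 1.0 → semantically close
print(cosine_sim(cat, car))     # near 0.0 → unrelated
```

Either metric gives the same ordering here; in practice most embedding models are tuned for cosine or dot-product similarity.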
Landscape

The Major Options

| Database | Type | Why pick it |
| --- | --- | --- |
| Pinecone | Managed SaaS | Easiest production setup; auto-scaling, no ops. |
| Weaviate | Open-source / cloud | Built-in vectorizer modules; GraphQL API. |
| Qdrant | Open-source / cloud | Rust-built; fast filtering + payloads. |
| Milvus / Zilliz | Open-source / cloud | Battle-tested at billion-vector scale. |
| Chroma | Open-source / lite | Local-first; favorite for prototyping. |
| pgvector (Postgres) | Extension | Reuse your existing SQL DB; transactions + vectors. |
| Elasticsearch / OpenSearch | Search engine + vectors | Hybrid search; you may already run it. |
| Redis (RediSearch) | In-memory + vectors | Sub-millisecond ANN for hot data. |
| MongoDB Atlas Vector Search | Document DB + vectors | Vectors alongside JSON documents. |
| LanceDB | Embedded / Rust | File-based, Parquet-friendly, multi-modal. |
| FAISS | Library (Meta) | The original ANN library; powers many of the above. |
Mechanics

How They Work

Embeddings — From Text to Vectors

An embedding model (OpenAI text-embedding-3, Cohere Embed, Voyage, Nomic) takes a chunk of text and returns an N-dimensional vector. Semantically similar text → vectors that are close to each other.

from openai import OpenAI
client = OpenAI()
v = client.embeddings.create(
    model="text-embedding-3-small",
    input="How do I reset my password?",
).data[0].embedding   # list of 1536 floats
Indexing Algorithms
  • HNSW (Hierarchical Navigable Small World) — graph-based, very fast, the modern default.
  • IVF (Inverted File) — partition vectors into clusters, search only relevant ones.
  • PQ (Product Quantization) — compress vectors for memory savings.
  • ScaNN — Google's algorithm; powers Vertex AI Vector Search.
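To make the IVF idea concrete, here is a deliberately simplified sketch in NumPy: assign vectors to clusters, then search only the few clusters nearest the query. (A real IVF index trains centroids with k-means; this toy version just samples them, and all data is random.)

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 8)).astype(np.float32)

# "Train": pick k centroids. Real IVF runs k-means; here we sample at random.
k = 10
centroids = vectors[rng.choice(len(vectors), k, replace=False)]

# "Index": assign each vector to its nearest centroid (the inverted lists).
assignments = np.argmin(
    np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2), axis=1
)

def ivf_search(query, nprobe=2, top_k=3):
    """Scan only the nprobe clusters closest to the query, not all vectors."""
    probe = np.argsort(np.linalg.norm(centroids - query, axis=1))[:nprobe]
    candidate_ids = np.where(np.isin(assignments, probe))[0]
    dists = np.linalg.norm(vectors[candidate_ids] - query, axis=1)
    return candidate_ids[np.argsort(dists)[:top_k]]

query = vectors[42] + 0.01  # a query very near a known vector
print(ivf_search(query))
```

The `nprobe` knob is the classic IVF trade-off: more clusters probed means better recall but slower search.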
Filtering & Metadata

You rarely just want "similar things" — you want similar things belonging to user X, in the past 30 days, of type Y. Modern vector DBs support metadata filters that prune the search space before (or during) the ANN search.
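A minimal illustration of pre-filtering, using an in-memory list of documents with metadata payloads (the field names and data here are invented for the example; real vector DBs apply the same idea inside the index):

```python
import numpy as np

# Toy corpus: each entry carries a vector plus metadata, like a DB payload.
docs = [
    {"vec": np.array([0.9, 0.1]), "user": "alice", "type": "note"},
    {"vec": np.array([0.8, 0.2]), "user": "bob",   "type": "note"},
    {"vec": np.array([0.1, 0.9]), "user": "alice", "type": "email"},
]

def filtered_search(query, top_k=2, **filters):
    """Prune on metadata first, then rank survivors by cosine similarity."""
    candidates = [
        d for d in docs
        if all(d.get(key) == value for key, value in filters.items())
    ]
    query = np.asarray(query, dtype=float)
    candidates.sort(
        key=lambda d: -(d["vec"] @ query)
        / (np.linalg.norm(d["vec"]) * np.linalg.norm(query))
    )
    return candidates[:top_k]

# Only alice's documents are considered, however similar bob's may be.
hits = filtered_search([1.0, 0.0], user="alice")
print([h["type"] for h in hits])
```

Filtering before (rather than after) the ANN search matters: post-filtering a top-N list can return fewer results than requested once the filter discards most of them.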

Hybrid Search & Re-ranking
  • Hybrid: combine BM25 keyword scores with vector similarity (better recall on rare terms / IDs).
  • Re-rankers: a second-pass cross-encoder (Cohere Rerank, Voyage Rerank) that re-orders the top-N for accuracy.
  • Reciprocal Rank Fusion (RRF) blends multiple result lists.
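RRF is simple enough to sketch in a few lines: each list contributes 1 / (k + rank) per document, with k = 60 as the commonly used constant (the document IDs below are made up):

```python
def rrf(result_lists, k=60, top_n=5):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

keyword_hits = ["doc7", "doc2", "doc9"]   # e.g. a BM25 ranking
vector_hits  = ["doc2", "doc5", "doc7"]   # e.g. an ANN ranking
print(rrf([keyword_hits, vector_hits]))
```

Documents appearing in both lists float to the top, which is exactly why RRF is a popular default for hybrid search: it needs no score normalization across the two systems.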
Choosing One
| Scenario | Pick |
| --- | --- |
| Already on Postgres, < 10M vectors | pgvector |
| Hosted, zero ops | Pinecone, Vertex AI Vector Search |
| Self-host, billions of vectors | Milvus, Qdrant |
| Already running Elasticsearch | Elasticsearch / OpenSearch |
| Local prototyping | Chroma, LanceDB |
| Need it on the edge | SQLite + sqlite-vec, LanceDB |