Skip to main content

Vector databases

Dated content — June 2026

This page names specific tools, models, and prices, which rotate quarterly. The selection logic is durable; the names are a snapshot. Cross-check the Model snapshot for current model names and pricing.

In one line: Storage with a nearest-neighbor index over high-dimensional vectors. Pick mostly on your team's ops appetite and your corpus size — the algorithms are commoditized.

In plain English

A vector database is just a regular database that also knows how to answer "give me the 10 rows whose vector is closest to this vector," fast. The "fast" is the magic — naive scan over a million vectors is 200ms+; with an HNSW or IVF index it's 2ms. Everything else (filters, hybrid search, persistence) is what you'd want from any database. In 2026 the big choice isn't which algorithm — they all use HNSW or a variant — it's "do I want a new service, or can I use the Postgres I already have?"

The major options (2026)

DBHostingScale sweet spotHybrid searchFiltersCost shape
pgvector (Postgres extension)self / Supabase / Neon< 10M vectorswith pg_trgm / FTSexcellent (SQL)Postgres cost
Pineconehosted only1M–1B+yes (sparse-dense)good$$ (pod / serverless)
Weaviateself / hosted1M–100Mexcellent (BM25 + vector)goodOSS / hosted
Qdrantself / hosted1M–100MyesexcellentOSS / cheap hosted
Chromaembedded / hosted< 1M (prod)basicbasicOSS, free embedded
Turbopufferhosted only100M+, many tenantsyesgoodobject-storage cheap
Milvus / Zillizself / hosted100M–10B+yesgoodenterprise
LanceDBembedded + cloud1M–1B (columnar)yesgoodOSS / cloud
Vespaself / hostedextreme scalebest-in-classbest-in-classcomplex to operate
Elasticsearch / OpenSearchself / hosted1M–100Mmature BM25 + vectorexcellentfamiliar to ops

Default pick for most teams

pgvector. If you already run Postgres (and you probably do), turn on the extension, add a column, and you have a vector store. One service to back up, one mental model, SQL filters out of the box. This is the right starting point until you have a measured reason to leave.

When you do leave: Pinecone if you want zero ops and you'll pay for it, Qdrant if you want OSS and you're happy operating one extra service, Turbopuffer if you have many tenants and most data is cold.

When to deviate

  • >10M vectors with hot QPS: pgvector's index build time and recall start to wobble. Move to Qdrant, Weaviate, or Pinecone.
  • Hybrid search is the core feature (not an add-on): Weaviate or Vespa ship it with the cleanest defaults.
  • Multi-tenant SaaS with thousands of small corpora: Turbopuffer is built for exactly this; pgvector and Pinecone both struggle with index-per-tenant economics.
  • Embedded / prototype / no service: Chroma or LanceDB.
  • Already in Elastic for logs / search: just turn on dense-vector and stay in Elasticsearch.
  • Truly massive (billions of vectors, complex ranking): Vespa or Milvus.

Minimum integration

pgvector — the boring, correct default:

CREATE EXTENSION vector;

CREATE TABLE docs (
id bigserial PRIMARY KEY,
text text NOT NULL,
embedding vector(1536),
metadata jsonb
);

CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
# Insert
db.execute("INSERT INTO docs (text, embedding) VALUES (%s, %s)", [text, vec])

# Top-5 nearest neighbors
rows = db.execute(
"SELECT text FROM docs ORDER BY embedding <=> %s LIMIT 5",
[query_vec],
).fetchall()

Qdrant — when you outgrow pgvector:

from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams, PointStruct

client = QdrantClient(host="localhost", port=6333)
client.create_collection("docs", vectors_config=VectorParams(size=1536, distance=Distance.COSINE))

client.upsert("docs", points=[PointStruct(id=1, vector=vec, payload={"text": text})])

hits = client.search("docs", query_vector=query_vec, limit=5)

The 2026 consensus: combine BM25 (keyword) with vector (semantic) and fuse the rankings. Vector search alone misses exact-match cases ("error code E-417"); BM25 alone misses paraphrasing ("how do I reset my password" vs. "rotate credentials"). Most production-grade vector DBs ship hybrid (Weaviate, Qdrant, Pinecone sparse-dense, Vespa); pgvector users typically pair with Postgres FTS or ParadeDB (pg_search).

Then rerank the top-50 with a cross-encoder (Cohere Rerank 3, Voyage Rerank) down to top-5. That's the single biggest RAG quality win for the cost.

Pricing & cost notes (May 2026 ballpark)

DBCost for 1M vectors @ 1536d, modest QPS
pgvector (Supabase / Neon small)$0–$50/mo (your Postgres bill)
Pinecone Serverless~$50–$150/mo
Qdrant Cloud (1 node)~$80/mo
Weaviate Cloud (sandbox)~$25–$100/mo
Turbopuffer~$10–$50/mo (cold data dirt cheap)
Self-hosted Qdrant on a VM$20/mo (just the VM)

At 10M+ vectors the picture flips: hosted Pinecone/Weaviate get expensive, and self-hosted Qdrant or Turbopuffer dominate cost.

Pitfalls

  • Adding a vector DB before you've shipped one RAG feature. pgvector inside the Postgres you already run beats every standalone choice for v0.
  • Wrong distance metric. Cosine for normalized embeddings, L2 for raw. Pick wrong and the rankings are subtly garbage.
  • No HNSW index, querying at full scan. "Why is retrieval 800ms?" Because you forgot CREATE INDEX.
  • Tuning HNSW parameters by vibes. m and ef_construction change build time, recall, and memory non-trivially. Read the docs; benchmark on your data.
  • Storing only the vector, no metadata. You can't filter by tenant, language, or doc type, and you can't show the user what they retrieved. Always store the source text + metadata next to the vector.
  • No re-embedding plan. Six months in, you'll want to swap embedding models. Without an embedding_model_id column, you can't tell which rows are which.
  • Treating Chroma as production. Chroma is a fantastic prototype DB. It is not where you put your million-document corpus.
  • Not measuring recall. A vector DB with 70% recall@10 looks fine in demos and breaks subtly in production. Build a small labeled set and measure.
🤔 Quick checkQuick check

→ Next: Document parsing