Best Vector Databases in 2026

The definitive vector database landscape — 35+ tools across four tiers, head-to-head benchmarks, pricing at three scales, and a decision framework for choosing the right tool for your AI application.


Our Recommendation

For most production RAG workloads, Qdrant offers the best balance of performance, operational simplicity, and developer experience — a single Rust binary with zero dependencies. If you already run PostgreSQL and have under 50M vectors, start with pgvector — the performance gap has narrowed dramatically with pgvectorscale's DiskANN indexes. Pinecone remains the easiest fully-managed option if you accept vendor lock-in. Weaviate wins for hybrid search (BM25 + vector). Milvus/Zilliz is the proven choice at billion-scale or when you need GPU acceleration. Turbopuffer is the cost-efficiency leader for multi-tenant workloads at scale. The broader trend: vector search is commoditizing into a feature, not a product category.

Comparison at a Glance

| Tool | Pricing | Starting Price | Free Tier | Open Source | Self-Hosted | Cloud Hosted | Maturity | Key Integrations |
|---|---|---|---|---|---|---|---|---|
| Pinecone | freemium | $0 | Yes | No | No | Yes | established | LangChain, LlamaIndex, OpenAI, Cohere |
| Qdrant | freemium | $0 | Yes | Yes | Yes | Yes | growing | LangChain, LlamaIndex, OpenAI, Haystack |
| Weaviate | freemium | $0 | Yes | Yes | Yes | Yes | established | LangChain, LlamaIndex, OpenAI, Cohere |
| Milvus | open-source | $0 | Yes | Yes | Yes | Yes | established | LangChain, LlamaIndex, Haystack, OpenAI |
| Chroma | freemium | $0 | Yes | Yes | Yes | Yes | growing | LangChain, LlamaIndex, OpenAI, Anthropic |
| pgvector | open-source | $0 | Yes | Yes | Yes | Yes | established | PostgreSQL, LangChain, LlamaIndex, Supabase |
| LanceDB | freemium | $0 | Yes | Yes | Yes | Yes | growing | LangChain, LlamaIndex, Pandas, Polars |
| Turbopuffer | usage-based | $64/mo | No | No | No | Yes | growing | LangChain, LlamaIndex, Python SDK, REST API |
| Vespa | open-source | $0 | Yes | Yes | Yes | Yes | established | LangChain, LlamaIndex, Haystack, Python SDK |
| Upstash Vector | usage-based | $0 | Yes | No | No | Yes | growing | LangChain, LlamaIndex, Vercel, Cloudflare Workers |

The vector database market in 2026

Vector databases have become infrastructure plumbing for AI applications, and the market is rapidly commoditizing. Purpose-built vector databases like Pinecone, Qdrant, Weaviate, and Milvus still lead at scale, but general-purpose databases — PostgreSQL with pgvector, Elasticsearch, Redis, MongoDB — now handle most production workloads under 50 million vectors with competitive performance. The biggest shift of 2025 was the rise of object-storage-first architectures (Turbopuffer) delivering 10–100x cost savings, and pgvector's maturation to the point where it genuinely competes with dedicated solutions for the median use case. Meanwhile, market consolidation is underway: Rockset was acquired by OpenAI and shut down, MyScaleDB folded, Pinecone is reportedly exploring a sale, and IBM acquired DataStax.

The full landscape: 35+ tools across four tiers

**Purpose-built vector databases** (11 tools): Pinecone, Milvus/Zilliz, Qdrant, Weaviate, Chroma, LanceDB, Vespa, Turbopuffer, Marqo, Upstash Vector, and Vald. These systems were designed ground-up for vector similarity search, with architectures optimized for ANN indexing, quantization, and high-throughput retrieval. **General-purpose databases with vector support** (12 tools): pgvector (PostgreSQL), Elasticsearch, OpenSearch, Redis Stack, MongoDB Atlas Vector Search, SingleStore, ClickHouse, MariaDB (11.8 LTS), DataStax Astra DB/Cassandra 5.0, DuckDB vss, Typesense, and Neo4j. Vector search here is an add-on feature — powerful for teams already invested in these ecosystems. **Cloud-native vector search services** (8 tools): Zilliz Cloud, Supabase and Neon (pgvector wrappers), Azure AI Search, Google Vertex AI Vector Search, Amazon OpenSearch Serverless, Cloudflare Vectorize, and Turso/sqlite-vec. **Deprecated or shut down**: Rockset (acquired by OpenAI, June 2024; shut down September 2024), MyScaleDB (ceased operations April 2025), and Annoy (Spotify, maintenance mode).

Head-to-head: the 'big four' purpose-built vector databases

**Pinecone** (Rust, closed-source, fully managed): Zero-ops leader with proprietary serverless architecture. Up to 20,000 dimensions, 20–100ms typical latency, 1,147 QPS on VDBBench. $50/month minimum on paid plans. Complete vendor lock-in but SOC 2/HIPAA/ISO 27001 compliant. $138M raised. **Qdrant** (Rust, Apache 2.0): Single binary, zero external dependencies. Integrated HNSW filtering applies filters during graph traversal — best-in-class for filtered search. Binary quantization for 32x memory reduction. Flat latency curve (22–24ms from k=1 to k=100). ~30K GitHub stars. $50M Series B (March 2026). **Weaviate** (Go, BSD 3-Clause): Best native hybrid search (BM25 + vector via Relative Score Fusion). Built-in vectorization modules. Four index types. Multi-tenancy with hot/warm/cold storage tiers. GraphQL API. ~16K GitHub stars. ~$68M raised. **Milvus** (C++/Go, Apache 2.0): Most feature-rich — 12+ index types, full GPU acceleration (NVIDIA CUDA), 4-layer disaggregated architecture. VDBBench leader at 9,704 QPS / 2.5ms p99 (via Zilliz Cloud). Won BigANN at NeurIPS 2021. ~43K GitHub stars. Tradeoff: requires etcd + MinIO + Pulsar for cluster mode. $113M raised.

Lightweight and Postgres-based options

**Chroma** targets rapid prototyping with a 3-lines-of-code developer experience. Rust core rewrite in 2025 delivered 4x faster performance. ~27,000 GitHub stars, 14M+ monthly pip downloads. Cloud pricing puts a 6M-doc workload at ~$79/month. Best for prototypes and moderate-scale workloads under 10M vectors. **LanceDB** runs embedded on the Lance columnar format — 100x faster random access than Parquet. IVF_PQ indexing scales beyond RAM, automatic dataset versioning with time-travel. Cloud charges ~$6 per million 1536D vectors. Netflix and Second Dinner are production users. ~9.7K GitHub stars. **pgvector** is the biggest story of 2025. Version 0.8.0 fixed overfiltering with iterative scans. pgvectorscale adds DiskANN-based indexing with 9x smaller indexes. On 50M embeddings: 471 QPS at 99% recall, beating Pinecone s1 by 16x on throughput at 75% lower cost. Practical ceiling: 10–50M vectors. Available on Supabase, Neon, AWS RDS, AlloyDB, Timescale. ~20.4K GitHub stars.

Emerging tools worth watching

**Turbopuffer** is the breakout of 2025 — built on object storage (S3/GCS) with a three-tier cache hierarchy delivering 10–100x cheaper multi-tenant search. Customers include Cursor, Anthropic, Notion (migrated from Pinecone), Linear, and Atlassian. Handles 2.5T+ documents. $64/month minimum, no free tier. **Vespa** is the most underappreciated tool here. Ranked #1 in GigaOm's 2025 Vector Database Radar, it combines vector search, BM25, tensor operations, and ML inference. Powers Spotify and Yahoo at billions of vectors. Open source (Apache 2.0). Steep learning curve but unmatched for hybrid search + custom ML ranking. **Upstash Vector** offers true pay-per-request pricing ($0.40/100K requests) with a generous free tier. Uses DiskANN for cost-effective disk-based search. Ideal for serverless/edge architectures with zero idle costs.

General-purpose databases with vector support

**Elasticsearch/OpenSearch**: Most mature general-purpose vector implementation. Lucene HNSW with int8/int4/BBQ quantization. Hybrid search (BM25 + vector via RRF fusion) is arguably better than most purpose-built vector DBs for mixed retrieval. VDBBench: OpenSearch at 3,055 QPS, Elastic at 1,925 QPS. **Redis Stack**: Sub-millisecond vector search latency via in-memory HNSW. Best raw latency for sub-10M vector workloads. Tradeoff: memory-intensive (~6GB RAM for 1M × 1536-dim vectors plus 2–3x for HNSW overhead). **MongoDB Atlas Vector Search**: HNSW with scalar/binary quantization in the aggregation pipeline. Pricing included in Atlas cluster costs. Adequate for teams already on MongoDB who want semantic search without new infrastructure. **SingleStore**: Most extensive index selection among general-purpose DBs (FLAT, IVF, HNSW). VLDB 2024 peer-reviewed paper. Claims 47–100x faster than pgvector IVFFlat. Enterprise-priced.
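Reciprocal Rank Fusion, the method Elasticsearch uses to merge BM25 and vector result lists, is simple enough to show in a few lines. This is an illustrative stdlib sketch, not Elastic's implementation; the document IDs are made up and k=60 is the conventional default constant:

```python
def rrf_fuse(ranked_lists, k=60):
    """Merge ranked result lists: score(d) = sum over lists of 1/(k + rank)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["doc_a", "doc_b", "doc_c"]  # keyword ranking
vector_hits = ["doc_c", "doc_a", "doc_d"]  # dense-vector ranking
fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)  # doc_a and doc_c lead: each sits near the top of both lists
```

Note that RRF only looks at rank positions, never raw scores, which is exactly the property Weaviate's Relative Score Fusion trades away in exchange for preserving score magnitudes.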

Performance benchmarks: what to trust

Every vendor publishes favorable benchmarks. Here's what's credible: **ANN-Benchmarks** (ann-benchmarks.com) is the gold standard for algorithm comparison but tests single-CPU, million-scale — it misses network latency, concurrency, filtering, and persistence. **VDBBench** (by Zilliz) tests full systems end-to-end including streaming performance. At ~$1K/month: Zilliz Cloud leads at 9,704 QPS / 2.5ms p99, followed by Milvus (3,465), OpenSearch (3,055), Qdrant (1,242), Pinecone (1,147), Elasticsearch (597). Caveat: Zilliz maintains VDBBench — methodology is open-source but verify independently. **The most important finding**: filtering degrades performance by 20–40%, and databases handle it very differently. Qdrant integrates filters into HNSW traversal. Milvus pre-filters on indexed fields. pgvector caps HNSW at 1000 candidates for filtered queries. Benchmark with your actual filter complexity.
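The pre-filter versus post-filter difference is easy to see with a toy example. This is not a benchmark: it uses exact brute-force search over a handful of invented 2-D vectors, whereas real systems (Qdrant, Milvus, pgvector) apply the filter inside ANN index traversal. But it shows why post-filtering a top-k result can silently return fewer than k matches:

```python
import math

docs = [
    {"id": 1, "vec": (0.9, 0.1), "lang": "en"},
    {"id": 2, "vec": (0.8, 0.2), "lang": "de"},
    {"id": 3, "vec": (0.7, 0.3), "lang": "de"},
    {"id": 4, "vec": (0.1, 0.9), "lang": "en"},
]

def top_k(candidates, query, k):
    """Exact k-nearest-neighbour search by Euclidean distance."""
    return sorted(candidates, key=lambda d: math.dist(d["vec"], query))[:k]

def want(d):
    return d["lang"] == "en"

query = (1.0, 0.0)

# Post-filter: search everything, then filter the top-k. May come up short.
post = [d for d in top_k(docs, query, k=2) if want(d)]
# Pre-filter: restrict candidates first, then search. Returns k if k exist.
pre = top_k([d for d in docs if want(d)], query, k=2)

print([d["id"] for d in post])  # [1]  (doc 2 took the second slot, then got filtered out)
print([d["id"] for d in pre])   # [1, 4]
```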

Cost comparison at three scales

**Small prototype (~100K vectors)**: Free tiers everywhere — Pinecone Starter (2GB), Qdrant Cloud (1GB), Zilliz Cloud (1M vectors), Supabase (500MB), Neon (0.5 GiB). Self-hosted Qdrant on a $5–20/month VPS works too. **Mid-scale production (~10M vectors)**: $25–300/month. Self-hosted pgvector on Supabase Pro ($25/month + compute) is often cheapest. Turbopuffer excels for multi-tenant at ~$64–150/month. Pinecone Serverless $100–300/month. Qdrant Cloud ~$100–250/month. **Enterprise scale (~1B vectors)**: $1,000–10,000+/month. Pinecone Dedicated: $2K–10K+. Zilliz Dedicated: $1K–5K+. Self-hosted Milvus cluster: $1.5K–3K+. pgvector not recommended at this scale. Turbopuffer's object-storage architecture offers potentially transformative savings. The break-even between self-hosting and managed services is roughly 60–80M queries/month or ~100M vectors with high query volume.
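The self-host break-even above can be sanity-checked with back-of-envelope arithmetic. Every number in this sketch is an illustrative assumption, not a quote: a managed service billed per million queries versus a fixed self-hosted cluster plus an ops-time cost:

```python
def managed_cost(queries_per_month, price_per_million=20.0):
    """Assumed managed pricing: a flat rate per million queries."""
    return queries_per_month / 1_000_000 * price_per_million

def self_hosted_cost(infra_per_month=300.0, ops_hours=10, hourly_rate=100.0):
    """Assumed self-hosting: fixed infra plus engineer time spent on ops."""
    return infra_per_month + ops_hours * hourly_rate

for q in (10_000_000, 50_000_000, 80_000_000, 200_000_000):
    m, s = managed_cost(q), self_hosted_cost()
    print(f"{q / 1e6:>5.0f}M queries/mo  managed ${m:>7,.0f}  self-hosted ${s:>7,.0f}")
# With these assumed rates the curves cross around 65M queries/month,
# consistent with the 60-80M range cited above. Plug in your real quotes.
```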

Decision framework

**Already run PostgreSQL, <50M vectors?** → Start with pgvector. The performance gap versus dedicated vector databases has narrowed dramatically. **Need zero operational complexity?** → Pinecone. Unmatched developer experience, but budget for the $50/month minimum and watch for cost escalation. **Want open source with best cost-performance at moderate scale?** → Qdrant (simplest ops, best filtered search) or Weaviate (best hybrid search, multi-tenancy). **Building for billion-scale or need GPU?** → Milvus/Zilliz. Significant operational complexity for self-hosting, but Zilliz Cloud abstracts it away. **Cost efficiency is the primary concern at scale?** → Turbopuffer. 10–100x cheaper for multi-tenant workloads. Validated by Cursor, Notion, Anthropic, Linear. **Already on Elasticsearch, MongoDB, or Redis?** → Use their built-in vector search. For most workloads under tens of millions of vectors, the performance difference is smaller than the operational cost of adding another database.

All Tools in This Roundup

**Pinecone** (established · freemium · free tier): Fully managed vector database built for scale. Best for: zero-ops vector search at scale, serverless vector workloads, teams that don't want to manage infrastructure.

**Qdrant** (growing · open source · freemium · free tier): High-performance open-source vector database written in Rust. Best for: high-performance filtered vector search, single-binary simplicity (no external deps), multi-tenant applications.

**Weaviate** (established · open source · freemium · free tier): Open-source vector database with best-in-class hybrid search. Best for: hybrid search (BM25 + vector with Relative Score Fusion), built-in vectorization at import time, multi-tenant applications with storage tiering.

**Milvus** (established · open source · free tier): Cloud-native vector database for billion-scale enterprise AI. Best for: billion-scale vector search, GPU-accelerated similarity search, maximum index type diversity (12+ options).

**Chroma** (growing · open source · freemium · free tier): The developer-first open-source vector database. Best for: rapid prototyping (3 lines of code to start), local development and testing, Python-native AI/LLM workflows.

**pgvector** (established · open source · free tier): Vector similarity search for PostgreSQL. Best for: adding vectors to existing PostgreSQL, unified SQL + vector queries with JOINs and CTEs, avoiding new infrastructure.

**LanceDB** (growing · open source · freemium · free tier): Serverless embedded vector database built on Lance columnar format. Best for: embedded vector search (no server needed), multimodal data (text, images, video), cost-efficient storage at scale.

**Turbopuffer** (growing · usage-based · no free tier): Object-storage-first vector database with 10–100x cost savings. Best for: multi-tenant vector workloads, cost-efficient large-scale search, teams migrating from Pinecone.

**Vespa** (established · open source · free tier): Hybrid search and ML serving engine for billion-scale applications. Best for: hybrid search + custom ML ranking at scale, billion-vector production deployments, real-time ML model inference alongside search.

**Upstash Vector** (growing · usage-based · free tier): Serverless vector database with true pay-per-request pricing. Best for: serverless and edge architectures, pay-per-request vector search, low-traffic applications with cost sensitivity.

1. Pinecone

Fully managed vector database built for scale

Pinecone is the default choice for teams that want managed vector search without ops overhead. The Rust-based serverless architecture separates compute from storage, scaling to zero when idle — great for spiky workloads. The API is the cleanest in the space: shortest path from empty repo to working RAG system. Performance is strong (sub-33ms p99 at 10M vectors, 99.9% recall; VDBBench shows 1,147 QPS at ~$1K/month). A second-generation serverless infrastructure and Dedicated Read Nodes for billion-scale landed in 2025. Supports up to 20,000 dimensions. The tradeoffs are real: closed source with complete vendor lock-in, a $50/month minimum on paid plans (introduced August 2025), and costs that can jump unpredictably — users report going from $50 to $3,000/month as traffic grows. Read unit costs scale linearly with namespace size. Hybrid search requires client-side sparse vector generation. Includes a DeWitt clause prohibiting independent benchmarking. SOC 2 Type II, HIPAA, ISO 27001 compliant. $138M raised at $750M+ valuation; reportedly exploring a sale.

Pros

  • + Zero infrastructure management — true serverless
  • + Excellent query performance (sub-33ms p99 at 10M vectors)
  • + Scales to zero when idle — cost-effective for spiky workloads
  • + Simplest API in the vector database space, SDKs for 6 languages
  • + Strong ecosystem integrations (only vector DB with Terraform/Pulumi providers)
  • + Namespace-based multi-tenancy with up to 100,000 namespaces per index
  • + SOC 2 Type II, HIPAA, ISO 27001

Cons

  • - Closed source — full vendor lock-in
  • - Read unit costs scale with namespace size (expensive at 100M+ vectors)
  • - $50/month minimum on paid plans
  • - p99 tail latency can spike unpredictably under bursty load
  • - Hybrid search requires client-side sparse vector generation
  • - Cold starts on serverless for infrequently queried indexes
  • - DeWitt clause prohibits independent benchmarking

2. Qdrant

High-performance open-source vector database written in Rust

Qdrant is the performance-focused choice with the simplest operational profile among purpose-built vector DBs — a single Rust binary with zero external dependencies. Its standout feature is integrated HNSW filtering: filters are applied during graph traversal (not before or after), making it best-in-class for workloads with heavy metadata filtering. Supports scalar, product, and binary quantization (32x memory reduction). Flat latency curve regardless of k (22–24ms from k=1 to k=100, vs Weaviate's linear drift). ~30,000 GitHub stars, 25M+ Docker downloads. Cloud free tier gives 1GB RAM forever. The tradeoff: HNSW-only (no IVF or DiskANN), ingestion can degrade query latency on the same node, and less proven at multi-billion scale than Milvus. $50M Series B raised March 2026.
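The 32x figure for binary quantization comes from collapsing each 32-bit float into a single sign bit, after which similarity falls back to Hamming distance on the bit strings. A minimal stdlib sketch of the idea (not Qdrant's implementation, which packs bits into SIMD words; the vectors here are invented):

```python
def quantize(vec):
    """Binary quantization: 1 bit per dimension, set if the component is positive."""
    return sum((1 << i) for i, x in enumerate(vec) if x > 0)

def hamming(a, b):
    """Hamming distance between two bit strings stored as ints."""
    return bin(a ^ b).count("1")

v1 = [0.3, -1.2, 0.8, 0.1]    # sign pattern +,-,+,+
v2 = [0.5, -0.7, 0.9, 0.2]    # similar direction to v1
v3 = [-0.4, 1.1, -0.6, -0.2]  # roughly opposite direction

q1, q2, q3 = quantize(v1), quantize(v2), quantize(v3)
print(hamming(q1, q2))  # 0: identical sign pattern
print(hamming(q1, q3))  # 4: every sign flipped
# 32 bits per float32 -> 1 bit per dimension is the 32x memory reduction.
```

In practice binary-quantized search is used as a fast first pass, with a rescoring step over the original vectors to recover accuracy.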

Pros

  • + Excellent performance (Rust, single binary, zero deps)
  • + Best-in-class filtered search (filters during HNSW traversal)
  • + Binary quantization for 32x memory reduction
  • + Built-in multi-tenancy
  • + Low operational complexity
  • + Embedded mode available for prototyping

Cons

  • - HNSW-only (no IVF or DiskANN alternatives)
  • - Ingestion can degrade query latency on same node
  • - Less proven at multi-billion scale
  • - No built-in vectorization

3. Weaviate

Open-source vector database with best-in-class hybrid search

Weaviate stands out with built-in vectorization modules — you can ingest raw text and images without a separate embedding pipeline. Native hybrid search runs BM25 and dense vector queries in parallel, with Relative Score Fusion that preserves score magnitudes instead of just ordinal ranks. Written in Go with a custom LSM-tree storage engine. Four index types (HNSW, Flat, Dynamic, HFresh) and tenant-aware classes providing hard physical isolation via dedicated shards — far stronger than namespace-based logical separation. Binary Quantization cuts memory costs up to 70%. BlockMax WAND (GA 2025) makes keyword search 10x faster. ~16,000 GitHub stars, 1M+ Docker pulls/month. Flex cloud starts at $45/month. The tradeoff: self-hosted deployments demand real DevOps investment (Kubernetes, sharding, monitoring), HNSW indexes are memory-intensive, and cold-start latency peaks at 1.3s (vs Qdrant's 163ms). ~$68M raised (including $50M Series B, $16M Series A, and seed funding).
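Relative Score Fusion as described above (min-max normalize each result set's scores to [0, 1], then sum) can be sketched in a few lines. The scores and document IDs are invented for illustration; this is the general technique, not Weaviate's exact code:

```python
def normalize(scores):
    """Min-max normalize a {doc: score} mapping into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # guard against all-equal scores
    return {d: (s - lo) / span for d, s in scores.items()}

def relative_score_fusion(*result_sets):
    """Sum normalized scores across result sets, so score magnitude survives."""
    fused = {}
    for scores in result_sets:
        for doc, s in normalize(scores).items():
            fused[doc] = fused.get(doc, 0.0) + s
    return sorted(fused, key=fused.get, reverse=True)

bm25   = {"a": 12.0, "b": 11.5, "c": 2.0}   # a and b nearly tied on keywords
vector = {"c": 0.95, "a": 0.40, "b": 0.39}  # c dominates the vector leg
print(relative_score_fusion(bm25, vector))  # ['a', 'c', 'b']
```

Because b's near-tie with a on BM25 is preserved as a 0.95 rather than flattened to "rank 2", the fusion reflects how close the scores actually were, which rank-only methods like RRF discard.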

Pros

  • + Built-in vectorization — no external embedding pipeline needed
  • + Superior native hybrid search with BM25F and Relative Score Fusion
  • + Open source (BSD 3-Clause) with flexible deployment (cloud, self-hosted, BYOC, embedded)
  • + Hard multi-tenant isolation via dedicated shards per tenant
  • + Binary Quantization cuts memory costs up to 70%
  • + BlockMax WAND makes keyword search 10x faster
  • + Full RBAC, SSO (OIDC), and ACL security model

Cons

  • - Self-hosted deployments require significant DevOps expertise
  • - HNSW indexes are memory-intensive
  • - Cold-start latency peaks at 1.3 seconds
  • - GraphQL/v4 client API has a steeper learning curve than Pinecone
  • - Cloud pricing tiers can be complex to navigate

4. Milvus

Cloud-native vector database for billion-scale enterprise AI

Milvus is the most mature and feature-rich open-source vector database. Its C++/Go core with a 4-layer disaggregated architecture (access → coordinator → worker → storage) enables true billion-scale with independent scaling. The only major vector DB with full GPU acceleration via NVIDIA CUDA (CAGRA, GPU_IVF). Widest index selection: 12+ types including HNSW, IVF_FLAT, IVF_PQ, DiskANN, and SCANN. Won BigANN at NeurIPS 2021. VDBBench shows Zilliz Cloud at 9,704 QPS with 2.5ms p99 — top of the leaderboard. ~43,000 GitHub stars, 300+ contributors. The tradeoff: cluster mode requires etcd + MinIO/S3 + Pulsar/Kafka — described by practitioners as significant operational overhead. Milvus Lite runs in-process via pip for prototyping. $113M raised. Linux Foundation graduated project.

Pros

  • + Handles billion-scale datasets (BigANN NeurIPS winner)
  • + Full GPU acceleration (NVIDIA CUDA)
  • + 12+ index types — most diverse selection available
  • + Zilliz Cloud provides managed experience
  • + Milvus Lite for easy prototyping
  • + Linux Foundation graduated project

Cons

  • - Complex self-hosted architecture (etcd + MinIO + Pulsar)
  • - Heavy resource requirements
  • - Schema modifications require complex migrations
  • - VDBBench is Zilliz-maintained (verify independently)

5. Chroma

The developer-first open-source vector database

Chroma is the SQLite of vector databases — dead simple to start with, runs in-process, zero config. The 3-lines-of-code developer experience is unmatched. A major Rust core rewrite in 2025 delivered 4x faster performance. Built-in embedding generation via OpenAI, Cohere, and Hugging Face. ~27,000 GitHub stars, 14M+ monthly pip downloads, used in 90K+ open-source codebases. Chroma Cloud (GA) offers serverless with usage-based pricing — a 6M doc workload runs roughly $79/month. Self-hosted via Docker runs on a $5–10/month VPS. The tradeoff: HNSW-only indexing, no built-in RBAC, no horizontal scaling in OSS, and a history of breaking API changes during rapid development. $18M raised.

Pros

  • + Simplest setup of any vector DB (3 lines of code)
  • + Runs in-process, no server needed
  • + Rust core rewrite delivers 4x faster performance
  • + Built-in embedding generation
  • + Massive community (27K+ stars, 14M+ pip downloads)

Cons

  • - HNSW-only indexing
  • - No built-in RBAC
  • - No horizontal scaling in open-source
  • - History of breaking API changes
  • - Performance ceiling for large datasets

6. pgvector

Vector similarity search for PostgreSQL

pgvector's maturation is the biggest shift of 2025. Version 0.8.0 brought iterative scans fixing the notorious overfiltering problem, plus improved cost estimation. The companion pgvectorscale extension adds DiskANN-based indexing with 9x smaller indexes than HNSW. On 50M Cohere embeddings, pgvectorscale achieves 471 QPS at 99% recall with 28ms p95 — beating Pinecone s1 by 16x on throughput at 75% lower cost. The practical scale ceiling is 10–50M vectors on a single instance, extendable to 100M+ with tuning or Citus sharding. The killer advantage is full SQL power: JOINs, CTEs, transactions, and the entire PostgreSQL ecosystem. Available pre-installed on Supabase ($0–25/month), Neon ($0–19/month), AWS RDS/Aurora, AlloyDB, and Timescale Cloud. ~20,400 GitHub stars. Not recommended beyond 100M vectors with low-latency requirements.
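The workflow can be sketched as the SQL a Python app would hand to its Postgres driver. The `vector` type, the `<=>` cosine-distance operator, and the `hnsw` index method are pgvector's documented syntax; the table, column, and parameter names are hypothetical:

```python
# DDL: enable the extension, store embeddings next to ordinary columns,
# and build an HNSW index using cosine-distance operators.
setup_sql = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (id bigserial PRIMARY KEY, body text, embedding vector(1536));
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);
"""

# Query: nearest neighbours by cosine distance, composed with an ordinary
# SQL filter. This JOIN/WHERE composability is pgvector's killer advantage.
# Placeholders use psycopg-style named parameters.
query_sql = """
SELECT id, body
FROM items
WHERE body ILIKE %(term)s
ORDER BY embedding <=> %(query_vec)s
LIMIT 10;
"""
```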

Pros

  • + No new infrastructure — just a Postgres extension
  • + Full SQL alongside vector search (JOINs, CTEs, ACID)
  • + Available on every major managed Postgres platform
  • + HNSW + DiskANN indexing (via pgvectorscale)
  • + Huge PostgreSQL ecosystem
  • + Dramatically improved in 2025 (v0.8.0)

Cons

  • - Performance ceiling at 50–100M vectors
  • - HNSW indexes must fit in RAM or latency spikes 10x
  • - No built-in vectorization
  • - Fewer vector-specific features than dedicated solutions

7. LanceDB

Serverless embedded vector database built on Lance columnar format

LanceDB runs embedded (in-process, no server) on the Lance columnar format — an open-source alternative to Parquet with 100x faster random access. Written in Rust with Python, TypeScript, and Java bindings. Supports IVF_PQ indexing (disk-friendly, scales beyond RAM), automatic dataset versioning with time-travel queries, and multimodal storage. Netflix uses LanceDB for their Media Data Lake; Second Dinner reports it is 3–5x more cost-effective than alternatives. LanceDB Cloud (GA) charges ~$6 per million 1536D vectors written — one estimate puts a moderate workload at ~$16/month. ~9,700 GitHub stars. $30M raised (June 2025). Best for embedded/edge AI, multimodal data, cost-sensitive workloads, and teams in the Arrow/Pandas/Polars ecosystem.

Pros

  • + Serverless, embedded architecture (zero management)
  • + Lance format: 100x faster random access than Parquet
  • + Native multimodal support (text, images, video)
  • + Automatic dataset versioning with time-travel
  • + Aggressively low cloud pricing

Cons

  • - Cloud platform is relatively new
  • - Smaller community than major competitors
  • - Versioning metadata can affect query performance
  • - Less enterprise track record

8. Turbopuffer

Object-storage-first vector database with 10–100x cost savings

Turbopuffer is the breakout story of 2025. Built from scratch on object storage (S3/GCS), it uses a three-tier cache hierarchy — object storage at ~$0.02/GB, NVMe SSD cache, and RAM cache — delivering 10–100x cost savings over alternatives for multi-tenant workloads. Production customers include Cursor, Anthropic, Notion (migrated from Pinecone), Linear, Atlassian, Ramp, Grammarly, and Superhuman. Handles 2.5T+ documents and 10M+ writes/sec. Not open source and no free tier, but the cost model is genuinely disruptive if you're spending real money on vector search.
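The three-tier read path can be modeled in a few lines. This is a toy cache hierarchy with dict-backed tiers, invented for illustration, and not Turbopuffer's actual design: reads fall through RAM to NVMe to object storage, warming the faster tiers on the way back:

```python
class TieredStore:
    """Toy three-tier read path: RAM cache, NVMe cache, object storage."""

    def __init__(self, object_storage):
        self.ram, self.nvme = {}, {}
        self.object_storage = object_storage  # source of truth, cheapest per GB

    def get(self, key):
        if key in self.ram:
            return self.ram[key], "ram"
        if key in self.nvme:
            self.ram[key] = self.nvme[key]      # promote to RAM
            return self.ram[key], "nvme"
        value = self.object_storage[key]        # cold read, highest latency
        self.nvme[key] = self.ram[key] = value  # warm both cache tiers
        return value, "object-storage"

store = TieredStore({"doc:1": [0.1, 0.2]})
print(store.get("doc:1")[1])  # first read is cold: object-storage
print(store.get("doc:1")[1])  # now cached: ram
```

The economics follow from the tiers: data at rest costs object-storage prices, and only actively queried tenants occupy the expensive cache layers, which is why the model favors multi-tenant workloads with a long tail of cold namespaces.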

Pros

  • + 10–100x cheaper for multi-tenant workloads
  • + Proven at massive scale (2.5T+ documents)
  • + Impressive customer list validates production readiness
  • + SOC 2 and HIPAA compliant
  • + Object-storage architecture keeps costs predictable

Cons

  • - Not open source
  • - No free tier ($64/month minimum)
  • - Newer entrant with smaller community
  • - No self-hosting option

9. Vespa

Hybrid search and ML serving engine for billion-scale applications

Vespa is the most underappreciated tool in the vector database landscape. Ranked #1 Leader in GigaOm's 2025 Vector Database Radar, it combines vector search, BM25 text search, tensor operations, and real-time ML model inference in a single Java/C++ distributed platform. It powers Spotify search and Yahoo at billions of vectors. The learning curve is steep — Vespa's configuration model and query language are more complex than competitors — but for teams needing hybrid search with custom ML ranking at massive scale, nothing else comes close.

Pros

  • + Ranked #1 in GigaOm 2025 Vector Database Radar
  • + Proven at billions of vectors (Spotify, Yahoo)
  • + Combines vector search, BM25, and ML inference
  • + Open source (Apache 2.0)
  • + Managed cloud option available

Cons

  • - Steep learning curve
  • - Complex configuration model
  • - Heavier operational footprint than simpler alternatives
  • - Smaller developer community relative to capabilities

10. Upstash Vector

Serverless vector database with true pay-per-request pricing

Upstash Vector offers the only true pay-per-request pricing in the vector database space: $0.40 per 100K requests, with a generous free tier (10K queries per day and a cap of 200M on vectors × dimensions). Uses DiskANN for cost-effective disk-based search. Ideal for serverless and edge architectures where you want zero idle costs. The tradeoff is that it's not designed for high-throughput batch workloads or sub-millisecond latency — it's optimized for the long tail of applications that need vector search without paying for always-on compute.
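Quick arithmetic on the quoted rate ($0.40 per 100K requests) shows why this model suits low-traffic apps: cost scales linearly with traffic and is zero at idle. The traffic figures below are illustrative:

```python
PRICE_PER_100K = 0.40  # quoted per-request rate

def monthly_cost(requests_per_day, days=30):
    """Pay-per-request cost: no idle charge, only traffic is billed."""
    return requests_per_day * days / 100_000 * PRICE_PER_100K

print(f"${monthly_cost(10_000):.2f}/mo")     # 10K req/day -> $1.20
print(f"${monthly_cost(1_000_000):.2f}/mo")  # 1M req/day  -> $120.00
```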

Pros

  • + True pay-per-request pricing
  • + Generous free tier
  • + DiskANN-based for cost efficiency
  • + Great for serverless/edge deployments
  • + Simple REST API

Cons

  • - Not open source
  • - Not ideal for high-throughput batch workloads
  • - Smaller ecosystem than major competitors
  • - No self-hosting option