Tuning HNSW Indexes in pgvector
Tune pgvector HNSW index parameters — ef_search, m, and ef_construction — with benchmark results for recall vs. speed tradeoffs.
This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own postgres db and you can create interactive applications to share with other developers, coworkers or even your customers. If you’re interested in trying it out, get started here.
pgvector’s HNSW index is the recommended index type for vector similarity search in PostgreSQL. It provides high recall and low latency out of the box, but the defaults aren’t always optimal. Three parameters control the tradeoff between recall, speed, and resource usage: ef_search (query-time), m and ef_construction (build-time). This guide explains what each parameter does, how to set them, and shows real benchmark results from tuning them.
In this post, we’ll cover:
- ef_search - Tuning query-time accuracy vs speed
- m - Controlling graph connectivity and index size
- ef_construction - Setting build-time quality
- Tutorial - Benchmarking recall and speed with different parameter values
- Common gotchas - Why SHOW hnsw.ef_search fails and how to fix it
Why SHOW hnsw.ef_search fails
If you’ve tried running SHOW hnsw.ef_search and gotten this error, you’re not alone—it’s one of the most common pgvector questions:
SHOW hnsw.ef_search;
-- ERROR: unrecognized configuration parameter "hnsw.ef_search"
This happens because pgvector’s settings are custom GUC variables, which PostgreSQL only knows about once pgvector’s library has been loaded in the session or you SET the parameter yourself. The parameter is real, but a fresh session that hasn’t touched a vector column yet has no definition for it. To check and set the value:
-- This works: SET the value first
SET hnsw.ef_search = 100;
SHOW hnsw.ef_search;
-- Returns: 100
-- Or use current_setting with missing_ok = true (returns NULL if unset instead of raising an error)
SELECT current_setting('hnsw.ef_search', true);
The default value for ef_search is 40 when not explicitly set.
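If you only want to read the value without changing it, another option is to make the session load pgvector first. The sketch below assumes the extension is already installed in the current database and that, as we understand it, pgvector registers its settings when its library is loaded, which happens the first time the session uses the vector type:

```sql
-- In a fresh session, touch the vector type so PostgreSQL loads pgvector's library.
SELECT '[1,2,3]'::vector;

-- The setting should now be visible without an explicit SET.
SHOW hnsw.ef_search;

-- Safe in any session: missing_ok = true yields NULL rather than an error
-- when the parameter has not been defined yet.
SELECT current_setting('hnsw.ef_search', true) AS ef_search;
```

The current_setting form is the one to use in scripts, since it never aborts the transaction.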
The ef_search parameter
ef_search controls how many candidate vectors HNSW evaluates during a query. A larger value means more candidates are checked, which improves recall (finding the true nearest neighbors) at the cost of query speed.
-- Set for the current session
SET hnsw.ef_search = 100;
-- Set for a single transaction
SET LOCAL hnsw.ef_search = 200;
-- Reset to default
RESET hnsw.ef_search;
Recall improves with ef_search, but with diminishing returns: doubling ef_search doesn’t double recall, and once recall saturates (at 200 in the table below) higher values only add query time. Here’s what we measured on 50,000 random 128-dimensional vectors:
| ef_search | Recall@10 | Query Time |
|---|---|---|
| 10 | 30% | 0.38ms |
| 40 (default) | 50% | 1.36ms |
| 100 | 80% | 1.56ms |
| 200 | 100% | 2.29ms |
| 400 | 100% | 3.91ms |
With real embeddings (which have more structure than random vectors), recall is generally higher at every setting. For most production workloads, ef_search between 100 and 200 provides 95%+ recall with sub-5ms queries.
One important constraint: ef_search must be at least as large as the LIMIT in your query. If you set ef_search = 40 but ask for LIMIT 50, you’ll only get 40 rows back.
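A quick way to see this cap is to count the rows a query returns at two settings. This is a sketch that assumes the items table and HNSW index from the tutorial further down, and that the planner actually picks the index for the query (check with EXPLAIN if the counts look wrong):

```sql
BEGIN;

-- ef_search below the LIMIT: the index scan yields at most 40 candidates.
SET LOCAL hnsw.ef_search = 40;
SELECT count(*) AS rows_returned
FROM (
    SELECT id
    FROM items
    ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
    LIMIT 50
) AS nearest;

-- Raise ef_search to at least the LIMIT and run the same query again.
SET LOCAL hnsw.ef_search = 50;
SELECT count(*) AS rows_returned
FROM (
    SELECT id
    FROM items
    ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
    LIMIT 50
) AS nearest;

COMMIT;
```

Because both SET LOCAL statements are scoped to the transaction, the session’s own ef_search value is untouched after COMMIT.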
The m parameter
m controls the maximum number of connections each node maintains in the HNSW graph. It’s set at index creation time and cannot be changed without rebuilding the index.
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
WITH (m = 16); -- default is 16
Higher m means each node is connected to more neighbors, creating a denser graph that’s more likely to find true nearest neighbors. The tradeoff is larger indexes and longer build times:
| m | Index Size | Build Time | Recall@10 (ef_search=40) |
|---|---|---|---|
| 4 | 33 MB | 1.4s | 10% |
| 16 (default) | 40 MB | 4.6s | 50% |
| 32 | 49 MB | 26.6s | 80% |
These benchmarks used 50,000 random 128-dimensional vectors. Index size grows linearly with m because each node stores more edges. Build time grows faster than linearly because the algorithm must evaluate more candidates when connecting each new node.
For most workloads, the default m = 16 is a good balance. Increase to 24 or 32 if you need higher baseline recall and can afford the extra storage. Decrease to 8 or lower only if storage is extremely constrained and you can compensate with a higher ef_search.
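Since m is fixed at creation time, it helps to know what an existing index was built with before deciding to rebuild. The sketch below reads the stored options from the PostgreSQL catalog and then builds a replacement alongside the old index. The index names are illustrative (items_hnsw_idx is the one from the tutorial below, items_hnsw_m24_idx is made up for this example):

```sql
-- Options passed in WITH (...) are stored in pg_class.reloptions.
-- NULL means no options were given, so the index uses the defaults.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'items_hnsw_idx';

-- Build a replacement with a denser graph without blocking writes.
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY items_hnsw_m24_idx ON items
USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 64);

-- Once the new index has been built successfully, drop the old one.
DROP INDEX CONCURRENTLY items_hnsw_idx;
```

Building the new index before dropping the old one keeps queries indexed throughout, at the cost of temporarily holding both indexes on disk.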
The ef_construction parameter
ef_construction controls how many candidates the algorithm evaluates when building the graph. Higher values produce a better-quality graph at the cost of longer build times.
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64); -- 64 is the default
Unlike m, ef_construction doesn’t affect index size—it only affects how thoroughly the algorithm searches for optimal connections during construction:
| ef_construction | Index Size | Build Time |
|---|---|---|
| 32 | 40 MB | 2.8s |
| 64 (default) | 40 MB | 4.3s |
| 128 | 40 MB | 6.7s |
| 256 | 40 MB | 12.2s |
Build time scales roughly linearly with ef_construction. The resulting graph quality improves with higher values, but with diminishing returns. For most datasets, ef_construction between 64 and 128 produces a near-optimal graph.
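pgvector also enforces a lower bound here: ef_construction must be at least 2 * m, and index creation fails if it is smaller. The defaults (m = 16, ef_construction = 64) satisfy it with room to spare.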
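To check the build-time cost and the unchanged index size on your own data, you can build a second index with a larger ef_construction and compare. This is a sketch for psql (\timing is a psql meta-command, not SQL). It assumes the items table and the default-built items_hnsw_idx from the tutorial below, and items_hnsw_efc128_idx is a throwaway name chosen for this example:

```sql
\timing on

-- Build with a larger candidate list. ef_construction = 128 is well above
-- 2 * m for m = 16.
CREATE INDEX items_hnsw_efc128_idx ON items
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);

-- Compare on-disk size with the default-built index.
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS index_size
FROM pg_class
WHERE relname IN ('items_hnsw_idx', 'items_hnsw_efc128_idx');

-- Remove the extra index when done.
DROP INDEX items_hnsw_efc128_idx;
```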
Tutorial: Benchmarking HNSW parameters
This tutorial walks through measuring recall and speed with different HNSW settings.
Prerequisites
- PostgreSQL with pgvector installed (0.5.0+ for HNSW support)
- A terminal with psql
Step 1: Create test data
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (
    id BIGSERIAL PRIMARY KEY,
    embedding vector(128)
);
-- Insert 50,000 random 128-dimensional vectors
INSERT INTO items (embedding)
SELECT
    array_agg(random())::float4[]::vector(128)
FROM generate_series(1, 50000) AS row_id,
    LATERAL generate_series(1, 128) AS dim_id
GROUP BY row_id;
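Before moving on, it is worth confirming the data looks the way the rest of the tutorial assumes. A minimal check, using pgvector’s vector_dims function:

```sql
-- Should report 50000 rows, with 128 as both the minimum and maximum dimension.
SELECT count(*) AS row_count,
       min(vector_dims(embedding)) AS min_dims,
       max(vector_dims(embedding)) AS max_dims
FROM items;

-- Refresh planner statistics after the bulk insert.
ANALYZE items;
```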
Step 2: Get ground truth with exact search
To measure recall, you need the true nearest neighbors computed via sequential scan (no index):
-- Disable index usage
SET enable_indexscan = off;
SET enable_bitmapscan = off;
-- Find exact 10 nearest neighbors for a query vector
CREATE TEMP TABLE exact_results AS
SELECT id, embedding <=> (SELECT embedding FROM items WHERE id = 1) AS distance
FROM items
WHERE id != 1
ORDER BY distance
LIMIT 10;
-- Re-enable indexes
SET enable_indexscan = on;
SET enable_bitmapscan = on;
SELECT * FROM exact_results;
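If you want to confirm that the ground-truth query really runs without an index, one option is to inspect its plan under the same settings. This sketch uses SET LOCAL inside a transaction so the planner switches are rolled back automatically:

```sql
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;

-- Look for "Seq Scan on items" feeding a sort; no HNSW index should appear.
EXPLAIN (COSTS OFF)
SELECT id
FROM items
WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
LIMIT 10;

ROLLBACK;
```

This matters most when you rerun Step 2 after the HNSW index from Step 3 already exists.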
Step 3: Create an HNSW index and test recall
CREATE INDEX items_hnsw_idx ON items
USING hnsw (embedding vector_cosine_ops);
-- Test with default ef_search
SET hnsw.ef_search = 40;
-- Run the approximate query
EXPLAIN ANALYZE
SELECT id FROM items
WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
LIMIT 10;
-- Measure recall: how many of the exact results did we find?
SELECT COUNT(*) AS recall_count
FROM (
SELECT id FROM items WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
LIMIT 10
) approximate
WHERE id IN (SELECT id FROM exact_results);
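The count above says how many true neighbors were found, but not which ones were missed. As a sketch, a LEFT JOIN against exact_results shows each true neighbor with a flag for whether the index returned it, and the second query expresses the same count as a recall fraction:

```sql
-- One row per true neighbor, nearest first, flagged if the index found it.
SELECT e.id,
       e.distance,
       (a.id IS NOT NULL) AS found_by_index
FROM exact_results AS e
LEFT JOIN (
    SELECT id
    FROM items
    WHERE id != 1
    ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
    LIMIT 10
) AS a ON a.id = e.id
ORDER BY e.distance;

-- Recall@10 as a fraction between 0 and 1.
SELECT count(*) / 10.0 AS recall_at_10
FROM (
    SELECT id
    FROM items
    WHERE id != 1
    ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
    LIMIT 10
) AS a
WHERE id IN (SELECT id FROM exact_results);
```

Misses that cluster at the far end of the list (the 9th and 10th neighbors) are usually less of a concern than misses near the top.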
Step 4: Test different ef_search values
-- Low ef_search (fast, lower recall)
SET hnsw.ef_search = 10;
SELECT COUNT(*) AS recall_10 FROM (
SELECT id FROM items WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1) LIMIT 10
) a WHERE id IN (SELECT id FROM exact_results);
-- High ef_search (slower, higher recall)
SET hnsw.ef_search = 200;
SELECT COUNT(*) AS recall_200 FROM (
SELECT id FROM items WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1) LIMIT 10
) a WHERE id IN (SELECT id FROM exact_results);
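A single query vector gives recall in steps of 10%, which is noisy. One way to smooth that out is to average over several query vectors. The sketch below uses the first 20 rows as queries. The table names exact_multi and approx_multi are made up for this example, and it assumes the planner uses the HNSW index inside the LATERAL subquery in the second step (worth verifying with EXPLAIN):

```sql
-- Ground truth: exact neighbors for 20 query rows, with index scans disabled.
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;

CREATE TEMP TABLE exact_multi AS
SELECT q.id AS query_id, nn.id AS neighbor_id
FROM items AS q
CROSS JOIN LATERAL (
    SELECT i.id
    FROM items AS i
    WHERE i.id <> q.id
    ORDER BY i.embedding <=> q.embedding
    LIMIT 10
) AS nn
WHERE q.id <= 20;
COMMIT;

-- Approximate neighbors for the same queries, with index scans re-enabled.
SET hnsw.ef_search = 40;

CREATE TEMP TABLE approx_multi AS
SELECT q.id AS query_id, nn.id AS neighbor_id
FROM items AS q
CROSS JOIN LATERAL (
    SELECT i.id
    FROM items AS i
    WHERE i.id <> q.id
    ORDER BY i.embedding <=> q.embedding
    LIMIT 10
) AS nn
WHERE q.id <= 20;

-- Mean recall@10: matched (query, neighbor) pairs over all true pairs.
SELECT round(count(*)::numeric / (SELECT count(*) FROM exact_multi), 3)
       AS mean_recall_at_10
FROM approx_multi AS a
JOIN exact_multi AS e USING (query_id, neighbor_id);
```

To compare settings, drop approx_multi, change hnsw.ef_search, and rerun the last two statements. The ground-truth table only needs to be built once.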
Step 5: Test different m values
To compare m values, you need to drop and recreate the index:
DROP INDEX items_hnsw_idx;
-- Sparse graph
CREATE INDEX items_hnsw_idx ON items
USING hnsw (embedding vector_cosine_ops)
WITH (m = 4, ef_construction = 64);
-- Check index size
SELECT pg_size_pretty(pg_relation_size('items_hnsw_idx'));
-- Test recall
SET hnsw.ef_search = 40;
SELECT COUNT(*) AS recall_m4 FROM (
SELECT id FROM items WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1) LIMIT 10
) a WHERE id IN (SELECT id FROM exact_results);
Repeat with m = 16 and m = 32 to see the tradeoff.
Setting parameters per query
You can tune ef_search per query using SET LOCAL inside a transaction, or wrap it in a function:
CREATE OR REPLACE FUNCTION search_high_recall(query_embedding vector(128))
RETURNS TABLE(id bigint, distance float) AS $$
BEGIN
SET LOCAL hnsw.ef_search = 200;
RETURN QUERY
SELECT items.id, items.embedding <=> query_embedding AS distance
FROM items
ORDER BY items.embedding <=> query_embedding
LIMIT 10;
END;
$$ LANGUAGE plpgsql;
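This lets you use higher ef_search for critical searches (like user-facing queries) while keeping the default lower for background tasks. Keep in mind that SET LOCAL lasts until the end of the surrounding transaction, not just the function call: if you call the function inside an explicit transaction block, later statements in that block also run with ef_search = 200.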
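The transaction-scoped variant can be sketched as follows, again using the items table from the tutorial. set_config is PostgreSQL’s built-in function form of SET; with its third argument set to true it behaves like SET LOCAL:

```sql
BEGIN;

-- Applies only until COMMIT or ROLLBACK.
SET LOCAL hnsw.ef_search = 200;
-- Equivalent function form, handy when the value comes from a bind parameter:
-- SELECT set_config('hnsw.ef_search', '200', true);

SELECT id
FROM items
ORDER BY embedding <=> (SELECT embedding FROM items WHERE id = 1)
LIMIT 10;

COMMIT;

-- Back to the session-level value.
SHOW hnsw.ef_search;
```

This form keeps the override next to the query it applies to, which makes it harder for a high ef_search to leak into unrelated statements.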
Wrapping up
HNSW has three tuning knobs, each with a distinct role:
- ef_search (query-time): Start with the default of 40. Increase to 100-200 for production workloads that need high recall. No index rebuild required.
- m (build-time): The default of 16 works for most datasets. Increase to 24-32 for higher baseline recall at the cost of larger indexes.
- ef_construction (build-time): The default of 64 is usually sufficient. Increase to 128 for large datasets where build time isn’t a concern.
Tune ef_search first—it’s the easiest to change and has the biggest impact on recall. Only adjust m and ef_construction if you can’t reach your recall target with ef_search alone.