Half-Precision Vectors with pgvector
Cut pgvector storage in half using halfvec — learn how 16-bit vectors work, when to use them, and how to migrate existing tables.
This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own Postgres database and create interactive applications to share with other developers, coworkers, or even your customers.
A single 1536-dimensional embedding (the default size of OpenAI’s text-embedding-3-small and text-embedding-ada-002 models) takes about 6 KB of storage. At scale, this adds up: a million vectors require about 6 GB before indexes, and indexes can roughly double that. pgvector 0.7 introduced halfvec, a half-precision vector type that cuts storage in half with minimal impact on search quality.
In this post, we’ll cover:
- How halfvec works: 16-bit vs 32-bit float storage
- Storage and performance impact: real-world benchmarks
- When to use halfvec: decision criteria for your application
- Tutorial: comparing vector vs halfvec storage in practice
- Migrating existing tables: converting your data to halfvec
How halfvec works
Standard vector columns store each dimension as a 32-bit float (4 bytes). halfvec stores each dimension as a 16-bit float (2 bytes) instead. This is the IEEE 754 half-precision format that is also widely used in machine learning for mixed-precision training of neural networks.
| Type | Bytes per dimension | 1536-dim vector size |
|---|---|---|
| vector | 4 bytes | 6,152 bytes |
| halfvec | 2 bytes | 3,080 bytes |
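As a rough check on these numbers, the raw payload per vector is simply dimensions times bytes per dimension. The sketch below is illustrative only: the helper name `payload_bytes` is made up for this example, and it deliberately ignores the small per-value header, so its results come out slightly lower than the on-disk sizes in the table.

```python
def payload_bytes(dimensions: int, bytes_per_dimension: int) -> int:
    """Raw bytes needed for the vector elements alone (no per-value header)."""
    return dimensions * bytes_per_dimension


DIMS = 1536
ROWS = 1_000_000

for type_name, width in (("vector", 4), ("halfvec", 2)):
    per_vector = payload_bytes(DIMS, width)
    total_gb = per_vector * ROWS / 1e9  # decimal gigabytes
    print(f"{type_name:8} {per_vector:>6} bytes/vector  {total_gb:.2f} GB per million rows")
```

Changing `DIMS` or `ROWS` gives a quick estimate for your own embedding model and table size before indexes are added.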
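The 16-bit float format trades range and precision for size: normal values have magnitudes from about 6×10⁻⁵ up to 65,504, with roughly 3 significant decimal digits. Smaller magnitudes are still representable, but with less precision. For normalized embeddings (which most models produce), every component lies between -1 and 1, so this is more than sufficient.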
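To get a feel for how much precision is lost, the following sketch rounds a few values to half precision and back using Python's standard `struct` module, whose `e` format code is IEEE 754 half precision. The sample values are arbitrary stand-ins for embedding components, not real model output, and `to_half` is a helper name invented for this example.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Arbitrary example values in the range typical of normalized embeddings
values = [0.1, -0.0421, 0.00031, 0.999]

for v in values:
    h = to_half(v)
    relative_error = abs(h - v) / abs(v)
    print(f"{v:>10} -> {h:.8f}  relative error {relative_error:.2e}")
```

For values in the normal half-precision range, the relative rounding error is bounded by 2⁻¹¹ (about 0.05%), which is where the "roughly 3 digits" figure comes from.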
Storage and performance impact
Published benchmarks typically report:
- About a 50% reduction in table storage
- A 50-66% reduction in index size (East Agile’s tests saw a 66% reduction)
- Little to no impact on recall (most tests show less than a 1% difference)
- Slightly faster queries due to reduced I/O
The storage savings compound: smaller vectors mean smaller indexes, which means more of your index fits in memory, which means fewer disk reads.
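The recall claim can be sanity-checked on a small scale without a database. The sketch below is a toy experiment under stated assumptions: it uses random unit vectors rather than real embeddings, brute-force search in pure Python rather than a pgvector index, and arbitrary sizes (64 dimensions, 500 rows). It measures only the effect of rounding to 16 bits, not of any index. All helper names are made up for this example.

```python
import math
import random
import struct


def to_half(x: float) -> float:
    """Round a float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def random_unit_vector(rng: random.Random, dims: int) -> list:
    """Draw a random direction and scale it to unit length."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: list, vectors: list, k: int) -> set:
    """Exact nearest neighbours by brute force; returns row indexes."""
    ranked = sorted(range(len(vectors)), key=lambda i: cosine_distance(query, vectors[i]))
    return set(ranked[:k])


rng = random.Random(42)
DIMS, ROWS, K = 64, 500, 10

full = [random_unit_vector(rng, DIMS) for _ in range(ROWS)]
half = [[to_half(x) for x in v] for v in full]  # same data, rounded to 16 bits
query = random_unit_vector(rng, DIMS)

exact = top_k(query, full, K)
approx = top_k([to_half(x) for x in query], half, K)
recall = len(exact & approx) / K
print(f"recall@{K} of half-precision vs full-precision ranking: {recall:.2f}")
```

For a more meaningful number, replace the random vectors with a sample of your own embeddings and average the recall over many queries.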
When to use halfvec
halfvec is a good default choice for most applications. Consider it when:
- Storage costs matter (most production deployments)
- Your embeddings come from standard models (OpenAI, Cohere, etc.)
- You’re storing millions of vectors
The main reason not to use halfvec is if your application requires the full precision of 32-bit floats—rare for similarity search, but possible for some scientific applications.
Conversion is automatic
When you insert a regular Python list or array into a halfvec column, PostgreSQL handles the conversion automatically. You don’t need to modify your application code:
# This works with both vector and halfvec columns
embedding = [0.1, 0.2, 0.3, ...]
cursor.execute(
    "INSERT INTO items (embedding) VALUES (%s)",
    (embedding,)
)
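If your driver or ORM does not pass lists in a form the column accepts, one fallback is to send pgvector's text representation (for example `[0.1,0.2,0.3]`) and cast it in SQL. The helper below is a hypothetical sketch, not part of pgvector or any driver. Its range check is a client-side guard added on the assumption that you would rather reject values outside the half-precision range before they reach the database.

```python
import math

HALF_MAX = 65504.0  # largest finite half-precision value


def to_pgvector_literal(values) -> str:
    """Format a sequence of numbers as pgvector's text form, e.g. '[0.1,0.2]'."""
    floats = [float(v) for v in values]
    for f in floats:
        # Assumption: reject anything a 16-bit float cannot hold
        if not math.isfinite(f) or abs(f) > HALF_MAX:
            raise ValueError(f"{f!r} cannot be stored in a half-precision vector")
    return "[" + ",".join(repr(f) for f in floats) + "]"


literal = to_pgvector_literal([0.1, 0.2, 0.3])
print(literal)

# Hypothetical usage with a DB-API cursor (not executed here):
# cursor.execute("INSERT INTO items (embedding) VALUES (%s::halfvec)", (literal,))
```

The explicit `::halfvec` cast in the commented statement makes the target type unambiguous regardless of how the driver types the parameter.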
Tutorial: Comparing vector vs halfvec storage
This tutorial demonstrates the storage difference between vector and halfvec columns.
Prerequisites
- Docker installed and running
- A terminal
Step 1: Start PostgreSQL with pgvector
docker run -d \
  --name postgres-halfvec \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  pgvector/pgvector:pg16
Step 2: Connect and enable pgvector
docker exec -it postgres-halfvec psql -U postgres
CREATE EXTENSION IF NOT EXISTS vector;
Step 3: Create tables with both types
-- Standard 32-bit vectors
CREATE TABLE items_vector (
    id BIGSERIAL PRIMARY KEY,
    embedding vector(1536)
);
-- Half-precision 16-bit vectors
CREATE TABLE items_halfvec (
    id BIGSERIAL PRIMARY KEY,
    embedding halfvec(1536)
);
Step 4: Insert identical data into both tables
-- Insert 10,000 random vectors into the vector table.
-- The WHERE clause references the outer row (g) so the subquery is re-run
-- for every row; without it, PostgreSQL evaluates the subquery once and
-- every row receives the same vector.
INSERT INTO items_vector (embedding)
SELECT
    ('[' || array_to_string(ARRAY(
        SELECT random()::float4
        FROM generate_series(1, 1536)
        WHERE g > 0
    ), ',') || ']')::vector(1536)
FROM generate_series(1, 10000) AS g;
-- Copy the same vectors into the halfvec table, rounding them to 16 bits
INSERT INTO items_halfvec (id, embedding)
SELECT id, embedding::halfvec(1536)
FROM items_vector;
Step 5: Compare table sizes
SELECT
    relname AS table_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_catalog.pg_statio_user_tables
WHERE relname LIKE 'items_%'
ORDER BY relname;
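You should see items_halfvec using roughly half the total storage of items_vector. Compare the total_size column rather than table_size: values this large are typically moved out of the main table into TOAST storage, which pg_relation_size does not count.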
Step 6: Create indexes and compare
-- HNSW index on vector column
CREATE INDEX items_vector_idx ON items_vector
    USING hnsw (embedding vector_cosine_ops);
-- HNSW index on halfvec column (note: different operator class)
CREATE INDEX items_halfvec_idx ON items_halfvec
    USING hnsw (embedding halfvec_cosine_ops);
Check index sizes:
SELECT
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_catalog.pg_stat_user_indexes
WHERE indexrelname LIKE 'items_%'
ORDER BY indexrelname;
Step 7: Verify search works correctly
-- Search on vector table
SELECT id, embedding <=> (SELECT embedding FROM items_vector LIMIT 1) AS distance
FROM items_vector
ORDER BY distance
LIMIT 5;
-- Search on halfvec table
SELECT id, embedding <=> (SELECT embedding FROM items_halfvec LIMIT 1) AS distance
FROM items_halfvec
ORDER BY distance
LIMIT 5;
Both queries should return results with similar distances and query times.
Step 8: Compare query performance
-- Time a search on the vector table
EXPLAIN ANALYZE
SELECT id FROM items_vector
ORDER BY embedding <=> (SELECT embedding FROM items_vector WHERE id = 1)
LIMIT 10;
-- Time a search on the halfvec table
EXPLAIN ANALYZE
SELECT id FROM items_halfvec
ORDER BY embedding <=> (SELECT embedding FROM items_halfvec WHERE id = 1)
LIMIT 10;
Query times should be similar or slightly faster for halfvec.
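Single EXPLAIN ANALYZE runs are noisy, so it helps to repeat each query several times and compare the reported execution times. The sketch below pulls the `Execution Time` figure out of plan text that you paste or capture from psql. The function name and the idea of averaging several runs are assumptions made for this example, not part of the tutorial's tooling.

```python
import re
from statistics import mean
from typing import List, Optional

# EXPLAIN ANALYZE text output ends with a line such as "Execution Time: <n> ms"
_EXECUTION_TIME = re.compile(r"Execution Time:\s*([0-9.]+)\s*ms")


def execution_time_ms(plan_text: str) -> Optional[float]:
    """Return the execution time in milliseconds, or None if it is absent."""
    match = _EXECUTION_TIME.search(plan_text)
    return float(match.group(1)) if match else None


def average_execution_ms(plans: List[str]) -> Optional[float]:
    """Average the execution time over several captured plans."""
    times = [t for t in map(execution_time_ms, plans) if t is not None]
    return mean(times) if times else None
```

Collect a handful of plans for each table, pass each list to `average_execution_ms`, and compare the two averages rather than individual runs.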
Migrating existing tables
To convert an existing vector column to halfvec:
-- Change column type (this rewrites the table)
ALTER TABLE items
    ALTER COLUMN embedding TYPE halfvec(1536);
For large tables, this can take a while since it rewrites every row, and the table is locked while it runs. Consider instead:
- Creating a new table with halfvec
- Copying data in batches
- Swapping tables with a rename
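-- Create new table
CREATE TABLE items_new (
    id BIGSERIAL PRIMARY KEY,
    embedding halfvec(1536)
);
-- Copy data (do this in batches for large tables)
INSERT INTO items_new (id, embedding)
SELECT id, embedding::halfvec(1536) FROM items;
-- Advance the new id sequence past the copied ids so later inserts don't collide
SELECT setval(pg_get_serial_sequence('items_new', 'id'), (SELECT max(id) FROM items_new));
-- Build any indexes on items_new with the halfvec_* operator classes, then swap tables
ALTER TABLE items RENAME TO items_old;
ALTER TABLE items_new RENAME TO items;
-- After verifying, drop the old table
DROP TABLE items_old;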
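The "copy in batches" step can be driven from a small script that walks the id range in fixed-size chunks and commits after each one. The sketch below shows only the chunking logic and the statement it would run. The helper name, the batch size, and the assumption that ids are reasonably dense integers are all illustrative, and no database connection is opened here.

```python
from typing import Iterator, Tuple


def batch_ranges(min_id: int, max_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (low, high) id ranges that cover min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


# Same copy statement as above, restricted to one id range per execution
COPY_BATCH_SQL = (
    "INSERT INTO items_new (id, embedding) "
    "SELECT id, embedding::halfvec(1536) FROM items "
    "WHERE id BETWEEN %s AND %s"
)

for low, high in batch_ranges(1, 25, 10):
    # With a real connection: cursor.execute(COPY_BATCH_SQL, (low, high)); conn.commit()
    print(f"would copy ids {low}..{high}")
```

Committing per batch keeps each transaction short. Rows written to the old table during the copy still need to be handled separately, for example by pausing writes before the final batch and the rename.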
Operator classes for halfvec
When creating indexes on halfvec columns, use the halfvec-specific operator classes:
| Distance | vector operator class | halfvec operator class |
|---|---|---|
| Cosine | vector_cosine_ops | halfvec_cosine_ops |
| L2/Euclidean | vector_l2_ops | halfvec_l2_ops |
| Inner product | vector_ip_ops | halfvec_ip_ops |
-- Correct
CREATE INDEX ON items USING hnsw (embedding halfvec_cosine_ops);
-- Wrong (will error)
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);
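If index DDL is generated from application code or migration scripts, the operator class can be derived from the column type so the two never drift apart. The following sketch encodes the table above. The function names are hypothetical, and the table and column names are interpolated without quoting, so it is only suitable for trusted, hard-coded identifiers.

```python
# Suffixes taken from the operator class table above
OPERATOR_CLASS_SUFFIX = {
    "cosine": "cosine_ops",
    "l2": "l2_ops",
    "inner_product": "ip_ops",
}


def operator_class(column_type: str, distance: str) -> str:
    """Return the operator class name for a column type and distance metric."""
    if column_type not in ("vector", "halfvec"):
        raise ValueError(f"unsupported column type: {column_type!r}")
    if distance not in OPERATOR_CLASS_SUFFIX:
        raise ValueError(f"unsupported distance: {distance!r}")
    return f"{column_type}_{OPERATOR_CLASS_SUFFIX[distance]}"


def create_index_sql(table: str, column: str, column_type: str, distance: str) -> str:
    """Build an HNSW CREATE INDEX statement (identifiers must be trusted)."""
    return (
        f"CREATE INDEX ON {table} "
        f"USING hnsw ({column} {operator_class(column_type, distance)});"
    )


print(create_index_sql("items", "embedding", "halfvec", "cosine"))
```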
Wrapping up
halfvec is a straightforward optimization for most pgvector deployments:
- 50% storage reduction with minimal impact on search quality
- Automatic conversion from standard vectors—no application changes needed
- Use halfvec_*_ops operator classes when creating indexes
For new projects, consider starting with halfvec as the default. For existing tables, migrate during a maintenance window using the batch copy approach to avoid extended locks.
Cleanup
docker stop postgres-halfvec
docker rm postgres-halfvec