Half-Precision Vectors with pgvector

Cut pgvector storage in half using halfvec — learn how 16-bit vectors work, when to use them, and how to migrate existing tables.

This post was written by an engineer at QueryPlane, an app builder for your database: bring your own PostgreSQL database and create interactive applications to share with other developers, coworkers, or customers.

A single 1536-dimensional embedding (the default size of OpenAI’s text-embedding-3-small and text-embedding-ada-002 models) takes about 6 KB of storage. At scale, this adds up: a million vectors require about 6 GB before indexes, and indexes can roughly double that. pgvector 0.7.0 introduced halfvec, a half-precision vector type that cuts storage in half with minimal impact on search quality.

In this post, we’ll cover:

How halfvec works - 16-bit vs 32-bit float storage
Storage and performance impact - Real-world benchmarks
When to use halfvec - Decision criteria for your application
Tutorial - Comparing vector vs halfvec storage in practice
Migrating existing tables - Converting your data to halfvec

How halfvec works

Standard vector columns store each dimension as a 32-bit float (4 bytes). halfvec stores each dimension as a 16-bit float (2 bytes) instead. This is the IEEE 754 half-precision format, which is also widely used for mixed-precision training and inference of neural networks.

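| Type | Bytes per dimension | 1536-dim vector size |
| --- | --- | --- |
| `vector` | 4 bytes | 6,152 bytes |
| `halfvec` | 2 bytes | 3,080 bytes |

Both sizes include the 8-byte per-value header that pgvector documents for these types.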
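
Per-vector size can be estimated from the formulas in the pgvector README: 4 × dimensions + 8 bytes for vector and 2 × dimensions + 8 bytes for halfvec. The sketch below applies that arithmetic. The helper names `vector_bytes` and `total_gb` are illustrative, and the figures ignore row, TOAST, and index overhead.

```python
# Bytes used per dimension by each pgvector type.
BYTES_PER_DIM = {"vector": 4, "halfvec": 2}
# Fixed per-value header, per the pgvector README (4 * dims + 8, 2 * dims + 8).
HEADER_BYTES = 8


def vector_bytes(type_name: str, dims: int) -> int:
    """Approximate size of one value, excluding row and TOAST overhead."""
    return BYTES_PER_DIM[type_name] * dims + HEADER_BYTES


def total_gb(type_name: str, dims: int, rows: int) -> float:
    """Raw vector data for `rows` values, in decimal gigabytes."""
    return vector_bytes(type_name, dims) * rows / 1e9


if __name__ == "__main__":
    for name in ("vector", "halfvec"):
        print(name, vector_bytes(name, 1536), "bytes per value,",
              round(total_gb(name, 1536, 1_000_000), 2), "GB per million rows")
```

Swapping in your own dimension count and row count gives a quick lower bound on table size before you load any data.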

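The 16-bit float format has less precision: its normal range covers magnitudes from about 6×10⁻⁵ to 65,504, with roughly 3 significant decimal digits. The components of normalized embeddings (which most models produce) lie between -1 and 1, well inside that range, so this precision is enough for similarity ranking in most cases.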
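
To get a feel for the rounding involved, the sketch below round-trips values through half precision using Python's standard `struct` module (format code `e`). The sample vector is made up for illustration, and this mimics the number format only, not pgvector's own conversion code.

```python
import math
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def cosine(a, b):
    """Cosine similarity of two equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# A small made-up vector, normalized to unit length like typical embeddings.
raw = [0.12, -0.53, 0.31, 0.77, -0.08, 0.02]
norm = math.sqrt(sum(x * x for x in raw))
embedding = [x / norm for x in raw]

# The same vector after every component is rounded to 16 bits.
halved = [to_half(x) for x in embedding]
max_error = max(abs(a - b) for a, b in zip(embedding, halved))

if __name__ == "__main__":
    print("largest per-component rounding error:", max_error)
    print("cosine similarity, original vs rounded:", cosine(embedding, halved))
    print("largest finite half value:", to_half(65504.0))
```

The per-component error is small relative to the values themselves, which is why the direction of the vector barely moves.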

Storage and performance impact

Real-world benchmarks show:

50% reduction in table storage
50-66% reduction in index size (East Agile’s tests saw 66% index reduction)
Little to no impact on recall (most tests show less than 1% difference)
Slightly faster queries due to reduced I/O

The storage savings compound: smaller vectors mean smaller indexes, which means more of your index fits in memory, which means fewer disk reads.
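
If you want to check the recall claim on your own data before switching, one option is to compare exact nearest neighbors before and after rounding. The sketch below does this on synthetic random vectors with a brute-force search. It isolates the effect of 16-bit rounding only and does not model HNSW or pgvector itself; every name and parameter in it is illustrative.

```python
import math
import random
import struct


def to_half(x):
    """Round x to the nearest half-precision value and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)


def top_k(query, vectors, k):
    """Exact (brute-force) nearest neighbors by cosine distance."""
    order = sorted(range(len(vectors)),
                   key=lambda i: cosine_distance(query, vectors[i]))
    return set(order[:k])


def recall_at_k(full, half, queries, k=10):
    """Share of full-precision top-k results also found after rounding."""
    hits = 0
    for q in queries:
        hits += len(top_k(full[q], full, k) & top_k(half[q], half, k))
    return hits / (k * len(queries))


rng = random.Random(42)  # fixed seed so runs are repeatable
full = [normalize([rng.gauss(0, 1) for _ in range(64)]) for _ in range(300)]
half = [[to_half(x) for x in v] for v in full]
recall = recall_at_k(full, half, queries=range(20), k=10)

if __name__ == "__main__":
    print("recall@10 after rounding to half precision:", recall)
```

Replacing the synthetic `full` list with a sample of your real embeddings gives a more meaningful number for your workload.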

When to use halfvec

halfvec is a good default choice for most applications. Consider it when:

Storage costs matter (most production deployments)
Your embeddings come from standard models (OpenAI, Cohere, etc.)
You’re storing millions of vectors

The main reason not to use halfvec is if your application requires the full precision of 32-bit floats—rare for similarity search, but possible for some scientific applications.

Conversion is automatic

When you insert a regular Python list or array into a halfvec column, pgvector converts it to 16-bit values on insert, so the same insert code works for either column type:

# This works with both vector and halfvec columns
embedding = [0.1, 0.2, 0.3, ...]
cursor.execute(
    "INSERT INTO items (embedding) VALUES (%s)",
    (embedding,)
)
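
If your driver does not adapt Python lists the way you expect, another option is pgvector's text input format, a bracketed comma-separated list such as '[1,2,3]' that can be cast to either type. The helper below is a hypothetical sketch, not part of any driver or of pgvector's client libraries.

```python
import math


def to_vector_literal(values) -> str:
    """Format numbers as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    values = [float(v) for v in values]
    if not values:
        raise ValueError("a vector needs at least one dimension")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("NaN and infinity are not valid vector elements")
    return "[" + ",".join(repr(v) for v in values) + "]"


# The cast in the SQL text picks the target type; the literal is the same
# for vector and halfvec. The table and column names follow the example above.
INSERT_SQL = "INSERT INTO items (embedding) VALUES (%s::halfvec)"
params = (to_vector_literal([0.1, 0.2, 0.3]),)

if __name__ == "__main__":
    print(INSERT_SQL, params)
```

Passing `INSERT_SQL` and `params` to `cursor.execute` keeps the value parameterized while making the target type explicit.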

Tutorial: Comparing vector vs halfvec storage

This tutorial demonstrates the storage difference between vector and halfvec columns.

Prerequisites

Docker installed and running
A terminal

Step 1: Start PostgreSQL with pgvector

docker run -d \
  --name postgres-halfvec \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  pgvector/pgvector:pg16

Step 2: Connect and enable pgvector

docker exec -it postgres-halfvec psql -U postgres

CREATE EXTENSION IF NOT EXISTS vector;

Step 3: Create tables with both types

-- Standard 32-bit vectors
CREATE TABLE items_vector (
  id BIGSERIAL PRIMARY KEY,
  embedding vector(1536)
);

-- Half-precision 16-bit vectors
CREATE TABLE items_halfvec (
  id BIGSERIAL PRIMARY KEY,
  embedding halfvec(1536)
);

Step 4: Insert identical data into both tables

-- Insert 10,000 random vectors into items_vector.
-- The reference to g.i ties the subquery to the outer row, so PostgreSQL
-- re-evaluates it (and random()) for every row instead of only once.
INSERT INTO items_vector (embedding)
SELECT
  ('[' || array_to_string(ARRAY(
    SELECT (random())::float4
    FROM generate_series(1, 1536)
    WHERE g.i IS NOT NULL
  ), ',') || ']')::vector(1536)
FROM generate_series(1, 10000) AS g(i);

-- Copy the same rows into items_halfvec, rounding each value to 16 bits
INSERT INTO items_halfvec (id, embedding)
SELECT id, embedding::halfvec(1536)
FROM items_vector;

Step 5: Compare table sizes

SELECT
  relname AS table_name,
  pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
  pg_size_pretty(pg_table_size(relid)) AS table_size
FROM pg_catalog.pg_statio_user_tables
WHERE relname LIKE 'items_%'
ORDER BY relname;

Values this large are stored out of line in TOAST, so table_size uses pg_table_size, which counts the heap plus TOAST but not indexes. You should see items_halfvec using roughly half the storage of items_vector.

Step 6: Create indexes and compare

-- HNSW index on vector column
CREATE INDEX items_vector_idx ON items_vector
USING hnsw (embedding vector_cosine_ops);

-- HNSW index on halfvec column (note: different operator class)
CREATE INDEX items_halfvec_idx ON items_halfvec
USING hnsw (embedding halfvec_cosine_ops);

Check index sizes:

SELECT
  indexrelname AS index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_catalog.pg_stat_user_indexes
WHERE indexrelname LIKE 'items_%'
ORDER BY indexrelname;

Step 7: Verify search works correctly

-- Search on vector table
SELECT id, embedding <=> (SELECT embedding FROM items_vector LIMIT 1) AS distance
FROM items_vector
ORDER BY distance
LIMIT 5;

-- Search on halfvec table
SELECT id, embedding <=> (SELECT embedding FROM items_halfvec LIMIT 1) AS distance
FROM items_halfvec
ORDER BY distance
LIMIT 5;

Both queries should return results with similar distances and query times.

Step 8: Compare query performance

-- Time a search on the vector table
EXPLAIN ANALYZE
SELECT id FROM items_vector
ORDER BY embedding <=> (SELECT embedding FROM items_vector WHERE id = 1)
LIMIT 10;

-- Time a search on the halfvec table
EXPLAIN ANALYZE
SELECT id FROM items_halfvec
ORDER BY embedding <=> (SELECT embedding FROM items_halfvec WHERE id = 1)
LIMIT 10;

Query times should be similar or slightly faster for halfvec.

Migrating existing tables

To convert an existing vector column to halfvec:

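-- Indexes built with vector_*_ops operator classes should be dropped first
-- and recreated afterward with the matching halfvec_*_ops class.
-- Change column type (this rewrites the table)
ALTER TABLE items
  ALTER COLUMN embedding TYPE halfvec(1536)
  USING embedding::halfvec(1536);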

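For large tables, this can take a while: the statement rewrites every row and holds an ACCESS EXCLUSIVE lock, which blocks reads and writes until it finishes. To shorten the lock, consider: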

Creating a new table with halfvec
Copying data in batches
Swapping tables with a rename

-- Create new table
CREATE TABLE items_new (
  id BIGSERIAL PRIMARY KEY,
  embedding halfvec(1536)
);

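-- Copy data (do this in batches for large tables)
INSERT INTO items_new (id, embedding)
SELECT id, embedding::halfvec(1536) FROM items;

-- Explicit ids do not advance the new sequence, so move it past the copied rows
SELECT setval(pg_get_serial_sequence('items_new', 'id'), (SELECT max(id) FROM items_new));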

-- Swap tables
ALTER TABLE items RENAME TO items_old;
ALTER TABLE items_new RENAME TO items;

-- After verifying, drop the old table
DROP TABLE items_old;
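
The "copy in batches" step can be driven from application code. The sketch below splits an id range into fixed-size chunks and runs one INSERT ... SELECT per chunk. It assumes a psycopg2-style DB-API connection and the `items` / `items_new` tables from the example; the function names and the default batch size are illustrative, and rows written to `items` during the copy would still need separate handling.

```python
def id_batches(min_id: int, max_id: int, batch_size: int):
    """Yield inclusive (low, high) id ranges covering min_id..max_id."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


# One batch of the copy; the id bounds are passed as query parameters.
COPY_SQL = (
    "INSERT INTO items_new (id, embedding) "
    "SELECT id, embedding::halfvec(1536) FROM items "
    "WHERE id BETWEEN %s AND %s"
)


def copy_in_batches(conn, min_id, max_id, batch_size=10_000):
    """Copy rows batch by batch, committing after each one.

    `conn` is assumed to be a DB-API connection (psycopg2-style).
    Committing per batch keeps each transaction short.
    """
    for low, high in id_batches(min_id, max_id, batch_size):
        with conn.cursor() as cur:
            cur.execute(COPY_SQL, (low, high))
        conn.commit()


if __name__ == "__main__":
    print(list(id_batches(1, 25, 10)))
```

Gaps in the id sequence are harmless here: a range with no matching rows simply inserts nothing.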

Operator classes for halfvec

When creating indexes on halfvec columns, use the halfvec-specific operator classes:

| Distance | vector operator class | halfvec operator class |
| --- | --- | --- |
| Cosine | `vector_cosine_ops` | `halfvec_cosine_ops` |
| L2/Euclidean | `vector_l2_ops` | `halfvec_l2_ops` |
| Inner product | `vector_ip_ops` | `halfvec_ip_ops` |

-- Correct
CREATE INDEX ON items USING hnsw (embedding halfvec_cosine_ops);

-- Wrong (will error)
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);
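
If index DDL is generated from code (for example in a migration script), deriving the operator class from the column type avoids the mismatch shown above. This is a minimal sketch: the helper names are hypothetical, it covers only the three distances from the table, and the identifiers are interpolated directly, so it should only ever receive trusted names.

```python
# Suffix shared by the vector_* and halfvec_* operator classes.
OPCLASS_SUFFIX = {
    "cosine": "cosine_ops",
    "l2": "l2_ops",
    "inner_product": "ip_ops",
}


def operator_class(column_type: str, distance: str) -> str:
    """Return the operator class name for a column type and distance."""
    if column_type not in ("vector", "halfvec"):
        raise ValueError(f"unsupported column type: {column_type}")
    if distance not in OPCLASS_SUFFIX:
        raise ValueError(f"unsupported distance: {distance}")
    return f"{column_type}_{OPCLASS_SUFFIX[distance]}"


def create_index_sql(table: str, column: str, column_type: str,
                     distance: str, method: str = "hnsw") -> str:
    """Build a CREATE INDEX statement; identifiers must be trusted input."""
    opclass = operator_class(column_type, distance)
    return f"CREATE INDEX ON {table} USING {method} ({column} {opclass});"


if __name__ == "__main__":
    print(create_index_sql("items", "embedding", "halfvec", "cosine"))
```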

Wrapping up

halfvec is a straightforward optimization for most pgvector deployments:

50% storage reduction with minimal impact on search quality
Automatic conversion from standard vectors—no application changes needed
Use halfvec_*_ops operator classes when creating indexes

For new projects, consider starting with halfvec as the default. For existing tables, migrate during a maintenance window using the batch copy approach to avoid extended locks.

Cleanup

docker stop postgres-halfvec
docker rm postgres-halfvec