Using pgvector halfvec in Python

Use pgvector's halfvec type in Python with psycopg2, psycopg3, and SQLAlchemy — save 50% storage with minimal accuracy loss.

This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own Postgres database and you can create interactive applications to share with other developers, coworkers, or even your customers.

pgvector’s halfvec type stores vectors in 16-bit floats instead of 32-bit, cutting storage and index size in half with negligible impact on search quality. The Python ecosystem supports halfvec through the pgvector Python package created by Andrew Kane, which works with psycopg2, psycopg3, and SQLAlchemy. This guide covers how to use halfvec from Python with each of these drivers.

In this post, we’ll cover:

Storage savings - Real benchmarks comparing vector vs halfvec
psycopg2 integration - Using HalfVector with the classic driver
psycopg3 integration - Using HalfVector with the modern driver
SQLAlchemy integration - Defining halfvec columns in ORM models
Precision considerations - What 16-bit floats mean for your data

Why halfvec saves you half your storage

A vector(1536) column (the dimension OpenAI’s text-embedding-ada-002 produces) takes ~6KB per row. With halfvec(1536), that drops to ~3KB. In practice, this translates to roughly 50% savings on both table and index storage. Here’s what we measured with 10,000 1536-dimensional vectors:

	vector	halfvec	Savings
Table size	80 MB	40 MB	50%
HNSW index size	78 MB	39 MB	50%
Index build time	9.1s	5.5s	40%

The savings compound at scale. Extrapolating from the measurements above, at 10 million vectors you’re saving ~40GB of table storage, plus a similar amount of index storage, which in turn means less RAM is needed to keep the index cached. For a deeper look at when and why to use halfvec, see our Half-Precision Vectors with pgvector post.
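
As a rough cross-check on these numbers, the pgvector documentation gives the storage for a single value as 4 × dimensions + 8 bytes for vector and 2 × dimensions + 8 bytes for halfvec. The sketch below applies those two formulas. The function names are illustrative, and the result covers only the raw vector values, so measured table sizes (which also include row and page overhead) will be larger.

```python
def vector_bytes(dims: int) -> int:
    """Bytes for one `vector` value: 4 bytes per dimension plus 8 bytes of header."""
    return 4 * dims + 8


def halfvec_bytes(dims: int) -> int:
    """Bytes for one `halfvec` value: 2 bytes per dimension plus 8 bytes of header."""
    return 2 * dims + 8


def payload_savings(dims: int, rows: int) -> tuple[int, float]:
    """Return (bytes saved, fraction saved) for the raw vector values only."""
    saved = (vector_bytes(dims) - halfvec_bytes(dims)) * rows
    fraction = saved / (vector_bytes(dims) * rows)
    return saved, fraction


saved, fraction = payload_savings(dims=1536, rows=10_000)
print(vector_bytes(1536))   # 6152
print(halfvec_bytes(1536))  # 3080
print(f"saved {saved / 1_000_000:.1f} MB of raw values ({fraction:.1%})")
```

The fixed 8-byte header is why the per-value saving is slightly under 50% rather than exactly half.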

Setup

Install the pgvector Python package alongside your preferred PostgreSQL driver:

# With psycopg2
pip install pgvector psycopg2-binary

# With psycopg3
pip install pgvector "psycopg[binary]"

# With SQLAlchemy
pip install pgvector sqlalchemy psycopg2-binary

Then create a halfvec column in PostgreSQL:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
  id BIGSERIAL PRIMARY KEY,
  embedding halfvec(1536)
);

CREATE INDEX ON items
USING hnsw (embedding halfvec_cosine_ops);

Note the operator class: halfvec_cosine_ops instead of vector_cosine_ops. This is a common mistake: creating an index on a halfvec column with a vector_* operator class fails with an error. The halfvec operator classes for the most common distance metrics:

Distance	Operator Class
Cosine	`halfvec_cosine_ops`
L2/Euclidean	`halfvec_l2_ops`
Inner product	`halfvec_ip_ops`
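
One way to avoid mixing up operator classes is to keep the mapping in one place. The sketch below is a hypothetical helper (the names HALFVEC_OPS and hnsw_index_sql are not part of pgvector or its Python package). It pairs each operator class from the table with pgvector's corresponding distance operator and builds the CREATE INDEX statement as a string.

```python
# Operator class and query operator for each distance metric on a halfvec column.
HALFVEC_OPS = {
    "cosine": ("halfvec_cosine_ops", "<=>"),
    "l2": ("halfvec_l2_ops", "<->"),
    # <#> returns the negative inner product, so ascending order is still "best first".
    "inner_product": ("halfvec_ip_ops", "<#>"),
}


def hnsw_index_sql(table: str, column: str, distance: str) -> str:
    """Build a CREATE INDEX statement for an HNSW index on a halfvec column.

    Table and column names are interpolated directly into the SQL text,
    so only pass trusted identifiers.
    """
    if distance not in HALFVEC_OPS:
        raise ValueError(
            f"unknown distance {distance!r}; expected one of {sorted(HALFVEC_OPS)}"
        )
    opclass, _operator = HALFVEC_OPS[distance]
    return f"CREATE INDEX ON {table} USING hnsw ({column} {opclass});"


print(hnsw_index_sql("items", "embedding", "cosine"))
# CREATE INDEX ON items USING hnsw (embedding halfvec_cosine_ops);
```

Keeping the query operator next to the operator class also helps ensure that the ORDER BY clause uses the same metric the index was built for.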

Using halfvec with psycopg2

The pgvector package provides a HalfVector class and a register_vector function that teaches psycopg2 how to serialize and deserialize halfvec values.

import psycopg2
from pgvector.psycopg2 import register_vector, HalfVector

conn = psycopg2.connect("postgresql://localhost/mydb")
register_vector(conn)

cur = conn.cursor()

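# Insert using HalfVector.
# Three values are shown for brevity; the number of values must match the
# column's declared dimension (1536 for the halfvec(1536) column created above).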
embedding = HalfVector([0.1, 0.2, 0.3])
cur.execute(
    "INSERT INTO items (embedding) VALUES (%s)",
    (embedding,)
)

# Insert from a plain Python list (cast in SQL)
embedding_list = [0.1, 0.2, 0.3]
cur.execute(
    "INSERT INTO items (embedding) VALUES (%s::halfvec)",
    (str(embedding_list).replace(' ', ''),)
)

conn.commit()
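
The string-cast variant above relies on str() producing the bracketed text format that pgvector parses. A slightly more defensive sketch is shown below. Here to_halfvec_literal is a hypothetical helper, not part of the pgvector package, and its dimension check simply mirrors the fact that a halfvec(n) column only accepts values with exactly n dimensions.

```python
import math
from typing import Sequence


def to_halfvec_literal(values: Sequence[float], expected_dims: int) -> str:
    """Format numbers as a pgvector text literal such as '[0.1,0.2,0.3]'."""
    if len(values) != expected_dims:
        raise ValueError(f"expected {expected_dims} dimensions, got {len(values)}")
    floats = [float(v) for v in values]
    # Reject NaN and infinity up front instead of sending them to the database.
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("all values must be finite")
    return "[" + ",".join(repr(v) for v in floats) + "]"


literal = to_halfvec_literal([0.1, 0.2, 0.3], expected_dims=3)
print(literal)  # [0.1,0.2,0.3]

# The literal is then passed as an ordinary query parameter, for example:
# cur.execute("INSERT INTO items (embedding) VALUES (%s::halfvec)", (literal,))
```

Failing early in Python gives a clearer error than a rejected INSERT, which matters most when inserting in bulk.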

When you read halfvec values back, they come back as HalfVector objects:

cur.execute("SELECT id, embedding FROM items ORDER BY id LIMIT 3")
for row in cur.fetchall():
    print(row[0], row[1], type(row[1]))
    # 1 HalfVector([0.0999755859375, 0.199951171875, 0.300048828125]) <class 'pgvector.halfvec.HalfVector'>

Notice the slight precision loss—0.1 becomes 0.0999755859375. This is expected behavior with 16-bit floats and doesn’t meaningfully affect similarity search results.

Searching with psycopg2

query = HalfVector([0.1, 0.2, 0.3])
cur.execute("""
    SELECT id, embedding <=> %s AS distance
    FROM items
    ORDER BY embedding <=> %s
    LIMIT 10
""", (query, query))

results = cur.fetchall()
for r in results:
    print(f"id={r[0]}, distance={r[1]}")
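
The <=> operator returns cosine distance, so smaller values mean closer matches. If you would rather report a similarity score, it is 1 minus the distance. The pure-Python sketch below mirrors that definition to make the numbers returned by the query easier to interpret. It is only a reference for the math, not how pgvector computes the distance internally, and the function names are illustrative.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def distance_to_similarity(distance: float) -> float:
    """Convert a cosine distance (0 to 2) into a cosine similarity (1 to -1)."""
    return 1.0 - distance


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # 0.0 (same direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0 (orthogonal)
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # 2.0 (opposite direction)
```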

Using halfvec with psycopg3 (psycopg)

The psycopg3 integration works the same way, with the import path changed to pgvector.psycopg:

import psycopg
from pgvector.psycopg import register_vector, HalfVector

conn = psycopg.connect("postgresql://localhost/mydb")
register_vector(conn)

cur = conn.cursor()

# Insert
embedding = HalfVector([0.1, 0.2, 0.3])
cur.execute(
    "INSERT INTO items (embedding) VALUES (%s)",
    (embedding,)
)
conn.commit()

# Search
query = HalfVector([0.1, 0.2, 0.3])
cur.execute("""
    SELECT id, embedding <=> %s AS distance
    FROM items
    ORDER BY embedding <=> %s
    LIMIT 10
""", (query, query))

for row in cur.fetchall():
    print(f"id={row[0]}, distance={row[1]}")

Using halfvec with SQLAlchemy

For SQLAlchemy, use the HALFVEC column type from pgvector.sqlalchemy:

from sqlalchemy import create_engine, Column, Integer
from sqlalchemy.orm import declarative_base, Session
from pgvector.sqlalchemy import HALFVEC

Base = declarative_base()

class Item(Base):
    __tablename__ = 'items'
    id = Column(Integer, primary_key=True, autoincrement=True)
    embedding = Column(HALFVEC(1536))

engine = create_engine("postgresql+psycopg2://localhost/mydb")
Base.metadata.create_all(engine)

Inserting and querying work with plain Python lists; SQLAlchemy handles the conversion:

with Session(engine) as session:
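    # Insert with a plain list (the trailing ... stands for the remaining
    # values; supply all 1536 to match the column's dimension)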
    item = Item(embedding=[0.1, 0.2, 0.3, ...])
    session.add(item)
    session.commit()

    # Search using cosine distance
    query_vec = [0.1, 0.2, 0.3, ...]
    results = session.query(
        Item.id,
        Item.embedding.cosine_distance(query_vec).label('distance')
    ).order_by('distance').limit(10).all()

    for r in results:
        print(f"id={r.id}, distance={r.distance}")

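Other distance methods available on the column: l2_distance(), max_inner_product(). Note that max_inner_product() maps to pgvector's <#> operator, which returns the negative inner product, so ordering ascending still puts the best matches first.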

Working with numpy

Most embedding models return numpy arrays. Converting to HalfVector is straightforward:

import numpy as np
from pgvector.psycopg2 import HalfVector

# From a float32 embedding (typical model output)
embedding_f32 = np.array([0.1, 0.2, 0.3], dtype=np.float32)
hv = HalfVector(embedding_f32.tolist())

# From a float16 embedding (if your model supports it)
embedding_f16 = np.array([0.1, 0.2, 0.3], dtype=np.float16)
hv = HalfVector(embedding_f16.tolist())

The conversion from float32 to float16 happens on the PostgreSQL side when you insert into a halfvec column—you don’t need to pre-convert in Python. But if you’re storing vectors in your application before inserting, using np.float16 arrays saves memory on the application side too.

Precision considerations

16-bit floats (IEEE 754 half-precision) have about 3 decimal digits of precision, a smallest positive normal value of ~6×10⁻⁵, and a maximum value of 65,504. For normalized embeddings, which most models produce, every component lies between -1 and 1, so this range and precision are more than sufficient.
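
Python's standard struct module can pack IEEE 754 half-precision values with the 'e' format code, which makes it easy to check these limits without a database. The sketch below is for illustration only: to_half and fits_in_half are hypothetical helper names, and this is not the conversion code that pgvector itself runs.

```python
import struct

HALF_MAX = 65504.0            # largest finite half-precision value
HALF_MIN_NORMAL = 2.0 ** -14  # smallest positive normal value, about 6.1e-05


def to_half(x: float) -> float:
    """Round x to the nearest half-precision value and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fits_in_half(x: float) -> bool:
    """Return True if x can be packed as a finite half-precision value."""
    try:
        struct.pack("<e", x)
    except (OverflowError, struct.error):
        return False
    return True


print(to_half(0.1))           # 0.0999755859375
print(to_half(1.0004))        # 1.0 (neighbouring values near 1.0 are 2**-10 apart)
print(to_half(HALF_MAX))      # 65504.0
print(fits_in_half(70000.0))  # False
```

Unnormalized embeddings with very large components are the case to watch for, since they can exceed the representable range.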

Where you might notice the difference:

# float32: 0.1
# float16: 0.0999755859375
# Difference: 0.0000244140625

This precision loss is negligible for cosine similarity. Benchmarks from Neon show less than 1% recall difference between vector and halfvec across standard datasets.
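
To get a feel for why the effect on cosine similarity is small, the sketch below rounds every component of a random unit vector to half precision and compares the result with the original. It uses the struct module's 'e' format as a stand-in for the halfvec conversion, which is an assumption. The vector is synthetic, so this illustrates the size of the rounding error rather than reproducing any benchmark.

```python
import math
import random
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Build a synthetic 1536-dimensional unit vector (seeded for repeatability).
rng = random.Random(42)
raw = [rng.gauss(0.0, 1.0) for _ in range(1536)]
norm = math.sqrt(sum(x * x for x in raw))
unit = [x / norm for x in raw]

# Simulate storing the vector in a halfvec column.
rounded = [to_half(x) for x in unit]

similarity = cosine_similarity(unit, rounded)
max_abs_error = max(abs(u - r) for u, r in zip(unit, rounded))
print(f"cosine similarity after rounding: {similarity:.8f}")
print(f"largest per-component error: {max_abs_error:.2e}")
```

Because the rounding error on each component is tiny relative to the component itself, the rounded vector points in almost exactly the same direction as the original.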

Wrapping up

The halfvec type is a straightforward way to cut your pgvector storage in half:

Use HalfVector from the pgvector Python package for psycopg2 and psycopg3
Use HALFVEC(dim) column type for SQLAlchemy models
Use halfvec_cosine_ops (not vector_cosine_ops) when creating indexes
With SQLAlchemy, plain Python lists convert automatically; with psycopg2 and psycopg3, wrap values in HalfVector or cast to halfvec in SQL

For new projects, halfvec is a good default. The 50% storage savings compound into faster queries, smaller indexes, and lower infrastructure costs with no meaningful impact on search quality.