Top PostgreSQL Extensions - QueryPlane Blog

Discover the most impactful PostgreSQL extensions — pgvector, PostGIS, TimescaleDB, pg_cron, Citus, and more with use cases for each.

This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own postgres db and you can create interactive applications to share with other developers, coworkers or even your customers. If you’re interested in trying it out, get started here.

PostgreSQL’s extension system lets you add capabilities without switching databases. Extensions can add new data types, functions, operators, and even index types. This post covers the most impactful PostgreSQL extensions and when to use them.

In this post, we’ll cover:

pgvector - Vector similarity search for AI/ML
PostGIS - Geospatial data and queries
TimescaleDB - Time-series optimization
pg_cron - Job scheduling in the database
pg_stat_statements - Query performance insights
Citus - Distributed PostgreSQL

pgvector

pgvector adds vector similarity search to PostgreSQL, enabling AI and machine learning applications to store and query embeddings directly in the database.

The extension provides a vector data type for storing embeddings and operators for calculating distances (L2, cosine, inner product). It includes two index types: IVFFlat for faster builds and HNSW for better query performance.

CREATE EXTENSION vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(1536)
);

-- Create an HNSW index for cosine similarity
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Find similar documents
SELECT content, embedding <=> $1::vector AS distance
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 5;

pgvector is particularly valuable because it keeps vectors alongside your relational data. You can join similarity search results with user tables, filter by metadata before searching, and maintain ACID guarantees across your entire dataset.

Most managed PostgreSQL providers now support pgvector, including AWS RDS, Google Cloud SQL, Azure, Neon, and Supabase. For AI applications that need vector search without adding a separate vector database, pgvector is often the simplest solution.

PostGIS

PostGIS transforms PostgreSQL into a spatial database. It adds geometry and geography types, spatial functions, and spatial indexes that enable location-based applications.

The extension supports points, lines, polygons, and complex geometries. You can calculate distances, find intersections, compute areas, and perform dozens of other spatial operations directly in SQL.

CREATE EXTENSION postgis;

CREATE TABLE locations (
  id SERIAL PRIMARY KEY,
  name TEXT,
  coordinates GEOGRAPHY(POINT, 4326)
);

-- Create a spatial index
CREATE INDEX ON locations USING GIST (coordinates);

-- Find locations within 5km of a point
SELECT name, ST_Distance(
  coordinates,
  ST_MakePoint(-122.4194, 37.7749)::geography
) AS distance
FROM locations
WHERE ST_DWithin(
  coordinates,
  ST_MakePoint(-122.4194, 37.7749)::geography,
  5000  -- 5km in meters
)
ORDER BY distance;

PostGIS is the standard for geospatial data in PostgreSQL. It’s used in mapping applications, logistics systems, real estate platforms, and anywhere location matters. The spatial indexes (GIST and SP-GIST) make queries fast even with millions of geometries.

The extension also handles coordinate reference systems, raster data, topology, and 3D geometries. For specialized use cases like routing, there’s pgRouting which builds on PostGIS.

TimescaleDB

TimescaleDB optimizes PostgreSQL for time-series data. It automatically partitions tables by time (called hypertables), enabling fast inserts and queries on time-bounded data.

The extension handles the complexity of partitioning transparently. You create a hypertable from a regular table, and TimescaleDB manages the underlying chunks automatically. Queries that filter by time only scan relevant chunks, improving performance dramatically.

CREATE EXTENSION timescaledb;

CREATE TABLE metrics (
  time TIMESTAMPTZ NOT NULL,
  device_id INTEGER,
  temperature DOUBLE PRECISION,
  humidity DOUBLE PRECISION
);

-- Convert to hypertable, partitioned by week
SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '7 days');

-- Insert data normally
INSERT INTO metrics VALUES (NOW(), 1, 22.5, 45.0);

-- Time-based queries are fast
SELECT time_bucket('1 hour', time) AS hour,
       AVG(temperature) AS avg_temp
FROM metrics
WHERE time > NOW() - INTERVAL '7 days'
GROUP BY hour
ORDER BY hour;

TimescaleDB includes compression that can reduce storage by 90%+ for time-series data. It also provides continuous aggregates (pre-computed rollups), data retention policies, and real-time analytics functions.

For IoT data, application metrics, financial data, or any time-ordered workload, TimescaleDB provides significant performance improvements over unpartitioned PostgreSQL tables.

pg_cron

pg_cron runs scheduled jobs inside the database using familiar cron syntax. It’s useful for maintenance tasks, data aggregation, and periodic cleanup without external scheduling systems.

CREATE EXTENSION pg_cron;

-- Delete old sessions every hour
SELECT cron.schedule('0 * * * *', $$DELETE FROM sessions WHERE created_at < NOW() - INTERVAL '30 days'$$);

-- Refresh a materialized view daily at 3am
SELECT cron.schedule('0 3 * * *', $$REFRESH MATERIALIZED VIEW daily_stats$$);

-- Run vacuum on a large table weekly
SELECT cron.schedule('0 4 * * 0', $$VACUUM ANALYZE large_table$$);

-- List scheduled jobs
SELECT * FROM cron.job;

pg_cron jobs run in the database context, so they can access all your tables and functions. This is simpler than setting up external cron jobs or scheduled Lambda functions for database maintenance.

The extension is available on AWS RDS, Google Cloud SQL, Azure, and other managed providers. Jobs run on a single designated database, so in a replicated setup, they only execute on the primary.

See what QueryPlane can build for you

Connect to your database, write SQL with AI, and build shareable apps — all from your browser.

Get Started Book a Demo

pg_stat_statements

pg_stat_statements tracks execution statistics for all SQL statements. It’s essential for identifying slow queries, finding optimization opportunities, and understanding database workload.

CREATE EXTENSION pg_stat_statements;

-- Find slowest queries by total time
SELECT query,
       calls,
       total_exec_time / 1000 AS total_seconds,
       mean_exec_time AS avg_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Find queries with worst hit ratio (reading from disk)
SELECT query,
       calls,
       shared_blks_hit,
       shared_blks_read,
       ROUND(shared_blks_hit::numeric / NULLIF(shared_blks_hit + shared_blks_read, 0) * 100, 2) AS hit_ratio
FROM pg_stat_statements
WHERE calls > 100
ORDER BY hit_ratio ASC
LIMIT 10;

The extension normalizes queries by replacing literal values with parameters, so SELECT * FROM users WHERE id = 1 and SELECT * FROM users WHERE id = 2 are grouped together. This gives you accurate statistics across all executions of the same query pattern.

pg_stat_statements is a built-in extension (contrib module) and is available on virtually all PostgreSQL deployments. Most monitoring tools (pganalyze, Datadog, etc.) rely on it for query performance insights. Enabling it has minimal performance overhead and should be standard for any production database.

Citus

Citus distributes PostgreSQL across multiple nodes, enabling horizontal scaling for both multi-tenant applications and real-time analytics.

The extension shards tables across worker nodes by a distribution column. For multi-tenant apps, you typically distribute by tenant_id, co-locating each tenant’s data on a single node. For analytics, you might distribute by a time column.

CREATE EXTENSION citus;

-- Add worker nodes
SELECT citus_add_node('worker1', 5432);
SELECT citus_add_node('worker2', 5432);

-- Distribute a table by tenant_id
SELECT create_distributed_table('orders', 'tenant_id');

-- Queries that filter by tenant_id run on a single node
SELECT * FROM orders WHERE tenant_id = 123;

-- Cross-tenant queries parallelize across nodes
SELECT DATE_TRUNC('day', created_at), COUNT(*)
FROM orders
GROUP BY 1;

Citus handles query routing automatically. Queries that include the distribution column filter run on a single shard. Queries that need data from multiple shards parallelize across workers and aggregate results.

The extension is particularly powerful for multi-tenant SaaS applications. By distributing on tenant_id and using reference tables for shared data, you can scale to thousands of tenants while keeping PostgreSQL’s familiar query interface.

Citus is available as a managed service (Azure Cosmos DB for PostgreSQL, Citus Cloud) or can be self-hosted. The extension is open source under AGPL.

Other notable extensions

uuid-ossp and pgcrypto: UUID generation and cryptographic functions. Most applications need at least one of these.

CREATE EXTENSION "uuid-ossp";
SELECT uuid_generate_v4();

hstore: Key-value store within a column. Useful for flexible metadata before reaching for JSONB.

pg_trgm: Trigram-based text similarity and fuzzy matching. Enables fast LIKE queries and typo-tolerant search.

ltree: Hierarchical data using label trees. Efficient for category trees, organizational charts, and threaded comments.

postgres_fdw: Query other PostgreSQL databases as foreign tables. Useful for federation and gradual migrations.

Choosing extensions

Extension	Use case
pgvector	AI/ML embeddings, similarity search
PostGIS	Geospatial data, location queries
TimescaleDB	Time-series data, IoT, metrics
pg_cron	Scheduled jobs, maintenance tasks
pg_stat_statements	Query performance monitoring
Citus	Horizontal scaling, multi-tenant apps

Wrapping up

Extensions are one of PostgreSQL’s greatest strengths. Instead of switching to a specialized database for vectors, geospatial, or time-series data, you can often add an extension and keep everything in PostgreSQL.

Before adopting an extension, check that it’s supported by your hosting provider (managed services vary in extension support) and consider the operational implications. Most of the extensions above are mature, widely deployed, and safe to use in production.

The right combination of extensions can handle surprisingly diverse workloads in a single database, simplifying your architecture and keeping your data together.