Menu
Blog Documentation Community Pricing Demo Call Sign Up
Sign Up

Choosing the Right Primary Key Strategy in PostgreSQL

Learn when to use natural keys, surrogate keys, and composite keys in PostgreSQL, with practical examples for each strategy.

Postgres

This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own postgres db and you can create interactive applications to share with other developers, coworkers or even your customers. If you’re interested in trying it out, get started here.


Every table needs a primary key, but not every table needs the same kind. This post helps you choose between natural keys, surrogate keys, and composite keys—and when each makes sense.

Natural Keys: Real-World Identifiers

A natural key uses existing data that uniquely identifies a row. Instead of generating a synthetic identifier, you rely on something that already exists in the real world:

CREATE TABLE countries (
  iso_code CHAR(2) PRIMARY KEY,  -- Natural key: ISO 3166-1 alpha-2
  name TEXT NOT NULL,
  population BIGINT
);

CREATE TABLE users (
  email TEXT PRIMARY KEY,  -- Natural key: email address
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

Natural keys work well when the identifier is truly immutable and comes from an authoritative external source. ISO country codes are a good example—they’re standardized, stable, and universally recognized. When you see US or CA, you immediately know what country is being referenced without needing to look up an arbitrary ID.

However, natural keys can cause serious problems when the underlying data changes. Consider using email as a primary key:

CREATE TABLE users (
  email TEXT PRIMARY KEY,
  name TEXT NOT NULL
);

CREATE TABLE orders (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_email TEXT REFERENCES users(email),
  total_cents BIGINT NOT NULL
);

-- This seems fine until a user wants to change their email
UPDATE users SET email = 'new@example.com' WHERE email = 'old@example.com';
-- ERROR: update or delete on table "users" violates foreign key constraint

Now you’re stuck. Every table that references this user by email needs to be updated, and if you have cascading deletes disabled (which you probably should), this becomes a manual, error-prone process.

Beyond mutability, natural keys have other pitfalls. They can be long and complex, making indexes larger and joins slower. Using something like a Social Security Number or email as a key means that sensitive data appears in foreign keys throughout your schema, creating privacy concerns. And if you’re relying on an external system’s format—like a third-party API’s identifier—you’re at the mercy of that system’s stability.

When to use natural keys: Reserve them for truly stable, authoritative identifiers like ISO codes, or for small lookup tables where the natural identifier is the most meaningful way to reference rows. For most application tables, you’re better off with a surrogate key.

Surrogate Keys: Synthetic Identifiers

A surrogate key is a meaningless identifier created solely to identify rows. It has no business meaning—it’s just a unique number that your database generates:

CREATE TABLE users (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,  -- Uniqueness enforced, but not the PK
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE orders (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id BIGINT NOT NULL REFERENCES users(id),  -- Simple, stable FK
  total_cents BIGINT NOT NULL
);

The key insight here is that surrogate keys separate the concept of identity from the data itself. The id column will never change, regardless of what happens to the user’s email, name, or any other attribute. This makes foreign key relationships stable and simple—you’re always joining on a single, compact integer column.

Surrogate keys also keep business data out of your keys. When you’re debugging a production issue and see user_id = 12345 in your logs, there’s no risk of accidentally exposing someone’s email address. The ID is just a number.

The main criticism of surrogate keys is that they add an extra column that takes up space and “means nothing.” In practice, the space overhead is negligible (8 bytes per row for a BIGINT), and the lack of meaning is actually a feature—it’s what makes the key stable.

However, surrogate keys require discipline. Because the ID doesn’t inherently represent anything, you can accidentally insert duplicate records if you forget to add uniqueness constraints on the natural identifiers:

-- Surrogate key without proper constraints
CREATE TABLE countries (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  iso_code CHAR(2),  -- Oops, no UNIQUE constraint
  name TEXT
);

-- Now you can insert the same country twice
INSERT INTO countries (iso_code, name) VALUES ('US', 'United States');
INSERT INTO countries (iso_code, name) VALUES ('US', 'USA');  -- Duplicate!

The solution is simple: always add unique constraints on your natural identifiers, even when using a surrogate primary key:

CREATE TABLE countries (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  iso_code CHAR(2) NOT NULL UNIQUE,  -- Surrogate PK + natural key constraint
  name TEXT NOT NULL
);

When to use surrogate keys: For the vast majority of tables. If you need distributed ID generation across multiple databases or client-side generation, use UUID v7 or ULID instead of BIGINT. If the ID will be exposed to users in URLs or APIs, consider adding a separate public_id column with a more opaque format, or use UUID/ULID as the primary key directly.

Composite Keys: Multiple Columns

Composite keys use multiple columns together as the primary key. The combination of values uniquely identifies a row, even though no single column does:

-- Junction table for many-to-many relationship
CREATE TABLE user_roles (
  user_id BIGINT REFERENCES users(id),
  role_id BIGINT REFERENCES roles(id),
  granted_at TIMESTAMPTZ DEFAULT NOW(),
  PRIMARY KEY (user_id, role_id)
);

-- Time-series data
CREATE TABLE sensor_readings (
  sensor_id BIGINT REFERENCES sensors(id),
  recorded_at TIMESTAMPTZ,
  value NUMERIC NOT NULL,
  PRIMARY KEY (sensor_id, recorded_at)
);

Composite keys make the most sense in junction tables (the tables that implement many-to-many relationships) and time-series data. In these cases, the combination of columns is the natural identity of the row. A user_roles row represents “this user has this role”—the pair (user_id, role_id) is exactly what defines the record.

The challenge with composite keys comes when other tables need to reference them. Foreign keys to composite primary keys require listing all the columns:

CREATE TABLE user_role_permissions (
  user_id BIGINT,
  role_id BIGINT,
  permission_id BIGINT REFERENCES permissions(id),
  PRIMARY KEY (user_id, role_id, permission_id),
  FOREIGN KEY (user_id, role_id) REFERENCES user_roles(user_id, role_id)
);

This gets verbose quickly, and it’s easy to make mistakes. If you find yourself needing to reference a composite-keyed table from multiple places, consider adding a surrogate key while keeping the uniqueness constraint:

CREATE TABLE user_roles (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id BIGINT NOT NULL REFERENCES users(id),
  role_id BIGINT NOT NULL REFERENCES roles(id),
  granted_at TIMESTAMPTZ DEFAULT NOW(),
  UNIQUE (user_id, role_id)  -- Still enforce uniqueness
);

-- Now foreign keys are simpler
CREATE TABLE audit_log (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_role_id BIGINT REFERENCES user_roles(id),  -- Single column
  action TEXT NOT NULL,
  performed_at TIMESTAMPTZ DEFAULT NOW()
);

When to use composite keys: Junction tables that won’t be referenced elsewhere, and time-series tables where the entity-plus-timestamp combination is the natural identity. If the table will be referenced by foreign keys from other tables, add a surrogate key to simplify those references.

See what QueryPlane can build for you

Connect to your database, write SQL with AI, and build shareable apps — all from your browser.

Common Patterns

The Standard Table

Most tables in your application should follow this pattern—a surrogate primary key with unique constraints on any natural identifiers:

CREATE TABLE products (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  sku TEXT NOT NULL UNIQUE,     -- Natural identifier, but not PK
  name TEXT NOT NULL,
  price_cents BIGINT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);

Internal + External IDs

For user-facing entities where you don’t want to expose sequential integers in your API, maintain both an internal ID for database operations and a public ID for external use:

CREATE TABLE orders (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  public_id TEXT NOT NULL UNIQUE DEFAULT 'ord_' || nanoid(12),
  customer_id BIGINT NOT NULL REFERENCES customers(id),
  total_cents BIGINT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX ON orders (public_id);

Internal queries use the efficient integer id, while your API exposes the opaque public_id.

Simple Junction Tables

For many-to-many relationships that don’t need to be referenced elsewhere, a composite key is clean and appropriate:

CREATE TABLE product_categories (
  product_id BIGINT REFERENCES products(id) ON DELETE CASCADE,
  category_id BIGINT REFERENCES categories(id) ON DELETE CASCADE,
  PRIMARY KEY (product_id, category_id)
);

Referenced Junction Tables

When your junction table will be referenced by other tables (like a grades table referencing enrollments), add a surrogate key:

CREATE TABLE enrollments (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  student_id BIGINT NOT NULL REFERENCES students(id),
  course_id BIGINT NOT NULL REFERENCES courses(id),
  enrolled_at TIMESTAMPTZ DEFAULT NOW(),
  UNIQUE (student_id, course_id)
);

CREATE TABLE grades (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  enrollment_id BIGINT NOT NULL REFERENCES enrollments(id),
  assignment_id BIGINT NOT NULL REFERENCES assignments(id),
  score NUMERIC NOT NULL,
  UNIQUE (enrollment_id, assignment_id)
);

Reference/Lookup Tables

For small, stable reference data, you can go either way. Natural keys work well when the identifier is meaningful and stable:

CREATE TABLE currencies (
  code CHAR(3) PRIMARY KEY,  -- ISO 4217
  name TEXT NOT NULL,
  symbol TEXT
);

Or use a surrogate key if you prefer consistency across your schema:

CREATE TABLE currencies (
  id SMALLINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  code CHAR(3) NOT NULL UNIQUE,
  name TEXT NOT NULL,
  symbol TEXT
);

Time-Series Data

Composite keys are natural for time-series data, where each reading is identified by the source and timestamp:

CREATE TABLE metrics (
  metric_name TEXT,
  recorded_at TIMESTAMPTZ,
  value DOUBLE PRECISION NOT NULL,
  PRIMARY KEY (metric_name, recorded_at)
) PARTITION BY RANGE (recorded_at);

Anti-Patterns to Avoid

No primary key at all. Every table needs a way to uniquely identify rows. Without a primary key, you can’t reliably update or delete specific records:

-- Never do this
CREATE TABLE events (
  event_type TEXT,
  payload JSONB,
  created_at TIMESTAMPTZ
);

Composite keys with too many columns. If your primary key has more than two or three columns, it’s probably time to add a surrogate key instead. Five-column composite keys are painful to work with and error-prone:

-- Just add a surrogate key instead
CREATE TABLE order_items (
  order_id BIGINT,
  customer_id BIGINT,
  product_id BIGINT,
  variant_id BIGINT,
  warehouse_id BIGINT,
  PRIMARY KEY (order_id, customer_id, product_id, variant_id, warehouse_id)
);

Surrogate keys without uniqueness constraints. If you use a surrogate key, you still need to enforce uniqueness on the natural identifiers. Otherwise, you’ll end up with duplicate records:

-- Missing UNIQUE constraint allows duplicates
CREATE TABLE users (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email TEXT NOT NULL  -- Should be UNIQUE!
);

Natural keys that change. Using mutable data like email addresses as primary keys creates cascading update nightmares. If the data might ever change, use a surrogate key.

Wrapping Up

For most tables, the answer is straightforward: use a surrogate key (BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY) and add unique constraints on any natural identifiers. This gives you stable foreign keys, simple queries, and the flexibility to handle real-world changes like users updating their email addresses.

Reserve natural keys for truly stable, authoritative identifiers like ISO codes. Use composite keys for junction tables that won’t be referenced elsewhere. And when in doubt, err on the side of a surrogate key—it’s much easier to add a unique constraint later than to migrate away from a natural key that turned out to be mutable.