Menu
Blog Documentation Community Pricing Demo Call Sign Up
Sign Up

An Intro to ClickHouse ReplacingMergeTree

Learn when to use ReplacingMergeTree in ClickHouse, how deduplication actually works, why FINAL matters, and the common schema design mistakes.

ClickHouse

This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own postgres db and you can create interactive applications to share with other developers, coworkers or even your customers. If you’re interested in trying it out, get started here.


ReplacingMergeTree is one of the first ClickHouse features people encounter when they realize ClickHouse does not handle row-by-row updates like a traditional OLTP database. Instead of updating rows in place, a common ClickHouse pattern is to insert newer versions of a row and let background merges keep the latest one.

That sounds simple, but it creates a lot of confusion. Teams expect immediate deduplication, misunderstand what key is used to decide duplicates, or reach for FINAL everywhere and accidentally make queries expensive.

This guide covers the practical model.

In this post, we’ll cover:

  • What ReplacingMergeTree is
  • How deduplication actually works
  • How to choose the right ORDER BY key
  • When to use FINAL
  • Common mistakes

What ReplacingMergeTree does

ReplacingMergeTree is a MergeTree-family engine designed for “replace old version with new version” workloads. Instead of mutating existing rows in place, you insert a new version of the row. During background merges, ClickHouse keeps the latest version and discards older duplicates according to the deduplication key.

That makes it useful for:

  • CDC ingestion
  • last-write-wins entity state
  • upsert-like analytical workflows
  • event streams where newer rows supersede older ones

It is not a magic transactional update system. It is an analytical storage pattern that makes update-like behavior practical in ClickHouse.

The most important concept: deduplication uses ORDER BY

This is the part many teams miss.

In ReplacingMergeTree, deduplication is based on the sorting key defined in ORDER BY, not on some hidden relational primary key concept. ClickHouse’s own data modeling content makes this explicit: the ordering key determines how duplicates are identified for ReplacingMergeTree.

That means your schema design matters a lot. If your ORDER BY does not uniquely identify the logical row you are trying to replace, ReplacingMergeTree will not behave the way you expect.

A simple example

CREATE TABLE orders
(
  order_id UInt64,
  customer_id UInt64,
  status String,
  version UInt64,
  updated_at DateTime
)
ENGINE = ReplacingMergeTree(version)
ORDER BY order_id;

Then write a newer version of a row by inserting again:

INSERT INTO orders VALUES
  (1001, 42, 'pending', 1, now());

INSERT INTO orders VALUES
  (1001, 42, 'paid', 2, now());

Eventually, merges keep the later version for order_id = 1001.

That “eventually” is important.

Deduplication is not always immediate

ReplacingMergeTree relies on background merges. So if you query the table right after inserts, you may still see multiple versions of the same row until merges run.

This is the source of a lot of “ReplacingMergeTree is broken” confusion. The engine is working as designed. The table just has not merged away older versions yet.

If you need query-time deduplication immediately, you can use FINAL:

SELECT *
FROM orders FINAL
WHERE order_id = 1001;

But FINAL should be used carefully because it can make queries more expensive. It is a correctness tool, not something you want to sprinkle over every large analytical query by default.

Choosing the right ORDER BY key

The best design rule is:

  • put the columns that identify the logical row in the deduplication key

For a user profile table, that may be user_id. For an order state table, that may be order_id. For a composite entity, it may be more than one column.

What teams often do wrong is optimize only for query filtering and forget that ReplacingMergeTree also uses the sorting key for deduplication. You need a key that works for both storage efficiency and replacement semantics.

If you need broader query optimization on columns outside the main ordering key, look at ClickHouse projections instead of forcing the wrong deduplication key.

Should you add a version column?

Usually, yes.

A version column makes it explicit which row is newer. Without it, ClickHouse may keep the last inserted row by part order, which is usually not the robustness you want in serious pipelines.

A common pattern is:

ENGINE = ReplacingMergeTree(version)

where version is a monotonic sequence, event timestamp, or CDC version value.

If your source system already provides LSNs, sequence numbers, or monotonically increasing update versions, use them.

When ReplacingMergeTree is a good fit

It works well for:

  • CDC replication from transactional systems
  • latest-state analytical tables
  • dimensions that receive updates over time
  • workloads where inserts are natural and row replacement is eventual

It is less attractive when:

  • you need strict immediate row-level update semantics
  • duplicate visibility between merges is unacceptable
  • the entity key is unclear or unstable

In those cases, reconsider the modeling pattern or use a different ClickHouse engine/workflow.

See what QueryPlane can build for you

Connect to your database, write SQL with AI, and build shareable apps — all from your browser.

Common mistakes

Using the wrong ORDER BY key

This is the biggest one. If ORDER BY does not match the logical uniqueness of the row, deduplication will not reflect your intent.

Expecting merges to behave like transactions

ReplacingMergeTree is eventual with respect to merge-driven deduplication. It is not OLTP semantics.

Using FINAL everywhere

FINAL is useful, but it is not free. Overusing it can erase the performance advantage you were hoping to get from ClickHouse in the first place.

Forgetting that query design still matters

ReplacingMergeTree solves one class of modeling problem. It does not remove the need to think carefully about PARTITION BY, ORDER BY, and the shape of downstream queries.

A good decision rule

Use ReplacingMergeTree when you can honestly say:

  1. Newer rows supersede older rows.
  2. Eventual merge-driven replacement is acceptable.
  3. You can define a stable deduplication key in ORDER BY.

If any of those is false, step back before defaulting to the engine.

Frequently asked questions

What is ReplacingMergeTree in ClickHouse? ReplacingMergeTree is a table engine that deduplicates rows in the background as parts merge, keeping the row with the highest version per ORDER BY key. It is the standard way to model upsert-like behavior in an analytical store that does not natively support row-level updates.

Is ReplacingMergeTree the same as an OLTP upsert? No. The “replace” happens during background merges, not at write time. A query that runs before the next merge will still see both the old and new row unless you explicitly tell it to look only at the latest version (via FINAL, an aggregating query, or argMax). It is upsert-by-eventual-consistency, not transactional upsert.

How do I get the latest row in a ReplacingMergeTree? Either run the query with FINAL (correct but heavier), or write argMax(<col>, <version>) GROUP BY <key> so you collapse to the latest row inline, or use the ORDER BY <key>, <version> DESC LIMIT 1 BY <key> pattern. The right choice depends on how many keys you are reading and whether the cost of FINAL is acceptable.

What is the ver column in ReplacingMergeTree? ver (the version column you pass to the engine: ReplacingMergeTree(ver)) is the tiebreaker between duplicate rows with the same ORDER BY key — the row with the largest ver wins. If you omit it, ClickHouse keeps whichever row is encountered last during the merge, which is non-deterministic. Always specify a version column in production tables.

Does ReplacingMergeTree support deletes? Yes, through the is_deleted parameter (ReplacingMergeTree(ver, is_deleted)). Rows where is_deleted = 1 are dropped during the next merge instead of being kept as the “latest” version. Combine this with FINAL or argMax to query a logically-deleted dataset correctly.

Why are my duplicate rows still visible after an INSERT? Because deduplication happens during merges, not at INSERT time. Newly inserted rows live in their own parts until ClickHouse merges them together, which can take seconds to many minutes depending on part count and merge pressure. If you cannot wait, query with FINAL or rewrite the query to do its own per-key collapse.

Is OPTIMIZE TABLE ... FINAL safe to run in production? It works, but it is expensive and not a replacement for letting background merges run on their own. Each call rewrites entire parts and can starve foreground queries on a busy cluster. Reach for it for one-off remediation (after a backfill, or before publishing to a downstream system), not as a routine job.

Wrapping up

ReplacingMergeTree is one of the most useful ClickHouse engines because it gives you a practical way to model upsert-like analytical data without pretending ClickHouse is an OLTP database. The trick is understanding what it actually does:

  • deduplication is based on ORDER BY
  • merges are eventual
  • FINAL is a correctness escape hatch, not the default

If you are designing these tables from scratch, pair this with our guide to ClickHouse PARTITION BY, ORDER BY, and PRIMARY KEY. And if you want the day-to-day tooling side, see our guide to the best ClickHouse GUI tools.