ClickHouse Materialized Views in Production
How ClickHouse materialized views, AggregatingMergeTree, and async inserts fit together — what to build, what breaks, and the schema design that survives real workloads.
ClickHouse
This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own postgres db and you can create interactive applications to share with other developers, coworkers or even your customers. If you’re interested in trying it out, get started here.
A ClickHouse materialized view is the feature people reach for when they realize that the same GROUP BY is running on every dashboard, scanning the same billion rows, and burning the same CPU every time. The fix is to compute the aggregate once at insert time and read the pre-aggregated state. It is the highest-leverage performance feature ClickHouse ships, and it is also the one that bites teams hardest in production — because a materialized view in ClickHouse is not what the SQL standard calls a materialized view, and the rules for how it behaves under load are surprising the first few times.
This post is for the moment a team has chosen ClickHouse, has a fact table large enough to feel the cost of full scans, and is trying to decide what to materialize, how to land data into it, and how to keep the rollups honest as the table changes.
In this post, we’ll cover:
- What a ClickHouse materialized view actually is — an insert trigger, not a maintained query
AggregatingMergeTreeand-Statefunctions — the right destination engine for rollupsSummingMergeTreeandReplacingMergeTree— when the simpler engines are enough- Async inserts — the ingestion pattern materialized views depend on
- Backfilling from existing data —
POPULATEand the safer alternative - The pitfalls — duplicates on retries, dropped rows on schema mismatches, and the
FINALtrap
What a ClickHouse materialized view is — and is not
In Postgres or Oracle, a materialized view is a snapshot of a query result that you REFRESH on a schedule. In ClickHouse, a materialized view is something different and easier to get wrong: it is an insert trigger that runs a transformation on every block written to a source table and pushes the result into a destination table.
CREATE MATERIALIZED VIEW events_daily_mv
TO events_daily
AS
SELECT
toDate(event_time) AS event_date,
tenant_id,
countState() AS event_count_state,
uniqState(user_id) AS unique_users_state
FROM events
GROUP BY event_date, tenant_id;
The TO events_daily clause is the part people miss the first time. The view itself stores nothing. The aggregated rows land in events_daily, which is a normal MergeTree-family table you read at query time. The view’s only job is to transform incoming blocks from events and write them into the destination.
This has three consequences that everything else in this post follows from.
First, the view runs only on new inserts. Updates and deletes against events do not flow through. If you run a DELETE FROM events WHERE ... or an ALTER TABLE events UPDATE, the rollup in events_daily does not change. The rollup is a write-time projection of inserts, not a live query.
Second, the view sees the block the writer inserts, not the full table. If you insert a million rows in one block, the view’s GROUP BY runs over that million-row block in isolation, produces partial aggregates, and writes them to events_daily. The destination table then merges those partials with other partials in the background. The rollup is “eventually correct” only if you use an engine that knows how to combine partial states — which is what AggregatingMergeTree is for.
Third, if the destination table’s schema does not match the view’s SELECT, ClickHouse will silently drop or misroute columns. This is the single most common production incident with materialized views.
A useful mental model: a ClickHouse materialized view is INSERT INTO destination SELECT ... FROM source running automatically on every block inserted into the source. If you can think about it that way, the rest follows.
AggregatingMergeTree and the -State / -Merge pattern
For rollups, the destination table is almost always AggregatingMergeTree. The reason is that a block-by-block insert produces partial aggregates, and AggregatingMergeTree knows how to merge those partials during background compaction.
The schema convention is AggregateFunction(count) for count, AggregateFunction(uniq, UInt64) for uniqExact, AggregateFunction(sum, Float64) for sum, and so on. The view writes the State version of each aggregate (countState(), uniqState(user_id), sumState(amount)), and the destination column types match.
CREATE TABLE events_daily
(
event_date Date,
tenant_id UInt64,
event_count_state AggregateFunction(count),
unique_users_state AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree
ORDER BY (event_date, tenant_id);
At read time, you wrap each state column in its -Merge form to collapse the partial states into the final scalar value:
SELECT
event_date,
tenant_id,
countMerge(event_count_state) AS events,
uniqMerge(unique_users_state) AS unique_users
FROM events_daily
WHERE event_date >= today() - 30
GROUP BY event_date, tenant_id
ORDER BY event_date;
Two things are easy to miss here. The GROUP BY at read time is doing the final aggregation across the partial states stored in the table — it is not redundant; it is essential. And uniqState/uniqMerge use adaptive sampling (linear counting for small cardinalities, a logarithmic structure beyond that — the explicit HyperLogLog implementations are the separate uniqHLL12 and uniqCombined families), which means your unique_users count is approximate. If you need exact distinct counts, you have to either use uniqExactState (much larger sketches, accurate up to memory) or denormalize differently.
The AggregatingMergeTree family is the right answer when you want pre-aggregated rollups that combine across many small inserts and still produce the right scalar value at read time. Almost every “this dashboard is too expensive” problem in a moderate-size ClickHouse deployment is solved by adding one.
SummingMergeTree and ReplacingMergeTree: simpler answers when they fit
AggregatingMergeTree is the general tool, but two specialized engines handle common cases with less ceremony.
SummingMergeTree sums numeric columns across rows with the same sort key. The destination table looks like a regular MergeTree, the view writes plain integers or floats (no -State wrapping), and the engine handles the addition during merges:
CREATE TABLE revenue_daily
(
event_date Date,
tenant_id UInt64,
amount Float64,
quantity UInt64
)
ENGINE = SummingMergeTree
ORDER BY (event_date, tenant_id);
The view simply writes SUM(amount) and SUM(quantity) grouped by (event_date, tenant_id), and the engine keeps adding the values whenever it merges parts. The catch is that until the merge happens, the table has multiple partial rows for the same key. You still need a SELECT ... SUM(amount) FROM ... GROUP BY ... at read time to get the right total — SummingMergeTree does not make the partials disappear, it just lets the engine collapse them in the background. For counts and sums where approximate uniques are not needed, SummingMergeTree is the cleaner choice.
ReplacingMergeTree is the right answer when the rollup is “the latest version of a row per key” rather than an aggregate. CDC-style projections, latest-status-per-user tables, and most “denormalized join target” tables fit this shape. We covered it in detail in our ReplacingMergeTree guide; the relevant point here is that you can target a ReplacingMergeTree from a materialized view to maintain a flattened, denormalized version of a normalized source.
Reach for AggregatingMergeTree when the rollup uses functions beyond sum and count — uniq, argMax, quantile, topK, anything stateful. Reach for SummingMergeTree when the math is plain addition. Reach for ReplacingMergeTree when there is no math, only “latest wins.”
Async inserts: the ingestion pattern the views depend on
A materialized view processes the block as it lands. If your ingest pattern is “10,000 separate INSERT statements per second, one row each,” every one of those rows creates a separate part in the source table and a separate group in the materialized view, and your part count explodes. ClickHouse’s merge machinery cannot keep up, and you get the classic Too many parts error within minutes.
The fix is to either batch writes on the client side or use async inserts, introduced for exactly this pattern. With async inserts enabled, the server buffers small inserts in memory, combines them into larger blocks, and flushes the combined block as a single part. The materialized view then sees one block per flush, not one block per client INSERT.
SET async_insert = 1;
SET wait_for_async_insert = 1; -- block until flush, so the client knows it's durable
INSERT INTO events VALUES (...);
Two settings drive the behavior. async_insert_busy_timeout_ms (default 1,000) controls the maximum time a buffer sits before being flushed. async_insert_max_data_size (default ~10 MB) controls the maximum buffered bytes. Tune them together: shorter timeout for lower visibility lag, larger size for better part economics. The HTTP Async-Insert header is the way most application code controls this per-request rather than per-session.
wait_for_async_insert = 1 is the option most people forget on the first try. With it off, the client gets a 200 OK as soon as the row hits the buffer, before the row is durable. With it on, the client waits for the buffer to flush and the part to land. For most production write paths, you want it on — durability matters more than the few milliseconds it adds.
Async inserts and materialized views are a package deal. If you are using one, you almost certainly want the other.
See what QueryPlane can build for you
Connect to your database, write SQL with AI, and build shareable apps — all from your browser.
Backfilling: POPULATE and the safer pattern
You almost never want POPULATE. It exists, the documentation mentions it, and using it on a production table is the fastest way to a 4 AM incident.
POPULATE is a flag on CREATE MATERIALIZED VIEW that tells ClickHouse to read the entire source table once at creation time and feed it through the view’s SELECT. That sounds convenient, but it has two failure modes that are hard to recover from. First, while POPULATE is running, new inserts into the source table are not routed through the view. Any row that lands during the backfill is silently missed in the rollup. Second, POPULATE produces a single multi-hour transaction-like operation that you cannot pause, resume, or partially retry.
The safer pattern is:
- Create the destination
AggregatingMergeTree(orSummingMergeTree) table empty. - Create the materialized view without
POPULATE. From this moment, all new inserts are correctly routed. - Run a controlled, chunked
INSERT INTO destination SELECT ... FROM source WHERE event_date BETWEEN ... AND ...for each historical window. Use the sameSELECTas the view’s body.
That gives you a backfill you can resume by date range, retry by date range, and verify by date range. It also keeps the live ingest correct from the moment the view exists.
The exception: if the source table is small (under a few hundred million rows) and you can stop writes for a few minutes, POPULATE is fine. Otherwise treat it as a footgun.
Schema mismatches and silent column drops
The single most common production incident with materialized views is a schema mismatch between the view’s SELECT and the destination table. ClickHouse matches columns by position, not by name, when the view targets a TO destination table. If the view’s third column is event_count_state and the destination’s third column is unique_users_state, ClickHouse will happily route the event_count_state values into the unique_users_state column, with no warning.
The defense is to always alias every column in the view’s SELECT to match the destination’s column names exactly, and to keep the column order in the SELECT identical to the destination’s CREATE TABLE. A code review checklist that explicitly compares the two helps; an integration test that inserts a known row and asserts the read-side result is even better.
A related variant: when you ALTER TABLE destination ADD COLUMN, the view’s SELECT no longer produces a value for the new column, and ClickHouse fills it with the column’s default. If the default is 0, the rollup looks correct but is silently empty in the new column. The fix is to update the view’s SELECT (or replace the view, which is also a non-trivial operation) at the same time you alter the destination.
When FINAL is the wrong instinct
If your destination is AggregatingMergeTree or ReplacingMergeTree, you will eventually be tempted to query it with FINAL to “make the merge happen at read time.”
-- Don't do this on a hot dashboard
SELECT * FROM events_daily FINAL WHERE event_date = today();
FINAL forces the engine to merge all parts for the query’s scope at read time, which can be hundreds of megabytes of compute per query. On a frequently-hit dashboard, it will dominate cost.
For AggregatingMergeTree, you do not need FINAL at all — the -Merge aggregate functions in your SELECT already collapse partial states correctly. The query SELECT event_date, countMerge(event_count_state) FROM events_daily GROUP BY event_date produces the right answer whether or not background merges have caught up.
For SummingMergeTree, the same is true: your read-side SELECT event_date, SUM(amount) FROM revenue_daily GROUP BY event_date collapses partial sums correctly without FINAL.
For ReplacingMergeTree, you do sometimes need FINAL (or a window-function dedup) to get the latest version per key. Even there, prefer applying it to a narrow query — a single tenant, a single date range — rather than the full table.
A useful rule: if you find yourself adding FINAL to a hot query, it is almost always a sign that the destination engine is the wrong choice for the read pattern, not that FINAL is the missing piece.
Common pitfalls
A few patterns that show up in nearly every ClickHouse engagement:
Treating the view as live SQL. The view does not re-run on ALTER TABLE source UPDATE or DELETE FROM source WHERE .... Mutations on the source table do not propagate. If you delete data from events, you have to delete the corresponding aggregates from events_daily separately.
Forgetting that one insert into the source produces one block into the view. A million-row insert produces one partial aggregate per GROUP BY key. Ten million one-row inserts produce ten million separate partial aggregates, which is a parts-count disaster. Always batch writes — either client-side or via async inserts.
Indexing the destination by the wrong key. The destination’s ORDER BY should match the dimensions most queries filter and group by. A common mistake is ORDER BY (event_date) when most queries are WHERE tenant_id = ? AND event_date >= ?. The right order is (tenant_id, event_date). See our PARTITION BY / ORDER BY / PRIMARY KEY guide for the full mental model.
Wrapping high-cardinality columns in LowCardinality without thinking. LowCardinality(String) is a fantastic compression and pruning tool for columns with fewer than a few hundred thousand distinct values, but on high-cardinality columns it makes things worse. Our LowCardinality guide covers the cutoff.
Chaining views and not noticing the cost. ClickHouse lets you cascade — a view on a view on a view. Every insert into the original source then triggers a tree of writes. That is sometimes the right design (a star-schema rollup feeding a final mart table), but it is also where parts explode if any one of the intermediate destinations is unwise about its ORDER BY.
Not testing the -State / -Merge math. Aggregate functions in ClickHouse have subtle parameterization. uniqState is HyperLogLog. quantileState requires a level argument. argMaxState requires both value and tie-breaker columns. Write an integration test that inserts a known dataset and asserts the read-side -Merge output. The first time the test fails is always the cheapest time to find out.
Forgetting that SELECT * does not work inside the view. The SELECT body needs to explicitly list every column, with the correct alias matching the destination. SELECT * from a wide source will quietly mis-route or drop columns. Always be explicit.
A practical workflow for adding a materialized view
When you are ready to add one:
- Identify the read pattern that is too expensive. Run
system.query_logfor the slowest queries on the source table. If they are all aggregating on the same dimensions, that is your rollup. - Choose the destination engine.
SummingMergeTreefor plainSUM/COUNT;AggregatingMergeTreeforuniq/argMax/quantile/etc.;ReplacingMergeTreefor latest-version-per-key. - Create the destination table empty. Pick the
ORDER BYto match the read predicates (low-to-high cardinality leading column, time as a secondary). SkipPARTITION BYunless you need partition-level lifecycle ops. - Create the view without
POPULATE. Confirm new rows flow through end-to-end with a small test insert. - Backfill in date-range chunks.
INSERT INTO destination SELECT ... FROM source WHERE event_date BETWEEN x AND y, one window at a time. Verify after each chunk. - Add a read-side integration test. Insert a deterministic dataset, run the dashboard query, assert the output. Run it in CI.
- Watch
system.partsfor the destination for the first 24 hours. Confirm the merge cycle is keeping the part count under a few hundred per partition. If it is climbing, your insert batches are too small or yourORDER BYis fighting the workload.
The materialized view is a long-lived production object — schema-coupled to two tables, behavior-coupled to the ingest pipeline. Treat it like a database migration, not a one-off script.
Wrapping up
ClickHouse materialized views are the right tool when the same GROUP BY runs on every dashboard and the math can be done once at write time. They are also one of the easiest features to misuse, because the mental model from other databases — “a materialized view is a refreshable snapshot of a query” — is wrong here. In ClickHouse, the view is an insert trigger, the destination is the table that actually holds your data, and the engine on that destination decides whether your rollup converges correctly.
The patterns that work in production: AggregatingMergeTree for non-trivial aggregates, SummingMergeTree for plain sums, async inserts for any high-rate ingest, controlled chunked backfills instead of POPULATE, and explicit column aliases that match the destination. The patterns that fail: POPULATE on a hot table, FINAL on a hot dashboard, untested -State/-Merge pairs, schema drift after an ALTER TABLE, and treating the view as a live SQL query rather than a write-time trigger.
If your real bottleneck is layout — the source table’s ORDER BY is wrong, or projections are a better tool for your access pattern — pair this with our guides to ClickHouse projections and PARTITION BY / ORDER BY / PRIMARY KEY. And if you are setting all this up and want a faster day-to-day workflow than the CLI, our roundup of the best ClickHouse GUI tools covers the options. For the broader ecosystem, QueryPlane’s ClickHouse integration is the fastest way to ship a dashboard or internal tool against a AggregatingMergeTree rollup.