BigQuery Partitioning and Clustering in Practice
How to pick partitioning vs clustering in BigQuery: pruning, require_partition_filter, the 4-column cluster limit, and design rules that cut scans.
BigQuery
This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own postgres db and you can create interactive applications to share with other developers, coworkers or even your customers. If you’re interested in trying it out, get started here.
BigQuery bills you for bytes scanned, not rows returned. That makes the physical layout of your tables — which partitions exist, which columns the data is sorted by inside each partition — the single biggest knob on your query cost. Partitioning and clustering are how you turn the layout into something the planner can prune at compile time, and getting them wrong (or skipping them entirely) is the most common reason a 1 GB query somehow scans 4 TB.
This post is about the decisions, not the syntax. The syntax for PARTITION BY and CLUSTER BY is one line in a CREATE TABLE statement. The hard part is choosing the partitioning column, deciding how many clustering keys to use, knowing when not to partition, and structuring WHERE clauses so partition pruning actually fires.
In this post, we’ll cover:
- What partitioning actually does - segments + partition pruning at planning time
- What clustering actually does - sorted blocks inside each partition, narrowed at scan time
- When to use which (and when to use both) - the cardinality and filter-shape rules
- Choosing the partitioning column - time-unit vs ingestion-time vs integer range
- Choosing clustering keys - the 4-column limit, ordering matters, cardinality bounds
require_partition_filter- the one-line cost control every team should turn on- Automatic reclustering - free, asynchronous, and what it does not fix
- Pitfalls - the patterns that silently disable pruning
What partitioning actually does
A partitioned BigQuery table is split into one storage segment per partition value. When you query a partitioned table with a filter on the partition column, the planner runs partition pruning — it inspects the metadata for each partition and skips any partition that cannot match the predicate. Pruning happens before any data is read, which is why BigQuery’s query-cost estimator can tell you the exact number of bytes a query will scan before you click “Run.”
The mental model is simple: partitions are folders, and WHERE order_date = '2025-01-01' tells BigQuery to look in exactly one folder. Without the filter, it has to read every folder.
There are three partitioning schemes in BigQuery, and they are not interchangeable.
Time-unit column partitioning partitions by the value of a DATE, TIMESTAMP, or DATETIME column you choose. You can pick HOUR, DAY, MONTH, or YEAR granularity. The right choice depends on how much data lands per unit and how queries filter — daily is the default for most analytics tables, hourly is appropriate when you have hundreds of millions of rows per day and queries routinely filter by hour, and monthly is right when daily would produce thousands of partitions each holding a few hundred rows.
Ingestion-time partitioning partitions by when BigQuery received each row. The table has a pseudo-column called _PARTITIONTIME (and a date-only alias _PARTITIONDATE) that you filter against; the column does not appear in the row data itself. This is the right choice when “the day the row arrived” is the natural filtering dimension — log streams, append-only event tables, raw landing tables. It is also the only scheme that does not require you to dedicate a column of the row to the partition value.
Integer range partitioning partitions by buckets of an INT64 column you choose. You specify a start, end, and interval — e.g. PARTITION BY RANGE_BUCKET(customer_id, GENERATE_ARRAY(0, 1000000, 1000)) — and BigQuery puts each row into the matching bucket. Use this when the natural filter dimension is an integer key (customer_id, account_id, tenant_id) and the cardinality of that key is high enough that grouping into ranges is meaningful.
There is also a hard limit you’ll hit if you’re not careful: 4,000 partitions per table. Daily partitioning gets you about 11 years; hourly gets you about 5 months. If you need to keep more than that, either coarsen the granularity, use partition expiration to drop old partitions automatically, or shard the table.
What clustering actually does
Clustering does not split the table. It sorts the data inside each partition by the columns you nominate, then groups the sorted data into blocks. When a query filters or aggregates on a clustered column, BigQuery scans only the blocks whose value ranges overlap the predicate — the rest of the partition is skipped at scan time.
A few properties matter in practice:
The decision to scan or skip is made per block, not per row. Block sizes vary, but they are large enough that clustering only meaningfully helps tables (or partitions) above about 64 MB. Below that, the whole table fits in a handful of blocks and clustering rarely changes the byte count.
The byte estimate you see before a query runs is the worst case. Unlike partition pruning — where the planner can prove from metadata exactly which partitions are needed — clustering pruning is decided per block during execution. You will frequently see a query priced at 800 MB and actually scan 120 MB. This is a feature, not a bug, but it means clustering does not give you the same upfront cost predictability that partitioning does.
You can specify up to four clustering columns, and the order matters. Data is sorted by the first column, then by the second within ties, and so on — the same way a composite B-tree index works in Postgres. Filters on the leading columns prune efficiently; filters that skip the leading column and hit only later columns do not.
When to use partitioning, clustering, or both
Most production tables want both, but the failure mode of “always do both” is choosing the wrong combination and getting almost no benefit. The rule of thumb that works:
Partition by your primary time or coarse-bucket filter dimension. This is almost always a date/timestamp column, occasionally an integer like customer_id for multi-tenant tables. Pick the one column that essentially every query against this table will include in its WHERE clause.
Cluster by the columns that are filtered in addition to the partition column. These are the dimensions analysts and dashboards routinely group by or filter on — country, event_type, product_id, user_id. Order them from most-frequently-filtered to least.
Use clustering instead of partitioning when the would-be partition column has very high cardinality (so partitioning would create too many small partitions), or when you need to filter on multiple high-cardinality columns and partitioning can only cover one of them. Google’s own recommendation is to prefer clustering over partitioning when the column has more than 10,000 distinct values, because the 4,000-partition cap becomes an architectural constraint at that point.
Skip partitioning entirely when the table is smaller than 1 GB, or when queries do not filter by any one dimension consistently. Partitioning a tiny table just adds metadata overhead. Clustering can still help if queries do filter, but for small tables it often does not change the bill enough to matter.
See what QueryPlane can build for you
Connect to your database, write SQL with AI, and build shareable apps — all from your browser.
Choosing the partitioning column
The wrong partitioning column is much worse than no partitioning, because it locks you into a layout that the optimizer cannot prune. A few rules.
Pick the column queries actually filter on. Sounds obvious. The check is: pull INFORMATION_SCHEMA.JOBS for the last 30 days, look at the top 50 queries against this table, and verify that the filter dimension you’re about to commit to appears in their WHERE clauses. If it does not, you’re partitioning the wrong way.
Prefer time-unit column partitioning over ingestion-time when the row has a natural event timestamp. Ingestion time is convenient because it requires no schema change, but it ties partition membership to when BigQuery received the row, not when the event happened. If you ever need to backfill, replay, or correct historical data, time-unit column partitioning lets you put rows in the right partition automatically; ingestion-time partitioning puts them all in today’s partition.
For multi-tenant tables, integer range partitioning on tenant_id is often better than clustering. Single-tenant queries pre-prune at the planner level rather than block-pruning during execution, which gives you the cost predictability you usually want for chargeback or quota enforcement. The catch is the partition cap: pick an interval that keeps the partition count below 4,000.
Do not partition on a column you compute in the query. Partition pruning requires the predicate to reference the raw partition column directly. WHERE DATE(event_ts) = '2025-01-01' prunes if event_ts is the partition column; WHERE EXTRACT(YEAR FROM event_ts) = 2025 may or may not, depending on the function and BigQuery’s partition pruning rules. The safe pattern is to filter with a constant or constant-folded predicate on the column itself.
Choosing clustering keys
The order of clustering columns is the most-overlooked part of the design. Because BigQuery sorts data by the columns in the order you declare them, the cluster keys behave like a left-to-right prefix index: a filter on (country, event_type) prunes well, a filter on just country prunes well, but a filter on just event_type does not.
A few principles.
Order from highest-selectivity-when-filtered to lowest. If country is in 90% of queries and event_type in 30%, declare CLUSTER BY country, event_type. The reverse order will not prune for queries that filter only on country.
Avoid clustering on a column with very few distinct values. Clustering on a boolean column gives you exactly two regions; the filter prunes 50% of data at best. Conversely, clustering on a column with billions of distinct values (like event_id or request_id) is also a poor fit — the per-block ranges become so narrow that the planner cannot use them effectively, and you would have been better off with an index in a different system. The sweet spot is dimensions with hundreds to millions of distinct values per partition.
Do not cluster on JSON or STRUCT fields directly. Clustering requires top-level columns of supported scalar types. If you need to cluster on a nested field, extract it into a top-level column first.
Clustering is most effective on string columns of bounded length. Clustering an unbounded STRING (like a free-text description) still works, but the block ranges get long string prefixes that prune less efficiently.
require_partition_filter — the cost control to turn on by default
There is one knob that prevents accidental full-table scans more effectively than any other practice in this post. When you create a partitioned table, you can set require_partition_filter = true, and BigQuery will refuse to run any query against the table that does not filter on the partition column.
CREATE TABLE analytics.events (
event_ts TIMESTAMP,
user_id INT64,
country STRING,
event_type STRING
)
PARTITION BY DATE(event_ts)
CLUSTER BY country, event_type
OPTIONS(
require_partition_filter = TRUE,
partition_expiration_days = 365
);
A query like SELECT count(*) FROM analytics.events against that table returns an error before any bytes are scanned, with a clear message telling the caller to add a filter on event_ts. The first time someone shoves an unfiltered query through a dashboard, you save the cost of the scan; from then on, you’ve taught the next analyst the right pattern by error message.
Turn this on for every partitioned table that more than one person queries. The cost of someone forgetting to filter and burning a few hundred dollars is much higher than the cost of the occasional error message that says “add a date filter.”
There is no equivalent flag for clustering. The closest thing is the maximum bytes billed setting at the job level — useful as a per-query ceiling, but enforced after the planner has estimated the cost, not as a hard schema-level invariant.
Automatic reclustering
When you write data into a clustered table, the new rows arrive in the order they were written, not in the cluster key order. BigQuery handles this by treating the new data as deltas on top of a baseline of fully sorted blocks. Pruning still works against the deltas, but it’s less effective there than against the baseline.
Automatic reclustering runs in the background, merging the deltas back into the baseline as soon as they grow large enough to be worth merging. The merge work runs on Google’s compute, not yours — it does not consume your slots, and it is not billed. BigQuery keeps growing the baseline up to about 500 GB per partition before splitting; beyond that, new baselines stack rather than rewriting the older data.
The practical consequence: you almost never need to think about reclustering as an operational task. The thing it does not do is fix a bad clustering choice. If you discover six months in that your clustering keys are in the wrong order, the automatic process will keep the data sorted by the wrong order forever. The fix is the same as in any sorted-storage system: create a new table with the correct keys, copy the data, swap the views.
Pitfalls
Filtering on a derived value disables partition pruning. WHERE DATE_TRUNC(event_ts, MONTH) = '2025-01-01' will scan every partition, even if event_ts is the partition column. The fix is to filter on the column directly: WHERE event_ts >= '2025-01-01' AND event_ts < '2025-02-01'.
Joining on the partition column without filtering it also disables pruning. FROM events e JOIN dim d ON DATE(e.event_ts) = d.day does not prune unless you also add a literal filter on e.event_ts. The optimizer cannot reason about which days are “in” dim at planning time. Always include an explicit time range on the fact-table side of a join.
Clustering does not help SELECT *. The benefit of clustering is skipping blocks because their column ranges don’t match a filter. If you select every column and don’t filter on a clustered column, you read the whole partition either way. This is the most common reason teams add clustering and see “no improvement.”
require_partition_filter and parameterized queries are subtly tricky. A query like WHERE event_ts >= @start does meet the requirement, but the cost estimator in the BigQuery UI will show the full table cost because it does not know the parameter value at parse time. Run with --use_legacy_sql=false and a real value to see the actual cost.
Wildcard table queries (FROM project.dataset.events_*) do not benefit from partitioning. Each wildcard table is its own table; if you partitioned each one but query across many, the optimizer prunes within each table but not across them. Consolidate into a single partitioned table where possible.
Streaming inserts arrive in the streaming buffer first. Rows in the streaming buffer (which takes minutes to flush to the actual partitions) are not partition-pruned by the same metadata, so very recent rows can show up in partitions you didn’t expect. If your dashboard queries the last 5 minutes and relies on partition filters, account for the buffer latency.
How to roll this out on an existing table
You cannot change the partitioning or clustering of an existing BigQuery table in place. The migration pattern is:
- Create a new table with the desired partitioning and clustering, plus
require_partition_filterif appropriate. - Insert into the new table from the old one, ideally batched by partition to stay under the per-job slot and byte limits.
- Validate row counts and a few representative aggregations.
- Rename the old table to a backup name, rename the new table into place. (BigQuery supports table renames via
ALTER TABLE ... RENAME TO ....) - Update any downstream
CREATE OR REPLACE VIEWdefinitions.
If the table is large enough that a one-shot rewrite is impractical, partitioned by date already, and the new design is partitioned by the same column, you can usually rewrite one partition at a time using INSERT OVERWRITE ... PARTITION — useful when you want to add clustering keys to a table that is already partitioned.
Wrapping up
Partitioning and clustering are not equivalent features and you do not have to choose between them. Partitioning gives you cost predictability and a hard cap on the bytes a query can scan; clustering gives you finer-grained pruning within each partition for a wider set of filter columns. The two are designed to compose, and the layout that pays off in practice — for most large analytics tables — is one partition column on a date or coarse bucket, plus two-to-four clustering keys ordered by filter frequency, plus require_partition_filter = TRUE.
If you only do one thing after reading this, turn on require_partition_filter for every existing partitioned table in your project. It costs nothing, it cannot regress any well-written query, and it eliminates an entire class of accidental scan bills the first time it fires.
If you’re looking for a BigQuery client that surfaces partition and clustering metadata alongside your queries, QueryPlane’s BigQuery integration lets you connect, explore datasets, write SQL, and build dashboards on top of your warehouse — with cost estimates surfaced before each query runs. The best companion to a well-designed table is a query interface that makes the costs visible the moment you write the SQL.
Related reading on warehouse design:
- BigQuery query and cost optimization in practice — once the layout is right, the next lever is the planner: slot-time,
INFORMATION_SCHEMA.JOBS, materialized views, and the rest of the cost surface. - Snowflake clustering keys in practice — the same decision framework, but for Snowflake’s micro-partition model.
- Snowflake warehouse sizing in practice — the compute side of cost optimization.
- Best BigQuery GUI tools — picking the right client for working with your partitioned and clustered tables.