Menu
Blog Documentation Community Pricing Demo Call Sign Up
Sign Up

SQL Window Functions: A Practical Guide with Examples

What SQL window functions do, how OVER, PARTITION BY, and frames actually work, and the patterns that turn slow self-joins into single-pass queries.

General

This post was written by an engineer at QueryPlane. QueryPlane is an app builder for your database: bring your own postgres db and you can create interactive applications to share with other developers, coworkers or even your customers. If you’re interested in trying it out, get started here.


A SQL window function performs a calculation across a set of rows that are related to the current row, without collapsing those rows into a single result the way GROUP BY does. They were standardized in SQL:2003 and are now supported in PostgreSQL, MySQL 8+, SQL Server, Oracle, SQLite 3.25+, Snowflake, BigQuery, Redshift, and ClickHouse. If you have ever written a self-join just to add a “previous row” column, a window function is the cleaner answer.

In this post, we’ll cover:

  • What OVER actually does - The core mechanic that makes a function “windowed”
  • PARTITION BY and ORDER BY - Slicing the result and ordering inside each slice
  • Ranking, offset, and aggregate windows - The three families you will use daily
  • Frames (ROWS vs RANGE) - The detail that silently breaks running totals
  • Patterns - Top-N per group, gaps and islands, sessionization, deduplication
  • Pitfalls - Filtering, performance, and dialect differences
  • Try it yourself - Run every example in your browser

What OVER actually does

The OVER clause is what turns a regular function into a window function. It tells the database “do not collapse rows — instead, evaluate this function over a window of rows defined relative to each row in the result.”

SELECT
  user_id,
  placed_at,
  amount,
  SUM(amount) OVER (PARTITION BY user_id ORDER BY placed_at) AS running_total
FROM orders;

You get one output row per input row, plus a new column whose value is computed by looking at other rows in the same window. No GROUP BY, no self-join. The same query without OVER would be three separate queries: the row itself, the per-user total, and a join to align them.

The window for any given row is defined by three things inside the parentheses: an optional PARTITION BY (which slices the result into independent groups), an optional ORDER BY (which orders rows inside each partition), and an optional frame clause (which restricts which of the ordered rows count). Everything else about window functions is built on those three knobs.

PARTITION BY and ORDER BY

PARTITION BY slices the result into buckets and runs the window function independently in each bucket. ORDER BY sorts rows inside the bucket so functions like LAG, RANK, and running aggregates have a meaningful “previous” and “next.”

SELECT
  region,
  rep_name,
  revenue,
  RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS rep_rank_in_region
FROM sales;

Each region gets its own ranking starting at 1. A rep ranked #1 in EMEA and a rep ranked #1 in AMER are both shown as rep_rank_in_region = 1. Drop the PARTITION BY and you get a single global ranking across all regions instead. This is the most common shape for analytical queries: partition by the entity you want to compare within, order by the metric you care about.

If you reuse the same window across multiple columns, name it once with WINDOW:

SELECT
  region,
  rep_name,
  revenue,
  RANK()        OVER w AS rep_rank,
  SUM(revenue) OVER w AS region_total,
  revenue * 1.0 / SUM(revenue) OVER w AS share_of_region
FROM sales
WINDOW w AS (PARTITION BY region ORDER BY revenue DESC);

Same window, three derived columns, one named definition. PostgreSQL, MySQL 8+, SQL Server, and Snowflake all support the WINDOW clause.

The three families of window functions

Ranking functions

ROW_NUMBER, RANK, DENSE_RANK, and NTILE assign a position to each row within a partition. They differ in how they handle ties:

SELECT
  product_id,
  sold_at,
  ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY sold_at)  AS rn,
  RANK()       OVER (PARTITION BY product_id ORDER BY sold_at)  AS rk,
  DENSE_RANK() OVER (PARTITION BY product_id ORDER BY sold_at)  AS drk
FROM sales;

ROW_NUMBER always assigns a strictly increasing integer (1, 2, 3, 4) even if two rows tie. RANK gives tied rows the same number and leaves a gap (1, 2, 2, 4). DENSE_RANK gives tied rows the same number with no gap (1, 2, 2, 3). For “give me the latest row per group,” reach for ROW_NUMBER. For “give me everyone tied for first,” reach for RANK. NTILE(n) divides the partition into n roughly equal buckets — useful for quartile and decile reporting.

Offset functions

LAG and LEAD look backward and forward from the current row inside the partition:

SELECT
  user_id,
  event,
  occurred_at,
  LAG(event)  OVER (PARTITION BY user_id ORDER BY occurred_at) AS prev_event,
  LEAD(event) OVER (PARTITION BY user_id ORDER BY occurred_at) AS next_event,
  occurred_at - LAG(occurred_at) OVER (PARTITION BY user_id ORDER BY occurred_at) AS time_since_prev
FROM events;

This is the easiest way to compute deltas, detect transitions, or build event sequences without self-joining. FIRST_VALUE and LAST_VALUE return the first or last row in the (frame of the) partition — handy for “compare each row to the partition’s anchor row” patterns. Note that LAG and LEAD return NULL at the partition edges, which is exactly the kind of thing COALESCE is designed to handle.

Aggregate windows

Any aggregate function — SUM, AVG, MIN, MAX, COUNT, STDDEV, even STRING_AGG — becomes a window function when you add OVER. This is how you get running totals, moving averages, and per-partition shares without a subquery:

SELECT
  account_id,
  txn_at,
  amount,
  SUM(amount) OVER (PARTITION BY account_id ORDER BY txn_at) AS running_balance,
  AVG(amount) OVER (PARTITION BY account_id
                    ORDER BY txn_at
                    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS avg_last_7_txns
FROM transactions;

The first column is a running balance. The second is a 7-row moving average. Both are computed in a single pass over the table, with no joins. That second example also introduces the frame clause, which is where things get interesting.

See what QueryPlane can build for you

Connect to your database, write SQL with AI, and build shareable apps — all from your browser.

Frames: ROWS vs RANGE

When a window function has an ORDER BY, it gets a frame — the subset of partition rows that the function actually sees. The default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which sounds the same as ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW but behaves differently as soon as you have ties in the ORDER BY column.

ROWS counts physical rows. RANGE counts logical values — every row whose ORDER BY value is “less than or equal to” the current row’s value is in the frame, ties included. Two rows with identical timestamps share the same frame under RANGE, but get separate frames under ROWS (assigned in some unspecified order).

The classic gotcha is a running total over a non-unique timestamp:

-- Wrong if there are duplicate sold_at values:
SELECT
  product_id,
  sold_at,
  amount,
  SUM(amount) OVER (PARTITION BY product_id ORDER BY sold_at) AS running_total
FROM sales;

With the default RANGE frame, every row tied on sold_at gets the same running total — the sum up to and including all rows with that timestamp. If you wanted a strictly-per-row running total, force ROWS:

SUM(amount) OVER (
  PARTITION BY product_id
  ORDER BY sold_at, sale_id
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total

For moving averages, ROWS BETWEEN N PRECEDING AND CURRENT ROW is the right choice almost always — you want exactly N rows back, not “every row whose value is within N.” Reach for RANGE only when the ORDER BY is on a numeric or date column and you genuinely want a value-based window (e.g. “sum of amounts in the last 7 days, regardless of how many rows that is”). PostgreSQL supports RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW for exactly that case.

Useful patterns

Top-N per group

Pulling “the latest row per user” or “the top 3 products per category” is the canonical window function trick:

WITH ranked AS (
  SELECT
    user_id,
    occurred_at,
    event_type,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY occurred_at DESC) AS rn
  FROM events
)
SELECT user_id, occurred_at, event_type
FROM ranked
WHERE rn = 1;

This replaces a correlated subquery or a self-join with a CTE that the planner can usually execute as a single sort. Change WHERE rn = 1 to WHERE rn <= 3 and you have top-3-per-group.

Deduplication

Same shape, different framing — pick one row per (logical) duplicate group:

WITH ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY email
      ORDER BY updated_at DESC, id DESC
    ) AS rn
  FROM users
)
SELECT * FROM ranked WHERE rn = 1;

The ORDER BY inside the window decides which duplicate wins. Add a tiebreaker (id DESC) to make the choice deterministic.

Running totals and moving averages

Already covered above, but worth restating because it is the single most common reason teams reach for window functions. Anything that previously required SELECT ... FROM t t1 JOIN t t2 ON t2.id <= t1.id collapses into a one-line SUM(...) OVER (ORDER BY id). Performance is dramatically better — single pass instead of an O(n²) join — and the SQL is easier to read.

Gaps and islands

Identify runs of consecutive values, like “find every period of consecutive days a user was active”:

WITH grouped AS (
  SELECT
    user_id,
    activity_date,
    activity_date - (ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY activity_date)) * INTERVAL '1 day' AS group_key
  FROM daily_activity
)
SELECT
  user_id,
  MIN(activity_date) AS streak_start,
  MAX(activity_date) AS streak_end,
  COUNT(*) AS streak_days
FROM grouped
GROUP BY user_id, group_key;

The trick is that activity_date - row_number * interval produces the same group_key for every row in a consecutive run. Group by it and you get streaks. This pattern generalizes to gaps in invoice numbers, contiguous billing periods, or any other “consecutive run” question.

Sessionization

Define a session as “events from the same user with less than 30 minutes between them”:

WITH gaps AS (
  SELECT
    user_id,
    occurred_at,
    CASE
      WHEN occurred_at - LAG(occurred_at) OVER (PARTITION BY user_id ORDER BY occurred_at)
           > INTERVAL '30 minutes' THEN 1
      ELSE 0
    END AS is_new_session
  FROM events
)
SELECT
  user_id,
  occurred_at,
  SUM(is_new_session) OVER (PARTITION BY user_id ORDER BY occurred_at) AS session_id
FROM gaps;

LAG finds the gap, a CASE flips it into a 0/1 marker, and a windowed SUM cumulatively counts new sessions. Three layered windows produce the session id without a single join. This is the same pattern most product analytics tools use under the hood.

Pitfalls to watch for

You cannot filter on a window function in WHERE

Window functions are evaluated after WHERE and GROUP BY but before ORDER BY and LIMIT. That means you cannot reference rn = 1 from a window function in the same WHERE clause:

-- This fails:
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY occurred_at DESC) AS rn
FROM events
WHERE rn = 1;     -- ERROR: column "rn" does not exist

Wrap it in a CTE or subquery and filter on the outside, as in the top-N example above. PostgreSQL 16+ added QUALIFY support behind a flag in some forks, and Snowflake / BigQuery / ClickHouse all support QUALIFY rn = 1 directly — but the CTE form works everywhere.

ORDER BY in OVER is not the same as ORDER BY in the query

ORDER BY inside OVER controls the window only. The final result row order is still controlled by the query’s outer ORDER BY. If you want the result rows sorted by the same key the window uses, you have to write it twice (or rely on the planner to notice). Forgetting this is a common source of “the column is computed correctly but the rows are in the wrong order” bugs.

Frame defaults bite

Repeating the earlier point because it causes more silent wrong answers than any other window-function issue: when you write SUM(x) OVER (ORDER BY t), the default frame is RANGE, not ROWS. If t has duplicates, your “running total” is not what you think. Always specify ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (or break ties in the ORDER BY) when you want strictly per-row accumulation.

Performance hinges on the sort, not the window

A window function with PARTITION BY a ORDER BY b is essentially “sort by (a, b), then sweep through.” If an index covers (a, b) in the right direction, the planner can skip the sort and run the window in one pass. If not, the database materializes and sorts every partition. For wide tables and large partitions, this is the difference between a sub-second query and a query that spills to disk. Run EXPLAIN ANALYZE and look for a “WindowAgg” node sitting on top of an “Index Scan” rather than a “Sort.” For Postgres, this pattern is well documented in the official tutorial on window functions.

Dialect differences

  • MySQL added window functions in 8.0 — anything older has none of this.
  • SQLite added them in 3.25 (2018). The frame syntax is otherwise standard.
  • BigQuery and Snowflake support QUALIFY for filtering on window functions inline.
  • ClickHouse added full window function support relatively recently — older versions used the arrayJoin / groupArray pattern instead, which still works but reads worse.
  • Redshift supports the standard syntax but historically had quirks around frame clauses with RANGE. Modern RA3 nodes are well-behaved.

Try these examples yourself

Paste any of the queries above into our free SQL playground to try OVER, PARTITION BY, and frame clauses against PostgreSQL, MySQL, SQLite, and ClickHouse — no account or database connection needed. For pre-flight checks on more complex window queries, the SQL syntax checker catches frame and partition mistakes before you run them in production.

For the full workflow — write window-function-heavy SQL with AI, run it against your real PostgreSQL database, and turn the results into a dashboard or internal tool — connect a database to QueryPlane and skip straight to building.

Wrapping up

Window functions are the part of SQL that most engineers underuse, even though they replace a long list of awkward self-joins and correlated subqueries with single-pass queries that the planner can actually optimize. The mental model is simple: OVER says “look at a window of related rows,” PARTITION BY slices the result, ORDER BY orders inside each slice, and the frame clause picks which of the ordered rows the function actually sees.

Reach for window functions when you find yourself writing self-joins to add “previous row” columns, when you need top-N per group, or when you want a running total or moving average. Watch the frame default — RANGE and ROWS are not the same once duplicates show up. And remember that the WHERE clause cannot see the window function’s output — wrap in a CTE and filter outside.

Pair LAG and LEAD with COALESCE at partition edges, and you can express most “compare each row to the previous one” patterns in one clean query.