DuckDB Window Functions: A Practical Guide
DuckDB window functions let you compute running totals, rankings, and moving averages without subqueries. Here's a practical guide with real SQL examples.
Window functions are one of the most powerful features in SQL, and DuckDB implements them fully and fast. They let you perform calculations across a set of rows related to the current row — without collapsing rows the way GROUP BY does.
If you've been writing complex subqueries to compute running totals, rankings, or moving averages, window functions will cut that code in half and run faster.
What Is a Window Function?
A window function computes a value for each row using a "window" of related rows. The window is defined by the OVER (...) clause, which specifies:
- PARTITION BY — divide rows into groups (like GROUP BY, but rows are kept separate)
- ORDER BY — define the order within each partition
- ROWS/RANGE BETWEEN — the frame, i.e. which rows in the partition to include
SELECT
user_id,
revenue,
SUM(revenue) OVER (PARTITION BY user_id ORDER BY occurred_at) AS running_total
FROM events;
This computes each user's running total revenue — one row per event, not one row per user.
The Core Window Functions
Ranking Functions
SELECT
company_name,
deal_value,
ROW_NUMBER() OVER (ORDER BY deal_value DESC) AS row_num,
RANK() OVER (ORDER BY deal_value DESC) AS rank,
DENSE_RANK() OVER (ORDER BY deal_value DESC) AS dense_rank,
PERCENT_RANK() OVER (ORDER BY deal_value DESC) AS pct_rank
FROM deals;
- ROW_NUMBER() — always unique, sequential
- RANK() — ties get the same rank, next rank skips (1, 2, 2, 4)
- DENSE_RANK() — ties get the same rank, no skipping (1, 2, 2, 3)
- PERCENT_RANK() — 0.0 to 1.0 position in the set
Lead and Lag
Access the value from a row ahead or behind the current row:
SELECT
DATE_TRUNC('month', closed_at) AS month,
SUM(value) AS revenue,
LAG(SUM(value)) OVER (ORDER BY DATE_TRUNC('month', closed_at)) AS prev_month,
LEAD(SUM(value)) OVER (ORDER BY DATE_TRUNC('month', closed_at)) AS next_month
FROM deals
WHERE closed_at IS NOT NULL
GROUP BY month
ORDER BY month;
Use LAG(col, 2) to look back 2 rows, or LEAD(col, 1, 0) to look forward 1 row with a default of 0 if no row exists.
First and Last Value
SELECT
user_id,
occurred_at,
event_type,
FIRST_VALUE(event_type) OVER (PARTITION BY user_id ORDER BY occurred_at) AS first_event,
LAST_VALUE(event_type) OVER (
PARTITION BY user_id
ORDER BY occurred_at
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last_event
FROM events;
Note: LAST_VALUE requires the full frame to be explicit — otherwise the frame defaults to ending at the current row, so LAST_VALUE just returns the current row's own value.
NTH_VALUE
SELECT
user_id,
occurred_at,
NTH_VALUE(event_type, 2) OVER (
PARTITION BY user_id
ORDER BY occurred_at
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS second_event
FROM events;
Aggregate Functions as Window Functions
Any aggregate function can be used as a window function by adding OVER:
SELECT
deal_id,
stage,
value,
SUM(value) OVER () AS total_pipeline,
SUM(value) OVER (PARTITION BY stage) AS stage_total,
value / SUM(value) OVER (PARTITION BY stage) AS pct_of_stage,
AVG(value) OVER (PARTITION BY stage) AS stage_avg,
COUNT(*) OVER (PARTITION BY stage) AS stage_count
FROM deals
WHERE closed_at IS NULL;
This is one query — no joins, no subqueries — giving you deal value plus its context within each pipeline stage.
Window Frames
The frame controls which rows are included in the calculation:
-- Rolling 7-day sum
SUM(revenue) OVER (
ORDER BY day
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
)
-- All preceding rows (running total)
SUM(revenue) OVER (
ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
-- Entire partition
SUM(revenue) OVER (
PARTITION BY user_id
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
-- Range-based: rows within 7 days
SUM(revenue) OVER (
ORDER BY day
RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW
)
ROWS counts physical rows. RANGE uses the ORDER BY value — useful for date ranges where you want "all rows within 7 days of the current row" regardless of how many rows that is.
Practical Recipes
Sales Pipeline Funnel
SELECT
stage,
COUNT(*) AS deals,
SUM(value) AS pipeline_value,
SUM(COUNT(*)) OVER (ORDER BY stage_order DESC ROWS UNBOUNDED PRECEDING) AS cumulative_deals,
ROUND(COUNT(*) * 100.0 / FIRST_VALUE(COUNT(*)) OVER (ORDER BY stage_order), 1) AS conversion_pct
FROM deals
JOIN stage_order_ref USING (stage)
GROUP BY stage, stage_order
ORDER BY stage_order;
Top N Per Group
-- Top 3 deals per sales rep
WITH ranked AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY owner_id ORDER BY value DESC) AS rank
FROM deals
WHERE closed_at IS NOT NULL
)
SELECT * FROM ranked WHERE rank <= 3;
This is a very common pattern — the window function runs first, then you filter in the outer query.
Customer Lifetime Value Progression#
SELECT
user_id,
occurred_at,
revenue,
SUM(revenue) OVER (
PARTITION BY user_id
ORDER BY occurred_at
ROWS UNBOUNDED PRECEDING
) AS lifetime_value,
COUNT(*) OVER (
PARTITION BY user_id
ORDER BY occurred_at
ROWS UNBOUNDED PRECEDING
) AS purchase_count
FROM events
WHERE event_type = 'purchase'
ORDER BY user_id, occurred_at;
Detecting Consecutive Events
-- Find users who made 3 purchases in a row within 30 days
WITH ordered AS (
SELECT
user_id,
occurred_at,
LEAD(occurred_at, 2) OVER (PARTITION BY user_id ORDER BY occurred_at) AS third_purchase_at
FROM events
WHERE event_type = 'purchase'
)
SELECT DISTINCT user_id
FROM ordered
WHERE third_purchase_at <= occurred_at + INTERVAL '30 days';
Window Functions in DenchClaw Analytics
DenchClaw uses DuckDB as its database engine, which means you can run window functions directly against your CRM data. Want to see pipeline velocity trends, rank your deals by value within each stage, or compute rolling close rates? It's all standard SQL against your local DuckDB database.
-- Pipeline rank within stage
SELECT
"Full Name" AS contact,
"Company" AS company,
"Deal Value" AS value,
"Stage" AS stage,
RANK() OVER (PARTITION BY "Stage" ORDER BY "Deal Value" DESC) AS rank_in_stage
FROM v_deals
WHERE "Status" = 'Active';
Performance Tips
- Index ORDER BY columns — DuckDB auto-optimizes window functions with sorted data. If you repeatedly window by the same column, keep your data sorted on that column.
- Avoid multiple scans — if you need several window functions with the same OVER clause, combine them in one SELECT rather than joining subqueries.
- Filter before windowing — put WHERE clauses before window functions. DuckDB pushes filters down, but explicit filtering reduces the dataset.
- Use ROWS not RANGE for large frames — RANGE requires a sort-and-merge; ROWS is faster for counting preceding rows.
Frequently Asked Questions
Can I use WHERE to filter window function results?
No — you can't reference a window function alias in WHERE. Use a CTE or subquery:
WITH ranked AS (SELECT *, ROW_NUMBER() OVER (...) AS rn FROM t)
SELECT * FROM ranked WHERE rn = 1;
What's the difference between PARTITION BY and GROUP BY?
GROUP BY collapses rows into one row per group. PARTITION BY keeps all rows but restricts the window calculation to each partition. Window functions never reduce row count.
Can I nest window functions?
No. You can't use a window function inside another window function. Use CTEs to chain them.
Are DuckDB window functions ANSI SQL standard?
Yes. DuckDB implements the full SQL:2003 window function standard, including all frame types and most aggregate window functions.
How do I compute a 30-day rolling average including only business days?
Use a calendar table and join to filter weekends before applying the window function. There's no built-in business-day frame in DuckDB.
Ready to try DenchClaw? Install in one command: npx denchclaw. Full setup guide →
