DuckDB for Business Intelligence Without a Data Warehouse
DuckDB replaces a full data warehouse for most BI workloads. Here's how to build dashboards, reports, and analytics without Snowflake or BigQuery.
DuckDB for Business Intelligence Without a Data Warehouse
Most companies don't need Snowflake. They need fast answers to business questions, and DuckDB can provide them from a file on your laptop — no data warehouse, no monthly bill, no infrastructure team.
This guide shows you how to use DuckDB as your BI layer: connecting it to your data sources, building the queries that power dashboards, and serving results to visualization tools.
The Problem with Traditional BI Stacks#
The classic BI stack looks like this:
Source Systems → ETL Pipeline → Data Warehouse → BI Tool → Dashboards
Every component has a cost:
- Data warehouse (Snowflake/BigQuery): $300-3,000+/month
- ETL tool (Fivetran/dbt Cloud): $500-2,000+/month
- BI tool (Looker/Tableau): $1,500-6,000+/seat/year
- Total: $5,000-20,000+/month for a mid-sized team
For startups and small teams, this is prohibitive. And for most analytical workloads, it's also unnecessary.
The DuckDB BI Stack#
Source Systems → DuckDB → Dashboards
- DuckDB: Free (open source)
- Visualization: Grafana (free), Evidence (free), Metabase (free self-hosted), or custom HTML/JS
- ETL: SQL queries in DuckDB, scheduled with cron
- Total cost: $0 (beyond compute)
Connecting DuckDB to Your Data Sources#
CSV and Parquet Files#
-- Query CSV directly
SELECT * FROM read_csv_auto('sales_data.csv');
-- Query multiple Parquet files
SELECT * FROM read_parquet('s3://my-bucket/events/2026/*.parquet');
-- Create a view over a file
CREATE VIEW monthly_sales AS
SELECT
DATE_TRUNC('month', sale_date) AS month,
region,
SUM(amount) AS revenue
FROM read_csv_auto('sales_data.csv')
GROUP BY month, region;PostgreSQL (Production Database)#
INSTALL postgres;
LOAD postgres;
ATTACH 'host=prod-db.internal dbname=app user=analytics' AS prod (TYPE POSTGRES);
-- Create a local cache table
CREATE TABLE customers_cache AS SELECT * FROM prod.customers;
-- Or query live (slower but always fresh)
SELECT * FROM prod.customers WHERE created_at > '2026-01-01';REST APIs via httpfs#
INSTALL httpfs;
LOAD httpfs;
-- Query a JSON API
SELECT * FROM read_json_auto('https://api.example.com/v1/metrics?period=monthly');Google Sheets (via CSV export URL)#
-- Google Sheets published as CSV
SELECT * FROM read_csv_auto(
'https://docs.google.com/spreadsheets/d/YOUR_ID/export?format=csv'
);Building a BI Schema#
Structure your DuckDB BI database with a simple star schema:
-- Fact table
CREATE TABLE fact_revenue (
date DATE,
product_id VARCHAR,
customer_id VARCHAR,
region VARCHAR,
amount DECIMAL(10,2),
units INTEGER
);
-- Dimension tables
CREATE TABLE dim_product (
product_id VARCHAR PRIMARY KEY,
name VARCHAR,
category VARCHAR,
unit_price DECIMAL(10,2)
);
CREATE TABLE dim_customer (
customer_id VARCHAR PRIMARY KEY,
name VARCHAR,
segment VARCHAR,
country VARCHAR
);
CREATE TABLE dim_date (
date DATE PRIMARY KEY,
year INTEGER,
quarter INTEGER,
month INTEGER,
week INTEGER,
day_of_week INTEGER,
is_weekend BOOLEAN
);Populate dim_date once:
INSERT INTO dim_date
SELECT
d AS date,
YEAR(d) AS year,
QUARTER(d) AS quarter,
MONTH(d) AS month,
WEEK(d) AS week,
DAYOFWEEK(d) AS day_of_week,
DAYOFWEEK(d) IN (1, 7) AS is_weekend
FROM generate_series(DATE '2020-01-01', DATE '2030-12-31', INTERVAL '1 day') AS t(d);Core BI Queries#
Revenue Dashboard#
-- Current month KPIs
SELECT
SUM(amount) AS total_revenue,
COUNT(*) AS transactions,
COUNT(DISTINCT customer_id) AS unique_customers,
AVG(amount) AS avg_transaction_value
FROM fact_revenue
WHERE DATE_TRUNC('month', date) = DATE_TRUNC('month', CURRENT_DATE);
-- Month-over-month comparison
WITH current_month AS (
SELECT SUM(amount) AS revenue
FROM fact_revenue
WHERE DATE_TRUNC('month', date) = DATE_TRUNC('month', CURRENT_DATE)
),
prior_month AS (
SELECT SUM(amount) AS revenue
FROM fact_revenue
WHERE DATE_TRUNC('month', date) = DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
)
SELECT
current_month.revenue AS current_revenue,
prior_month.revenue AS prior_revenue,
ROUND((current_month.revenue - prior_month.revenue) / prior_month.revenue * 100, 1) AS mom_growth_pct
FROM current_month, prior_month;Product Performance Report#
SELECT
p.category,
p.name AS product,
SUM(f.amount) AS revenue,
SUM(f.units) AS units_sold,
COUNT(DISTINCT f.customer_id) AS unique_buyers,
ROUND(SUM(f.amount) / SUM(SUM(f.amount)) OVER () * 100, 1) AS revenue_share_pct,
RANK() OVER (PARTITION BY p.category ORDER BY SUM(f.amount) DESC) AS rank_in_category
FROM fact_revenue f
JOIN dim_product p ON f.product_id = p.product_id
WHERE YEAR(f.date) = 2026
GROUP BY p.category, p.name
ORDER BY revenue DESC;Customer Cohort Retention#
WITH first_purchase AS (
SELECT customer_id, MIN(date) AS cohort_date
FROM fact_revenue
GROUP BY customer_id
),
monthly_activity AS (
SELECT
fp.customer_id,
DATE_TRUNC('month', fp.cohort_date) AS cohort_month,
DATE_TRUNC('month', f.date) AS activity_month,
DATEDIFF('month', fp.cohort_date, f.date) AS months_after_signup
FROM fact_revenue f
JOIN first_purchase fp ON f.customer_id = fp.customer_id
)
SELECT
cohort_month,
months_after_signup,
COUNT(DISTINCT customer_id) AS retained_customers
FROM monthly_activity
GROUP BY cohort_month, months_after_signup
ORDER BY cohort_month, months_after_signup;Connecting to Visualization Tools#
Metabase (Free, Self-Hosted)#
Metabase supports DuckDB via JDBC. Add DuckDB as a data source:
- Install Metabase:
docker run -p 3000:3000 metabase/metabase - Add DuckDB connection: Settings → Databases → Add database → DuckDB
- Point to your
.duckdbfile path
Evidence (Free, Code-Based)#
Evidence is a Markdown-based BI tool that queries DuckDB natively:
---
title: Revenue Dashboard
---
<script>
const revenue = await query(`
SELECT DATE_TRUNC('month', date) AS month, SUM(amount) AS revenue
FROM fact_revenue
GROUP BY month
ORDER BY month
`);
</script>
<LineChart data={revenue} x="month" y="revenue" title="Monthly Revenue"/>Grafana#
Install the DuckDB plugin, then query with SQL directly in Grafana dashboards.
Custom HTML Dashboard with DuckDB-WASM#
Run DuckDB entirely in the browser:
<script type="module">
import * as duckdb from 'https://cdn.jsdelivr.net/npm/@duckdb/duckdb-wasm/+esm';
const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();
const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES);
const worker = new Worker(bundle.mainWorker);
const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
const conn = await db.connect();
await conn.query(`SELECT 1 AS test`);
</script>DenchClaw as a DuckDB BI Platform#
DenchClaw is built on top of DuckDB and includes an App Builder that lets you create dashboards directly against your CRM data. No external BI tool needed.
// Inside a DenchClaw app
const revenue = await dench.db.query(`
SELECT
DATE_TRUNC('month', "Close Date") AS month,
SUM("Deal Value") AS revenue,
COUNT(*) AS deals_closed
FROM v_deals
WHERE "Stage" = 'Closed Won'
GROUP BY month
ORDER BY month
`);
// Render with Chart.js
new Chart(ctx, {
type: 'bar',
data: {
labels: revenue.map(r => r.month),
datasets: [{ data: revenue.map(r => r.revenue) }]
}
});The entire BI stack runs locally: DuckDB handles the queries, the DenchClaw app renders the charts, and you own all the data.
Frequently Asked Questions#
Can DuckDB replace a data warehouse for my startup?#
For most startups under $10M ARR, yes. DuckDB handles hundreds of millions of rows analytically on a modern laptop. The use cases where you need a proper data warehouse: multi-terabyte datasets, real-time streaming ingestion, or dozens of concurrent analysts.
How do I schedule DuckDB ETL jobs?#
Use cron (Linux/Mac) or Task Scheduler (Windows) to run SQL scripts on a schedule. Or use dbt with the DuckDB adapter.
Can multiple analysts query the same DuckDB database?#
In read-only mode, yes — multiple processes can open the same file simultaneously. For writes (ETL), only one process can write at a time.
What BI tools have native DuckDB support?#
Evidence, Rill, Metabase (via JDBC), and several others have native DuckDB support. Grafana requires a plugin.
How do I handle DuckDB schema migrations?#
DuckDB supports ALTER TABLE ADD COLUMN, RENAME, and most standard DDL. For complex migrations, create a new table and migrate data, then drop the old table.
Ready to try DenchClaw? Install in one command: npx denchclaw. Full setup guide →
