DuckDB Full-Text Search: Find Anything in Your Data

DuckDB full-text search lets you find text across millions of rows instantly. Here's how to set it up and use it in your local analytics stack.

Mark Rachapoom

March 26, 2026·7 min read

DuckDB full-text search lets you run fast text queries across large datasets without spinning up Elasticsearch or a separate search service. If you've got a DuckDB database and you need to find records by text content, this guide shows you exactly how to do it.

This is one of those capabilities that sounds simple but opens up a surprising number of use cases — from searching CRM notes to querying log files to finding relevant entries across thousands of documents.

How DuckDB Full-Text Search Works#

DuckDB's FTS is implemented as an extension (fts). Once loaded, you can create a full-text search index on any table, then query it using the match_bm25 function, which ranks results by BM25 relevance — the same algorithm behind most modern search engines.

The extension creates inverted index structures internally. When you run a search, it scans the index rather than doing a sequential table scan, which is what makes it fast even on large datasets.

Setting Up Full-Text Search#

Step 1: Load the Extension#

INSTALL fts;
LOAD fts;

That's it. No server to configure, no connection string to set up.

Step 2: Create a Table with Text Data#

CREATE TABLE notes (
    id INTEGER PRIMARY KEY,
    title VARCHAR,
    body TEXT,
    created_at TIMESTAMP
);
 
INSERT INTO notes VALUES
    (1, 'Q1 Review', 'Revenue was up 40% in Q1. Pipeline looks strong going into Q2.', NOW()),
    (2, 'Customer Call', 'Sarah from Acme mentioned they might expand their contract.', NOW()),
    (3, 'Churn Analysis', 'Three customers churned this month. Main reason: pricing.', NOW()),
    (4, 'Product Roadmap', 'Next quarter we ship the mobile app and API v2.', NOW());

Step 3: Build the FTS Index#

PRAGMA create_fts_index('notes', 'id', 'title', 'body');

The first argument is the table name. The second is the primary key column. The remaining arguments are the columns to index. You can index as many text columns as you need.

Step 4: Search#

SELECT id, title, fts_main_notes.match_bm25(id, 'pipeline revenue') AS score
FROM notes
WHERE score IS NOT NULL
ORDER BY score DESC;

This returns results ranked by relevance. Records mentioning both "pipeline" and "revenue" rank higher than records that only match one term.

Advanced Search Patterns#

Boolean Queries#

DuckDB FTS supports AND, OR, and NOT operators:

-- Must contain both terms
SELECT id, title, fts_main_notes.match_bm25(id, 'pipeline AND revenue') AS score
FROM notes WHERE score IS NOT NULL ORDER BY score DESC;
 
-- Either term
SELECT id, title, fts_main_notes.match_bm25(id, 'churn OR expansion') AS score
FROM notes WHERE score IS NOT NULL ORDER BY score DESC;
 
-- Exclude a term
SELECT id, title, fts_main_notes.match_bm25(id, 'customer NOT churn') AS score
FROM notes WHERE score IS NOT NULL ORDER BY score DESC;

Combining FTS with Regular Filters#

This is where DuckDB shines — you can mix full-text search with normal SQL predicates:

SELECT id, title, created_at,
    fts_main_notes.match_bm25(id, 'expansion contract') AS score
FROM notes
WHERE score IS NOT NULL
    AND created_at > '2026-01-01'
ORDER BY score DESC
LIMIT 10;

Full-text search with date filtering and result limiting — no separate search service needed.

Field-Specific Search#

If you only want to search specific fields, create a separate index or use field prefixes in the query:

-- Index only the body field
PRAGMA create_fts_index('notes', 'id', 'body', stemmer='porter');

The stemmer option enables stemming — so "pricing" also matches "price" and "priced". Supported stemmers include porter (English), german, french, spanish, and others.

Real-World Use Case: Searching CRM Notes#

DenchClaw uses DuckDB as its core database. Every CRM entry can have an attached document — meeting notes, call summaries, proposals. Here's how you'd implement search across all of them:

-- Assume you have an entry_documents table
CREATE TABLE entry_documents (
    entry_id VARCHAR,
    object_name VARCHAR,
    title VARCHAR,
    content TEXT,
    updated_at TIMESTAMP
);
 
-- Build the index
PRAGMA create_fts_index('entry_documents', 'entry_id', 'title', 'content', stemmer='porter');
 
-- Search across all documents
SELECT 
    entry_id,
    object_name,
    title,
    fts_main_entry_documents.match_bm25(entry_id, 'contract renewal Q2') AS relevance
FROM entry_documents
WHERE relevance IS NOT NULL
ORDER BY relevance DESC
LIMIT 20;

This lets you type a query in natural language and surface the most relevant CRM records — all running locally, all in SQL.

Performance Considerations#

Index Size#

FTS indexes add overhead. A rough rule: expect the index to be 30-50% the size of the raw text being indexed. For a 1GB text dataset, budget 300-500MB for the index.

Rebuild vs. Incremental Updates#

DuckDB FTS indexes are not automatically updated when you insert new rows. You need to rebuild:

PRAGMA drop_fts_index('notes');
PRAGMA create_fts_index('notes', 'id', 'title', 'body');

For append-only workloads (like logs), rebuild on a schedule. For interactive CRM use, rebuild after bulk imports.

Query Performance#

For a table with 1 million rows, a BM25 search typically returns in under 100ms on modern hardware. For 10 million rows, expect 200-500ms. If you need faster, reduce the indexed columns or pre-filter with a WHERE clause before the FTS scan.

Comparing DuckDB FTS to Elasticsearch#

Feature	DuckDB FTS	Elasticsearch
Setup	2 SQL commands	Docker + config + cluster
Relevance algorithm	BM25	BM25 (customizable)
Boolean queries	Yes	Yes
Faceted search	Manual with GROUP BY	Native
Fuzzy matching	Limited	Advanced
Scale	Single machine	Distributed
Cost	Free	$16/GB/month (managed)
SQL integration	Native	Separate API

For most use cases — internal tools, business analytics, CRM search — DuckDB FTS is more than sufficient. Elasticsearch is worth adding when you need fuzzy matching, sophisticated relevance tuning, or distributed scale.

Practical Recipes#

Autocomplete Suggestions#

-- Get unique words from the title column for autocomplete
SELECT DISTINCT unnest(string_split(lower(title), ' ')) AS word
FROM notes
WHERE word LIKE 'rev%'
ORDER BY word
LIMIT 10;

Search with Snippets#

DuckDB doesn't have a built-in snippet function, but you can fake it:

SELECT 
    id,
    title,
    substr(body, 
        max(1, position('revenue' IN lower(body)) - 50), 
        200) AS snippet,
    fts_main_notes.match_bm25(id, 'revenue') AS score
FROM notes
WHERE score IS NOT NULL
ORDER BY score DESC;

Multi-Table Search#

-- Search across both notes and tasks
WITH combined AS (
    SELECT 'note' AS type, id, title AS text, body AS content FROM notes
    UNION ALL
    SELECT 'task' AS type, id, title AS text, description AS content FROM tasks
)
SELECT type, id, text
FROM combined
WHERE lower(content) LIKE '%pipeline%'
   OR lower(text) LIKE '%pipeline%'
ORDER BY type, id;

For proper multi-table BM25, you'd need separate indexes per table and combine results in application code.

When to Use DuckDB FTS vs. Other Approaches#

Use DuckDB FTS when:

Your dataset fits on one machine (under a few hundred GB)
You want SQL-native search without extra infrastructure
You need to combine search with aggregations and filters
You're building internal tools or analytics dashboards

Consider alternatives when:

You need fuzzy matching (typo tolerance) — look at pg_trgm or Meilisearch
You need real-time index updates without rebuilds — SQLite FTS5 has better incremental support
You need distributed search at petabyte scale — Elasticsearch or OpenSearch

For local analytics use cases — which is exactly what DuckDB for business is designed for — the built-in FTS extension handles the vast majority of search requirements.

DuckDB Full-Text Search: Find Anything in Your Data

How DuckDB Full-Text Search Works#

Setting Up Full-Text Search#

Step 1: Load the Extension#

Step 2: Create a Table with Text Data#

Step 3: Build the FTS Index#

Step 4: Search#

Advanced Search Patterns#

Boolean Queries#

Combining FTS with Regular Filters#

Field-Specific Search#

Real-World Use Case: Searching CRM Notes#

Performance Considerations#

Index Size#

Rebuild vs. Incremental Updates#

Query Performance#

Comparing DuckDB FTS to Elasticsearch#

Practical Recipes#

Autocomplete Suggestions#

Search with Snippets#

Multi-Table Search#

When to Use DuckDB FTS vs. Other Approaches#

Frequently Asked Questions#

Does DuckDB FTS support phrase search?#

Can I use DuckDB FTS with multiple languages?#

How do I update the FTS index when data changes?#

Is DuckDB FTS case-sensitive?#

What's the maximum text size DuckDB FTS can handle?#

Keep reading

DuckDB Extensions: What They Are and How to Use Them

DuckDB Backup Strategies for Business Data

Skills Are Files

DuckDB for Financial Modeling

CRM + LLM: What Happens When Your Database Can Think

DuckDB vs BigQuery: When Local Wins