Planning - DataFrey

Planning is what lets you ask questions like “top 10 customers by revenue last quarter” and get correct SQL without spelling out table names and joins. To make that work, DataFrey needs a rough map of your database. You build that map by running datafrey index, and from then on complex questions use your real schema instead of a guess. Re-run it whenever your schema changes: indexing is manual, never automatic.

Planning and indexing send schema, aggregations, and sample values to an LLM and store them server-side. If that’s a problem for your data, see Security for what’s stored and how to opt out.

Prerequisite: you’ve completed initial setup and have a connected database.

How planning works

Two pieces make planning work: the plan (produced per question) and the index (built ahead of time with datafrey index, consulted for every plan).

The plan

A plan is an LLM-generated description of how DataFrey intends to answer your question — which tables and columns to touch, how to join them, what filters and aggregations to apply. You see the plan and the SQL before the result.

The index

The index is a server-side summary of your database that the planner consults before writing SQL. Without it, the planner has no schema to work from, and planning is disabled. It may contain:

Schema — tables, columns, and types from information_schema.
Aggregations and statistics computed from your data.
Sample values from your columns. We don’t redact the values.

It’s a summary, not a copy of your data. For the exact contents and retention, see Security — What we store.

Building the index

datafrey index

Indexing runs server-side. Hard limits are listed in Limits & Prerequisites. To check whether indexing is in progress, when the index was last built, and how many tables and columns it covers:

datafrey status

Keeping the index fresh

Indexing is manual — DataFrey never refreshes it in the background. Re-run datafrey index yourself whenever:

You add a new table or view
You add or remove columns
A large backfill or similar change shifts the data enough that the stored aggregations and sample values go stale

You don’t need to re-index after regular inserts, updates, or deletes.

Add a new table and ask about it? The planner won’t see it until you re-run datafrey index.
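Because indexing is manual, one way to avoid forgetting it is to tie it to schema changes in your own tooling. The following is a minimal sketch, not a DataFrey feature: it assumes you already keep a schema dump in a file, and it re-runs datafrey index only when that file's checksum changes. The function name, the schema and state file paths, and the INDEX_CMD variable are all hypothetical names introduced for this example; the only DataFrey command used is datafrey index.

```shell
#!/bin/sh
# Sketch: re-run `datafrey index` only when a schema snapshot file changes.
# Assumptions (none of these come from DataFrey itself):
#   - schema_file is a schema dump you already maintain (e.g. in your repo)
#   - state_file is where this script remembers the last checksum it indexed
#   - INDEX_CMD exists only so the function can be dry-run with another command

INDEX_CMD="${INDEX_CMD:-datafrey index}"

reindex_if_schema_changed() {
    schema_file="$1"
    state_file="$2"

    # cksum is POSIX; reading from stdin keeps the file name out of the output.
    current="$(cksum < "$schema_file")"
    previous="$(cat "$state_file" 2>/dev/null || true)"

    if [ "$current" = "$previous" ]; then
        echo "schema unchanged, skipping index"
        return 0
    fi

    # Record the new checksum only after the index command succeeds,
    # so a failed run is retried next time.
    if $INDEX_CMD; then
        printf '%s\n' "$current" > "$state_file"
    else
        echo "index command failed" >&2
        return 1
    fi
}

# Example call (paths are placeholders):
#   reindex_if_schema_changed db/schema.sql .datafrey-schema.cksum
```

A check like this only catches changes that show up in the schema dump, such as new tables, views, or columns. A large backfill leaves the schema untouched, so that case still needs a manual datafrey index.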

When planning runs

Your AI client (Claude Code, Cursor, etc.) decides when to call the plan tool. Trivial questions go straight to run; ambiguous ones go through plan first.

Your question	What the client does	Why
`list all tables`	Calls `run` directly	Trivial, no schema reasoning needed
`count rows in orders`	Calls `run` directly	Table name is explicit
`top 10 customers by revenue last quarter`	Calls `plan` first	Needs to find the right tables, joins, and date column
`which products have declining sales MoM`	Calls `plan` first	Ambiguous — planner picks the right metric and grain

If the client skips planning on a question that needs it, force it explicitly:

/db use the plan tool first, then answer: top 10 customers by revenue last quarter

Opting out

See Security — Controls for sensitive data for full opt-out and partial-restriction options.
