Is DataFrey right for you?
When DataFrey won’t fit
- Database access over MCP without MFA. We use OAuth for MCP. If your policy requires MFA on anything that touches databases, wait until MFA ships.
- You can’t whitelist our static egress IP. DataFrey connects from a fixed IP address, and you need to whitelist it so we can run queries against your database.
When to restrict DataFrey
DataFrey’s plan tool runs queries to understand your data. To support planning, we index some information about your database internally. When should you limit DataFrey’s privileges?
- Every query must be approved. Planning runs LLM-generated read queries with guardrails and without human approval.
- No data can be stored by DataFrey. Planning and indexing store some data about your database.
How we protect you
Your credentials
We store and use your database credentials server-side to run SQL queries. Credentials are encrypted at rest and in transit.
Encrypted client-side
You provide your database credentials during CLI onboarding. They’re encrypted locally with RSA-OAEP + AES-256-GCM and sent to the server. Only ciphertext crosses the network.
Stored in AWS securely
Credentials are written to AWS Secrets Manager under a dedicated KMS key and removed from memory.
No exposure
Credentials are never exposed in plaintext. For every query, we open a fresh connection — no cache, no connection pooling.
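The no-pooling behavior can be sketched as a context manager that opens a brand-new connection for each query and discards it immediately. This is an illustrative sketch, not DataFrey’s actual code; `sqlite3` stands in for your database driver, and `run_query` is a hypothetical helper.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def fresh_connection(dsn: str):
    """Open a new connection for one query; never reuse or pool it."""
    conn = sqlite3.connect(dsn)
    try:
        yield conn
    finally:
        conn.close()  # connection and any driver-side state are discarded

def run_query(dsn: str, sql: str) -> list:
    """Each call gets its own connection, so nothing lingers between queries."""
    with fresh_connection(dsn) as conn:
        return conn.execute(sql).fetchall()

print(run_query(":memory:", "SELECT 1 + 1"))  # [(2,)]
```

Because every call tears the connection down, a compromised or crashed query leaves no cached session behind.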
No retention
Running datafrey db drop removes the secret with no recovery window. The associated index is dropped as well.
Your database
During database connection, we guide you through configuring a separate database user with minimal privileges.
- Separate user. We ask you to create a dedicated role and user. This lets you fine-tune permissions and rotate or revoke credentials independently.
- Read-only. Only SELECT is allowed, preventing destructive queries.
- Limited access. The user only has access to the tables and schemas you specify.
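In Postgres-style databases, such a setup might look like the statements below. The role name, user name, and schema are illustrative, the exact syntax varies by database, and the helper only assembles the DDL as text:

```python
def readonly_user_sql(user: str, password: str, schema: str, tables: list[str]) -> list[str]:
    """Assemble Postgres-style DDL for a dedicated, read-only, scoped user.
    Names here are placeholders, not DataFrey requirements."""
    stmts = [
        "CREATE ROLE datafrey_role;",                        # dedicated role
        f"CREATE USER {user} WITH PASSWORD '{password}';",   # dedicated user
        f"GRANT datafrey_role TO {user};",
        f"GRANT USAGE ON SCHEMA {schema} TO datafrey_role;",
    ]
    # SELECT only, and only on the tables you choose to expose
    stmts += [f"GRANT SELECT ON {schema}.{t} TO datafrey_role;" for t in tables]
    return stmts

for stmt in readonly_user_sql("datafrey_user", "change-me", "public", ["orders", "customers"]):
    print(stmt)
```

Keeping the grants on a role rather than the user lets you revoke or rotate the user without re-auditing table permissions.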
Your account
The DataFrey API and its clients (MCP Server, CLI) use WorkOS for authorization.
API
- Short-lived RS256 JWTs verified against WorkOS JWKs
- Cryptographic tenant isolation. One user cannot access another user’s credentials.
- Identity via WorkOS. SOC2-compliant identity provider. No passwords handled by DataFrey.
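The claim checks on such a token can be sketched with the standard library alone. This is a toy illustration: the claim names are assumptions, and the RS256 signature verification against the WorkOS JWKS, which the API performs on every request, needs an RSA library and is deliberately not shown.

```python
import base64, json, time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def decode_segment(seg: str) -> dict:
    pad = "=" * (-len(seg) % 4)                 # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(seg + pad))

def check_claims(token: str, now: float) -> dict:
    """Parse a JWT and enforce alg + expiry. Signature verification against
    the issuer's JWKS is a separate, mandatory step NOT shown here."""
    header_seg, payload_seg, _sig = token.split(".")
    if decode_segment(header_seg).get("alg") != "RS256":
        raise ValueError("unexpected algorithm")
    claims = decode_segment(payload_seg)
    if claims["exp"] <= now:                    # short-lived: reject expired tokens
        raise ValueError("token expired")
    return claims                               # e.g. claims["sub"] scopes the tenant

# Build a toy (unsigned) token just to exercise the checks
header = b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "user_123", "exp": time.time() + 3600}).encode())
print(check_claims(f"{header}.{payload}.sig", now=time.time())["sub"])  # user_123
```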
CLI
- Browser-based device flow (RFC 8628). datafrey login hands sign-in to your browser; your DataFrey account password never touches the CLI.
- CLI cannot read your credentials. It’s a thin client for setup and orchestration.
- Short-lived tokens. Access tokens live 1 hour, refresh tokens 30 days.
- Recent-login gate on sensitive actions. Connecting a database requires a fresh authentication.
- Tokens stored in the OS keyring. macOS Keychain / GNOME Keyring / Windows Credential Locker via the keyring library — the CLI refuses to run against plaintext or null backends.
- One-command revocation. datafrey logout wipes both tokens from the keyring immediately.
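Those two lifetimes imply a simple decision the client has to make on each invocation. The sketch below is illustrative, not the CLI’s published logic; only the 1-hour and 30-day figures come from this document.

```python
from datetime import datetime, timedelta, timezone

ACCESS_TTL = timedelta(hours=1)     # access tokens live 1 hour
REFRESH_TTL = timedelta(days=30)    # refresh tokens live 30 days

def next_step(issued_at: datetime, now: datetime) -> str:
    """Decide what a client must do given when its tokens were issued."""
    if now - issued_at < ACCESS_TTL:
        return "use-access-token"   # still fresh: call the API directly
    if now - issued_at < REFRESH_TTL:
        return "refresh"            # silently exchange the refresh token
    return "login"                  # both expired: full browser device flow

t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(next_step(t0, t0 + timedelta(minutes=30)))  # use-access-token
print(next_step(t0, t0 + timedelta(hours=5)))     # refresh
print(next_step(t0, t0 + timedelta(days=45)))     # login
```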
MCP
- OAuth 2.1 with WorkOS. Auth follows the latest MCP spec — PKCE-protected flow delegated to WorkOS, short-lived (1h) RS256 JWTs validated against WorkOS JWKS on every request. No passwords or long-lived secrets.
- Thin, auditable bridge. The MCP server is open-source and makes no database connections or LLM calls and contains no business logic; it only terminates OAuth and forwards your signed JWT to the DataFrey backend.
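The PKCE step of that flow is easy to illustrate: the client generates a random verifier, sends its S256 challenge when starting authorization, and proves possession by presenting the verifier at token exchange (per RFC 7636). This is a generic sketch, not DataFrey’s client code.

```python
import base64, hashlib, secrets

def make_pkce_pair() -> tuple[str, str]:
    """Generate a PKCE code_verifier and its S256 code_challenge (RFC 7636)."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

def server_check(challenge: str, verifier: str) -> bool:
    """What the authorization server verifies at token exchange."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode() == challenge

verifier, challenge = make_pkce_pair()
print(server_check(challenge, verifier))  # True
```

Because only the SHA-256 challenge crosses the wire during authorization, an intercepted authorization code is useless without the original verifier.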
What we store and query
What we store
To plan queries, we build an index of your database. You can limit what’s indexed or opt out. The index may contain:
- Schema: tables, columns, and types from information_schema.
- Aggregations and statistics computed from your data.
- Sample values from your columns. We don’t redact the values.
- Request metadata across API, MCP, and CLI.
- The SQL text of run calls.
- The natural-language questions you send to plan.
- For plan only: the internal queries the agent issues, their results, and the final answer.
- Result rows returned by run.
- Credentials.
What we query
Every query uses your read-only database user and passes the regex guardrail.
- MCP queries. Your MCP client generates SQL and calls the run tool. Most clients ask you to approve each call.
- Plan queries. When you call plan, it runs whatever read queries it needs to answer your question. Individual queries are not surfaced for approval.
- Index queries. Building the index runs read queries against your database. plan and the index work together, so calling plan may index additional data along the way.
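DataFrey doesn’t publish its guardrail, but a minimal regex-based read-only check might look like the sketch below. The accepted prefixes and rejected keywords are assumptions; in practice the read-only database user, not the regex, is the real enforcement layer.

```python
import re

# Hypothetical guardrail: one read-only statement, no data-modifying keywords.
_READ_ONLY = re.compile(r"^\s*(SELECT|WITH)\b", re.IGNORECASE)
_FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|GRANT|REVOKE|COPY)\b",
    re.IGNORECASE,
)

def passes_guardrail(sql: str) -> bool:
    stmt = sql.strip().rstrip(";")
    if ";" in stmt:                 # multiple statements: reject outright
        return False
    return bool(_READ_ONLY.match(stmt)) and not _FORBIDDEN.search(stmt)

print(passes_guardrail("SELECT * FROM orders"))     # True
print(passes_guardrail("DROP TABLE orders"))        # False
print(passes_guardrail("SELECT 1; DELETE FROM t"))  # False
```

A regex check like this is defense in depth: it catches obvious mutations early, while the database user’s SELECT-only grants guarantee nothing destructive can run even if a crafted query slips past the pattern.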
Controls for sensitive data
Opt out of indexing (all data sensitive)
Use this if: you don’t want DataFrey to store any schema or sample data, or send it to OpenAI. Only your credentials are kept.
Benefit: no schema or samples stored; nothing sent to OpenAI.
During Onboarding
Answer “No” when datafrey prompts you to build the index.
After Onboarding
Restrict access (some data sensitive)
Use this if: you want plan to work, but some tables or columns must never be indexed or sent to OpenAI. Enforce this at the database-user level so DataFrey physically cannot read them.
Benefit: schema and samples for sensitive tables never leave your database.
Remove access to sensitive data
Revoke the DataFrey role’s access to sensitive tables or schemas, then re-sync the index.
For column-level protection, apply a masking policy to the sensitive columns.
After revoking, run datafrey index to drop stale entries.
Give access only to information schema
Restrict DATAFREY_ROLE to INFORMATION_SCHEMA only. The indexer collects table names, column names, and data types from the catalog, but cannot sample any application data. INFORMATION_SCHEMA is readable by any role with USAGE on the database; no further grants are needed.
Subprocessors
Third-party services that may receive data from DataFrey.
Amazon Web Services
Cloud infrastructure provider. All DataFrey services run on AWS in us-east-1: API, index, and credential storage. aws.amazon.com
WorkOS
Identity provider. Handles authentication across DataFrey (API, MCP, CLI). SOC 2–compliant and trusted by OpenAI, Anthropic, and Snowflake. workos.com
OpenAI
LLM provider. The agent may send database information (schema, aggregations, sample values) during plan and indexing. We don’t currently have Zero Data Retention; opt out of the index to stop any data from reaching OpenAI. openai.com · data privacy · trust portal
Reporting a vulnerability
Email slava+security@datafrey.ai. We aim to acknowledge reports within 48 hours.
Feedback
Tell us what's missing
Security is active work — MFA and Zero Data Retention are next, with compliance work to follow. Tell us what you need next.