Natural language to SQL sounds simple. You type a question, the model generates a query, the database returns an answer. Clean. Fast. Powerful.
And genuinely dangerous if you don't design the safety architecture before you design the feature.
A client recently asked me: "We're building an AI agent that can execute SQL queries against our production database based on natural language questions from users. How should we architect this?"
My answer started with the constraints. Not the capability. Because here's what I've found in practice: the constraints are the design. If you bolt safety on afterward, you'll spend months undoing decisions that felt reasonable at the time.
Six Non-Negotiable Safety Controls
Before any code gets written, these six controls need to be settled. Not negotiated. Not "we'll add them later." Settled.
1. Read-only access against a replica. There is almost no scenario where an LLM-generated SQL query should have write access to a production database. If users need to modify data, that goes through a separate, human-approved workflow. Not through the agent. A replica also protects you from a runaway query taking down your live system.
2. Query allowlisting. Don't let the model generate arbitrary SQL. Constrain the entire surface area:
- Allowed operations: SELECT only
- Allowed tables: an explicit list. The agent should never touch users, passwords, auth_tokens, or any PII-heavy table unless your access controls are airtight
- Allowed joins: define which table relationships are permitted, and block everything else
3. Result size limits. Cap results at a reasonable ceiling. Say, 1,000 rows. An LLM can accidentally generate SELECT * FROM transactions on a 50-million-row table and take down your replica before anyone notices. Hard limits aren't optional.
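The cap has to be enforced in code on every query, not trusted to the model to include a LIMIT clause. A minimal sketch (function and table names are illustrative; pair this with a statement timeout on the database side to protect the replica itself):

```python
import sqlite3

MAX_ROWS = 1_000  # hard ceiling, enforced in code, not in the prompt

def execute_capped(conn, sql, max_rows=MAX_ROWS):
    """Run a read query but never materialize more than max_rows rows."""
    cur = conn.execute(sql)
    rows = cur.fetchmany(max_rows)            # pulls at most the ceiling
    truncated = cur.fetchone() is not None    # was there more data?
    return rows, truncated

# Demo against an in-memory table with 5,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5000)])

rows, truncated = execute_capped(conn, "SELECT n FROM t")
print(len(rows), truncated)  # 1000 True
```

Returning the `truncated` flag lets the result formatter tell the user their answer was cut off instead of silently presenting partial data as complete.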
4. A programmatic query validation layer. This is the critical safety gate, and it needs to be deterministic code, not another LLM call. Before execution, parse the generated SQL and check it against your allowlist in code. This is not "ask the model if this query looks safe." This is a rule-based check that either passes or blocks the query. Those are two very different things.
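As a minimal sketch of such a gate, using only keyword and allowlist checks from the standard library (a production validator should parse the SQL into an AST with a real parser such as sqlglot rather than regexes, and the table names here are hypothetical, but the shape is the same: extract referenced tables, compare to the allowlist, fail closed on anything unexpected):

```python
import re

ALLOWED_TABLES = {"orders", "products", "order_items"}  # explicit allowlist
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|grant|truncate|attach|pragma)\b",
    re.IGNORECASE,
)

def validate_query(sql: str) -> tuple[bool, str]:
    """Deterministic, rule-based gate. Returns (ok, reason)."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        return False, "multiple statements"
    if not stripped.lower().startswith("select"):
        return False, "only SELECT is allowed"
    if FORBIDDEN.search(stripped):
        return False, "forbidden keyword"
    tables = {
        m.lower()
        for m in re.findall(r"\b(?:from|join)\s+([a-zA-Z_]\w*)", stripped, re.IGNORECASE)
    }
    if not tables:
        return False, "no recognizable table reference"
    unknown = tables - ALLOWED_TABLES
    if unknown:
        return False, f"table not in allowlist: {sorted(unknown)}"
    return True, "ok"

print(validate_query("SELECT id FROM users"))  # blocked: users is not allowlisted
```

Notice that every branch fails closed: an empty query, an unparseable query, or an unfamiliar table all block execution rather than falling through.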
5. Audit logging. Log everything: the natural language input, the generated SQL, the user who requested it, and the result set size. Every time. This is non-negotiable for compliance and for debugging when something unexpected happens. And something will.
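One structured record per request is enough; JSON lines are easy to ship to whatever log store compliance already uses. A minimal sketch (field names are illustrative):

```python
import datetime
import json
import logging

audit = logging.getLogger("sql_agent.audit")

def log_query(user_id, question, generated_sql, row_count, blocked=False):
    """Emit one structured audit record per request, every time."""
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user_id": user_id,
        "question": question,
        "generated_sql": generated_sql,
        "row_count": row_count,
        "blocked": blocked,
    }
    audit.info(json.dumps(record))
    return record

rec = log_query("u-42", "total sales last week?", "SELECT ...", row_count=12)
```

Log blocked queries too: the validator's rejections are exactly the data you need when deciding whether the allowlist is too tight or the model is drifting.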
6. Sandboxed execution with a dedicated service account. The database connection should use a purpose-built service account with the minimum permissions required. Not your application's main credentials. Scope it down to exactly what the agent needs, and nothing more.
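In Postgres terms, the scoped-down account might look like this sketch (role, schema, and table names are illustrative, and the timeout value is a starting point, not a recommendation):

```sql
-- Dedicated service account for the agent; never reuse app credentials.
CREATE ROLE agent_reader LOGIN PASSWORD '...';
REVOKE ALL ON SCHEMA public FROM agent_reader;

-- Grant SELECT only on the explicitly allowlisted tables.
GRANT USAGE ON SCHEMA reporting TO agent_reader;
GRANT SELECT ON reporting.orders, reporting.products TO agent_reader;

-- Defense in depth: every session is read-only and time-bounded.
ALTER ROLE agent_reader SET default_transaction_read_only = on;
ALTER ROLE agent_reader SET statement_timeout = '5s';
```

This duplicates the validator's table allowlist at the database layer on purpose: if either control fails, the other still holds.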
The Architecture
With those controls in place, here's the flow I'd recommend:
```
User question
    |
[Intent classification]
    Is this a data question, or a general question?
    |
    +-- General question --> answer directly, skip the SQL pipeline
    |
    +-- Data question
        |
        [SQL generation]
            Model generates SQL from natural language
        |
        [Query validator]
            Programmatic check: allowed tables, operations, joins
        |
        +-- Fails validation --> return an error to the user
        |
        +-- Passes validation
            |
            [Execution engine]
                Read-only replica, dedicated service account, row limit
            |
            [Result formatter]
                Model summarizes results in natural language
            |
            User response
```
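The flow above can be sketched end to end. The classify, generate, and summarize steps below are trivial stand-ins for model calls (the keyword heuristic is a placeholder, not a real intent classifier); the point is the control flow, where every step can short-circuit:

```python
def classify_intent(question: str) -> str:
    # Stand-in for a model call: route obvious data questions to SQL.
    data_cues = ("how many", "total", "average")
    return "data" if any(w in question.lower() for w in data_cues) else "general"

def handle(question, generate_sql, validate, execute, summarize):
    """Orchestrate the pipeline; blocked queries never reach execution."""
    if classify_intent(question) != "data":
        return summarize(question, rows=None)
    sql = generate_sql(question)
    ok, reason = validate(sql)
    if not ok:
        return f"Query blocked: {reason}"
    rows = execute(sql)
    return summarize(question, rows=rows)

# Wire it up with trivial stubs to show a bad generation being stopped.
result = handle(
    "how many orders last week?",
    generate_sql=lambda q: "DELETE FROM orders",  # a bad generation
    validate=lambda s: (s.lstrip().lower().startswith("select"), "only SELECT allowed"),
    execute=lambda s: [],
    summarize=lambda q, rows: f"{len(rows)} rows" if rows is not None else "general answer",
)
print(result)  # Query blocked: only SELECT allowed
```

Passing the validator and executor in as functions keeps them independently testable, which matters for the gate you are betting the system on.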
Two steps deserve special attention.
Intent classification. You don't want every user message routing into the SQL pipeline. A question like "what does this field mean?" doesn't need a database query. Classifying intent upfront keeps the system efficient and reduces unnecessary exposure.
Query validator. This is the gate that everything else depends on. If the generated SQL doesn't pass a deterministic code check against your allowlist, it doesn't execute. Full stop. No exceptions. No "well, it looks probably fine."
Crawl, Walk, Run
I used to think you could prototype this kind of feature loosely and harden it once you understood the use cases better. In practice, the opposite is true. The use cases you don't anticipate are the ones that will expose you.
So start narrow. Pick two or three specific business questions your users actually need to answer. Build the allowlist around exactly those queries. Validate obsessively. Log everything. Expand only after you've seen the system behave under real usage.
- Crawl: A small allowlist, two or three supported query patterns, and manual review of every generated SQL in staging before production.
- Walk: Expand the allowlist based on validated use cases. Add monitoring and alerting on query patterns and result sizes.
- Run: Broader query support, user-level access controls, and automated anomaly detection on the audit log.
Constraints First, Then Capability
The capability is real. The risks are real. Most teams focus on getting the SQL generation right and treat the safety layer as an afterthought. That's backwards.
Design the constraints first. Then build the feature inside them. The system you end up with will be smaller, safer, and more trustworthy than one you tried to lock down after the fact.
What's the most constrained version of this you could ship first? Start there.