Multi-Agent AI for SQL Databases: Turning Natural Language into Business Insights

Following on from Veeam Built the AI Revolution While Others Prepared, this post shares a specific project: how I built AI agents that can transform natural language questions into intelligent database interactions.

Key Takeaways:

  • SQL AI Agents make databases conversational: Multi-agent systems translate plain-English business questions into accurate SQL queries, bridging the gap between technical teams and business users.
  • Multi-agent architecture solves LLM weaknesses: Instead of relying on a single model, specialized agents (planning, SQL writing, review, correction, visualization, PII redaction) collaborate to improve accuracy, error recovery, and compliance.
  • Privacy is built in, not bolted on: PII/PHI detection and real-time redaction ensure sensitive data can be queried and analyzed without violating compliance requirements.
  • Database agnostic by design: The system works across SQL Server, PostgreSQL, and other SQL databases using schema introspection and modular prompts. This makes it ideal for use with Instant Database Recovery, or with file-based databases such as SQLite exposed from backups by the Data Integration API.
  • Enterprise-ready with transparency: Every response includes full workflow traces, visualizations, and explanations, ensuring visibility for business users and debugging clarity for technical teams.

The Problem I Wanted to Solve

Much of my focus on unlocking value from enterprise data has been on unstructured content, including documents, emails, and various file formats. But there’s immense untapped value in our structured enterprise databases, both current and historical. The challenge is the data’s accessibility. Business users need insights but lack SQL expertise, while technical teams understand databases but may miss the business context behind the questions.

Working with Veeam’s capabilities, I realized we had something powerful. Through instant database recovery, we can present complete SQL Server databases directly from backup storage without data copying. For file-based databases, the Data Integration API can present datasets instantly over the network. This creates an interesting opportunity: querying decades of business data as easily as accessing today’s production systems.

Along the way, I began thinking about compliance and regulatory requirements. Database data can often contain sensitive information like personally identifiable information (PII) or protected health information (PHI). I wanted to ensure the system could handle the identification and redaction of this in real time using an AI agent. All of this had to come together in a cohesive interface that could handle both the complexity of database interactions and intelligent privacy protection. So, I decided to build one.

My Approach: Why I Chose Multi-Agent Architecture

When I started building this, I quickly realized that asking a single large language model to handle everything, from understanding intent to generating SQL to validating results, was too fragile. I’d get syntax errors, hallucinated table names, or queries that were technically correct but missed the user’s actual intent.

So, I broke down the problem differently. Instead of one AI trying to do everything, I built a team of specialized agents, each focused on a specific aspect of the problem. It’s similar to how I’d approach a complex data question in real life. I’d think through what I’m trying to achieve, write the SQL carefully, check if it worked, and fix any issues.

How I Designed the Architecture

The key insight I had was that orchestration matters more than individual AI capability when dealing with complex, error-prone tasks. I built a central workflow that manages specialized agents, and each agent communicates through structured JSON outputs. This lets the system make intelligent decisions and self-correct when things go wrong.

What I discovered is that when you break down database interaction into specific roles, you can optimize each agent’s prompt and behavior much more effectively than trying to make one agent handle everything. Plus, when something fails, I can see exactly where and why, rather than getting a generic “something went wrong” response.
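As a rough illustration of that pattern, here is a minimal sketch of how I think about one agent step: the agent gets its role prompt plus shared context, and must answer in strict JSON so the orchestrator can route on fields rather than parse prose. The call_llm function is a hypothetical stand-in for whatever model API is used; the structured contract is the point, not the client.

```python
import json


def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical stand-in for your LLM client (hosted API, local model, etc.)."""
    raise NotImplementedError("wire this up to your model provider")


def run_agent(role_prompt: str, context: dict) -> dict:
    """Run one specialized agent and force its reply into structured JSON.

    Each agent receives its role prompt plus the shared context
    (schema, prior results, feedback) and must answer with JSON,
    so the workflow can make decisions on fields instead of prose.
    """
    raw = call_llm(role_prompt, json.dumps(context))
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Surface malformed output as a structured error the workflow can act on.
        return {"status": "error", "reason": "agent returned non-JSON output", "raw": raw}
```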

The Agents I Built

Each agent I created handles a specific aspect of turning business questions into database insights. Here’s how I approached each one and what I learned:

Planning Agent – I needed something to figure out what users actually want before diving into SQL. I prompt this agent to classify requests into specific types like “query_data” or “visualize_data” and output structured plans.

What I found is that getting this initial classification right makes everything downstream much more reliable.
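To make that concrete, here is the kind of structured plan the Planning Agent emits; the field names are illustrative rather than a fixed schema:

```python
# Illustrative Planning Agent output for
# "Show me revenue by region for 2019 as a bar chart"
plan = {
    "request_type": "visualize_data",   # or "query_data"
    "question": "Revenue by region for 2019",
    "tables_of_interest": ["orders", "regions"],
    "needs_visualization": True,
    "notes": "Aggregate order totals by region, filter to 2019",
}
```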

SQL Writing Agent – I give this agent the specific database type and database-specific syntax examples. Knowing whether it’s working with PostgreSQL, SQL Server, or SQLite is critical for reducing retries and getting SQL syntax correct more often.

The key insight was making it learn from feedback. When a query fails, it gets specific guidance on what went wrong and can adjust its approach accordingly.
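A simplified sketch of how such a prompt can be assembled, assuming a feedback string carried over from a failed attempt (function and field names are illustrative):

```python
def build_sql_prompt(question: str, schema: str, dialect: str, feedback: str | None = None) -> str:
    """Assemble the SQL Writing Agent prompt with dialect hints and prior feedback."""
    prompt = (
        f"You write {dialect} SQL.\n"
        f"Use only tables and columns from this schema:\n{schema}\n\n"
        f"Question: {question}\n"
        'Answer with JSON: {"sql": "..."}'
    )
    if feedback:
        # Feed the Review/Correction agents' findings back into the next attempt.
        prompt += f"\n\nYour previous attempt failed. Fix this issue: {feedback}"
    return prompt
```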

Review Agent – This checks whether the SQL actually answered the user’s question.

This agent looks at the results and decides if we succeeded or need to try again. When it says “retry,” it provides specific feedback that the SQL Writing Agent can actually use.
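The verdict is, again, just structured JSON, along these lines (fields are illustrative):

```python
# Two illustrative Review Agent verdicts
review_pass = {"decision": "accept", "reason": "Result answers the question"}
review_retry = {
    "decision": "retry",
    "reason": "Query returned all years; the user asked for 2019 only",
    "feedback": "Add a WHERE clause filtering order_date to 2019",
}
```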

Table and Column Correction Agents – These were born out of frustration with typos and natural language ambiguity breaking everything. When someone types “companys” instead of “companies,” these agents kick in to suggest the right name based on what’s actually available. But they go deeper than simple typo correction: they handle semantic mapping too.

For example, when a user asks for “id of the customer” and the actual database field is “customer_id” or “CustomerID,” these agents use the available schema to suggest the correct column name. It’s a small thing, but it makes the system much more forgiving and bridges the gap between how people naturally describe data and how it’s actually stored in databases.
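A minimal sketch of that correction idea, using simple fuzzy matching against the introspected schema (in the actual system an agent makes the call with the schema in its context, but the effect is similar for plain typos):

```python
from difflib import get_close_matches


def suggest_name(requested: str, available: list[str]) -> str | None:
    """Suggest the closest real table or column name for a typo.

    Handles cases like "companys" -> "companies"; semantic mappings such as
    "id of the customer" -> "customer_id" are left to the correction agents,
    which see both the schema and the user's wording.
    """
    lookup = {name.lower(): name for name in available}
    matches = get_close_matches(requested.lower(), list(lookup), n=1, cutoff=0.6)
    return lookup[matches[0]] if matches else None


print(suggest_name("companys", ["companies", "customers", "orders"]))  # -> companies
```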

PII Redaction Agent – As I started thinking about real enterprise data, privacy protection couldn’t be an afterthought. This agent scans results for personally identifiable information and creates mappings for redaction. The challenge was preserving the usefulness of data while protecting sensitive information.

Visualization Planner Agent – Rather than guessing at chart types, I built this to analyze the actual data and determine what visualization makes sense.

It considers the data characteristics and outputs structured plans for creating meaningful charts.

Visualization Writer Agent – This one bridges the gap between raw charts and business insights, explaining what the data actually means in plain language.

The Tools I Built to Support the Agents

While the agents provide the intelligence, I needed several reliable tools to handle the actual database work and privacy protection:

Schema Inspector – This was crucial for making the system work with any database. Instead of hardcoding table structures, I use SQLAlchemy’s introspection to dynamically discover what’s actually in each database. It queries the database directly to get current table names, columns, and data types.

This means the same system works whether you’re connecting to SQLite, PostgreSQL, or SQL Server.
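The core of the inspector is SQLAlchemy's inspect() API; a minimal sketch, with a placeholder connection string:

```python
from sqlalchemy import create_engine, inspect

# Placeholder connection string; any SQLAlchemy-supported database works the same way.
engine = create_engine("sqlite:///example.db")
inspector = inspect(engine)

schema = {
    table: [
        {"name": column["name"], "type": str(column["type"])}
        for column in inspector.get_columns(table)
    ]
    for table in inspector.get_table_names()
}

# The agents receive this dictionary as their database_schema context.
print(schema)
```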

SQL Executor – I built this to safely run the generated SQL and return results in a format the agents can work with. The tricky part was handling complex data types like datetime objects properly, making sure everything serializes to JSON without breaking.
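A simplified version of that executor, assuming SQLAlchemy and leaning on default=str so datetimes, decimals, and similar types serialize cleanly:

```python
import json

from sqlalchemy import text
from sqlalchemy.engine import Engine


def execute_sql(engine: Engine, sql: str, limit: int = 1000) -> str:
    """Run the generated SQL and return a JSON payload the agents can consume."""
    with engine.connect() as conn:
        result = conn.execute(text(sql))
        rows = [dict(row._mapping) for row in result.fetchmany(limit)]
    # default=str turns datetimes, Decimals, UUIDs, etc. into strings.
    return json.dumps({"row_count": len(rows), "rows": rows}, default=str)
```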

Visualization Generator – This takes the structured plans from the agents and actually creates the charts using matplotlib. I save the images to a designated folder and return URLs so they can be easily embedded in applications.
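A minimal sketch of that generator, assuming the plan arrives from the Visualization Planner as a small dict and that saved files are served from a /static/charts/ path (both are assumptions for illustration):

```python
import uuid
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless rendering for server-side use
import matplotlib.pyplot as plt

CHART_DIR = Path("static/charts")


def render_chart(plan: dict, rows: list[dict]) -> str:
    """Render the planned chart and return a URL path to the saved image."""
    CHART_DIR.mkdir(parents=True, exist_ok=True)
    x = [row[plan["x"]] for row in rows]
    y = [row[plan["y"]] for row in rows]

    fig, ax = plt.subplots()
    if plan.get("chart_type") == "bar":
        ax.bar(x, y)
    else:
        ax.plot(x, y)
    ax.set_title(plan.get("title", ""))
    ax.set_xlabel(plan["x"])
    ax.set_ylabel(plan["y"])

    filename = f"{uuid.uuid4().hex}.png"
    fig.savefig(CHART_DIR / filename)
    plt.close(fig)
    return f"/static/charts/{filename}"


# Illustrative plan and rows, as the planner and executor might supply them
plan = {"chart_type": "bar", "x": "region", "y": "revenue", "title": "Revenue by region, 2019"}
rows = [{"region": "EMEA", "revenue": 120.0}, {"region": "APAC", "revenue": 95.5}]
print(render_chart(plan, rows))  # e.g. /static/charts/3f2a....png
```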

PII Anonymizer – I integrated Microsoft Presidio for the actual redaction work. This tool takes the mappings from the PII Redaction Agent and intelligently replaces sensitive information with appropriate placeholders, keeping the data useful for analysis while protecting privacy.
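A minimal Presidio sketch showing the analyze-then-anonymize flow (the sample text and the blanket replacement strategy are illustrative; the real redaction agent decides what to look for based on the result set):

```python
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import OperatorConfig

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "Contact Jane Doe at jane.doe@example.com or 555-010-1234."

# Detect PII entities in the text.
findings = analyzer.analyze(text=text, language="en")

# Replace each finding with a placeholder instead of dropping the whole row.
redacted = anonymizer.anonymize(
    text=text,
    analyzer_results=findings,
    operators={"DEFAULT": OperatorConfig("replace", {"new_value": "<REDACTED>"})},
)
print(redacted.text)
```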

How I Orchestrated Everything Together

Getting all these agents to work together was the real challenge.

Here’s the workflow I designed and what I learned along the way:

1. Schema Discovery – I start every query by getting a lightweight scan of available tables. This gives the agents context about what they’re working with before they start making assumptions.

2. Strategic Planning – The Planning Agent looks at the user’s question alongside the table list and figures out what approach to take. This early classification prevents a lot of downstream confusion.

3. Smart Execution with Error Recovery – This is the part I spent the most time getting right:

For Data Queries: I built a retry loop where the SQL Writing Agent generates queries, and when they fail, I immediately bring in specialized correction agents. If it’s a table name error, the Table Correction Agent suggests fixes.

For column errors, the Column Correction Agent looks at the specific table schema. This targeted approach fixes fundamental issues quickly instead of just retrying the same broken query (a simplified sketch of this loop appears after this workflow).

For Visualizations: I use the same error-resistant approach to get the data first, then the Visualization Planner figures out the best way to present it.

4. Privacy Protection – When I enable sensitive data protection, the PII Redaction Agent scans results automatically. The PII Anonymizer then handles the actual redaction. What I found interesting is that you can preserve most of the insights while protecting individual privacy.

5. Complete Transparency – Every response includes a detailed trace of what happened. “Black box” AI systems don’t work well in enterprise settings. People need to understand and trust what the system is doing.
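For completeness, here is the simplified sketch of the error-recovery loop from step 3 referenced above. The helper functions are hypothetical wrappers around the agents and tools described earlier, shown as stubs:

```python
# Hypothetical wrappers around the agents and tools described earlier;
# each would call the corresponding agent or the SQL Executor under the hood.
def write_sql(question, schema, feedback): ...
def review(question, sql, result): ...
def correct_table_name(error_message, schema): ...
def correct_column_name(error_message, schema): ...
def execute_sql(engine, sql): ...

MAX_ATTEMPTS = 3


def answer_question(question: str, schema: dict, engine) -> dict:
    """Generate, execute, review, and (if needed) repair SQL for one question."""
    feedback = None
    for attempt in range(MAX_ATTEMPTS):
        sql = write_sql(question, schema, feedback)           # SQL Writing Agent
        try:
            result = execute_sql(engine, sql)                 # SQL Executor tool
        except Exception as err:                              # simplified error triage
            message = str(err).lower()
            if "table" in message:
                feedback = correct_table_name(str(err), schema)   # Table Correction Agent
            elif "column" in message:
                feedback = correct_column_name(str(err), schema)  # Column Correction Agent
            else:
                feedback = f"Execution error: {err}"
            continue

        verdict = review(question, sql, result)               # Review Agent
        if verdict["decision"] == "accept":
            return {"sql": sql, "result": result, "attempts": attempt + 1}
        feedback = verdict["feedback"]

    return {"error": "Could not produce a satisfactory query", "last_feedback": feedback}
```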

Built for Adaptability: Database-Agnostic Design

Rather than hardcoding database-specific knowledge, the system dynamically adapts to any SQL database through runtime schema discovery. Agent prompts are designed to work with provided context rather than assumptions, making the system highly extensible and maintainable. This means the same agents that work with a SQLite database can seamlessly handle PostgreSQL, SQL Server, or MySQL without modification. The intelligence adapts to the data, not the other way around.

Enterprise Integration: API-Driven Intelligence

The multi-agent system exposes a simple API endpoint that accepts natural language queries and returns structured responses including SQL results, visualizations, and complete workflow traces.

This API-first design allows the database intelligence capabilities to be integrated into existing business applications, dashboards, custom interfaces, or automation workflows—extending this natural language database superpower across the entire enterprise ecosystem.
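As an illustration of that API-first shape, a minimal endpoint could look like the following; the framework (FastAPI), route, and field names are my assumptions rather than the actual implementation:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class QueryRequest(BaseModel):
    question: str
    redact_pii: bool = True


@app.post("/query")
def query(request: QueryRequest) -> dict:
    """Accept a natural language question and return results plus the full trace."""
    # In the real system this would invoke the orchestration loop sketched earlier;
    # the response shape below is what integrators build against.
    return {
        "question": request.question,
        "sql": "SELECT ...",                # the query the agents settled on
        "rows": [],                         # redacted result rows
        "visualization_url": None,          # set when a chart was produced
        "explanation": "Plain-language summary of the result",
        "workflow_trace": [],               # every agent step, input, output, decision
    }
```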

Transparent Results: Complete Visibility into AI Decision-Making

To ensure the system isn’t a “black box,” every API response provides comprehensive detail beyond just the final result. The response includes SQL queries, data results, visualizations, and explanations alongside a complete workflow trace that logs every significant step the system took.

This workflow trace is a structured record of agent interactions, including their inputs, outputs, and decision points.

For business users, it provides confidence in the system’s reasoning.

For technical teams, it offers invaluable debugging capabilities and insight into how the multi-agent coordination actually works.

This transparency builds trust while enabling continuous improvement of the system’s performance.

Generic by Design

The overarching theme of this project is genericity.

Database Agnostic: By dynamically inspecting the database schema using SQLAlchemy’s introspection capabilities, the system can adapt to any underlying SQL database (SQLite, MySQL, PostgreSQL, MSSQL, etc.) without needing code changes. The agents are prompted with the actual schema, not a hardcoded one.

Adaptive Prompts: The system prompts for each agent are designed to be generic, focusing on the agent’s role and the structured input/output. They don’t contain specific table or column names but rather instruct the agents to use the provided dynamic context (like database_schema or sql_result) to formulate their responses.

Modular Agents: The separation into specialized agents (planning, writing, reviewing, visualizing) means each component can be independently improved or swapped out. This modularity makes the system highly extensible and maintainable.

From Concept to Reality: Compliant Enterprise Data Intelligence

When integrated with modern data management platforms, this AI agent system transforms both data accessibility and privacy protection.

The workflow becomes elegantly powerful:

1. Instant Access: Historical databases become available directly from backup storage

2. Natural Language Query: Business users describe their needs in plain English

3. Intelligent Translation: The multi-agent system converts intent to precise SQL with error recovery

4. Smart Execution: Queries run against instantly mounted historical data with built-in resilience

5. Privacy Protection: Sensitive data is automatically detected and redacted while preserving analytical value

6. Business-Ready Results: Insights are delivered with visualizations, clear explanations, and complete transparency

This approach eliminates traditional barriers between business questions and database answers while ensuring compliance with privacy regulations. Decision-makers can now explore decades of enterprise intelligence—including sensitive historical data—with confidence that personally identifiable information is automatically protected without losing the analytical insights they need.

What I Discovered About AI Agents and Data

Working on this project showed me that we’re at an interesting point where AI agents and large language models are mature enough to handle real enterprise challenges, including the sophisticated privacy requirements that come with sensitive data. Building this SQL AI Agent taught me how specialized AI can unlock decades of structured data—including sensitive historical information that was previously locked away in backups—and make it accessible through simple conversations.

What really excites me is combining this with platforms like Veeam that can present historical databases instantly, plus privacy technologies that automatically protect sensitive information while keeping the useful data intact. We’re not just building better tools—we’re changing what’s possible when you can ask any question about your organization’s entire data history and get compliant, trustworthy answers.

The technology pieces are all coming together. The multi-agent architecture works. The privacy protection is built-in. Now it’s about figuring out what questions to ask and what insights are waiting to be discovered.

What would you want to know about your organization’s data if you could ask anything, knowing that sensitive information would be automatically protected?

Ready to strengthen your own database resilience? Explore the Veeam Data Platform for Backup and Recovery to see how you can protect, manage, and instantly recover your SQL workloads — with enterprise-grade security and simplicity.
