Natural Language SQL Query Engine is an intelligent analytics system that converts plain English business questions into safe, executable SQL queries and automatically generates insights with visualizations.
Instead of writing SQL manually, users can simply ask:
- "Show sales by region"
- "Top products by revenue"
- "Average order value"
The system interprets intent, generates secure SQL queries, executes them on the database, and displays structured results with charts.
In many organizations, non-technical stakeholders cannot write SQL queries. They depend on data teams for even basic insights, leading to:
- Slower decision-making
- Analyst bottlenecks
- Reporting delays
- Communication gaps
There is a need for a self-service analytics interface where business users can interact with data directly using natural language.
Build a secure and reliable Natural Language to SQL engine that:
- Accepts business questions in English
- Extracts intent, metric, and dimensions
- Generates structured SQL queries
- Validates queries for safety
- Executes them on the database
- Automatically visualizes results
All without requiring SQL knowledge.
- User enters question
- NLP logic detects:
  - Intent (aggregation, ranking, trend, average)
  - Metric (sales, revenue, quantity)
  - Dimension (region, product, date)
  - Filters (time conditions)
- SQL query is generated
- Validator checks for unsafe commands
- Database executes safe query
- Results are returned
- Automatic visualization is created
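
The detection step above can be sketched with plain keyword rules. This is only an illustration: the keyword tables, the `ParsedQuestion` class, and `parse_question` are hypothetical names, not taken from `ai_sql_generator.py`, and the default intent of "aggregation" is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword tables; the project's real vocabulary may differ.
INTENT_KEYWORDS = {
    "ranking": ("top", "highest", "best"),
    "trend": ("trend", "over time"),
    "average": ("average", "avg", "mean"),
}
METRICS = ("revenue", "sales", "quantity")
DIMENSIONS = ("region", "product", "date")
TIME_FILTERS = ("last month", "this year")


@dataclass
class ParsedQuestion:
    intent: str
    metric: Optional[str]
    dimension: Optional[str]
    time_filter: Optional[str]


def _first_match(text, candidates):
    """Return the first candidate found in text as a whole word or phrase (plural allowed)."""
    for candidate in candidates:
        if re.search(rf"\b{re.escape(candidate)}s?\b", text):
            return candidate
    return None


def parse_question(question: str) -> ParsedQuestion:
    text = question.lower()
    # Assumption: fall back to a plain aggregation when no other intent keyword appears.
    intent = "aggregation"
    for name, keywords in INTENT_KEYWORDS.items():
        if _first_match(text, keywords):
            intent = name
            break
    dimension = _first_match(text, DIMENSIONS)
    if intent == "trend" and dimension is None:
        dimension = "date"  # a trend is plotted over time by default
    return ParsedQuestion(
        intent=intent,
        metric=_first_match(text, METRICS),
        dimension=dimension,
        time_filter=_first_match(text, TIME_FILTERS),
    )
```

For example, `parse_question("Top products by revenue")` yields a ranking intent with metric `revenue` and dimension `product`. Questions outside the keyword tables return `None` for the missing fields, which a caller would need to handle.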

    User Question
         |
         v
    Intent Detection (Rule-based NLP)
         |
         v
    SQL Query Generator
         |
         v
    Query Validator
         |                |
       Safe            Unsafe
         |                |
         v                v
    Execute Query      Reject
         |
         v
    Fetch Results
         |
         v
    Auto Visualization
         |
         v
    Display

Natural-Language-SQL-Query-Engine/
│
├── app.py # Streamlit user interface
├── ai_sql_generator.py # NLP-based SQL generation logic
├── validator.py # Query security validation
├── snowflake_connector.py # Database connection & execution
├── visualization.py # Automatic chart selection logic
├── schema.sql # Sample dataset schema
├── requirements.txt
└── README.md
This project uses structured rule-based NLP instead of uncontrolled LLM outputs.
It extracts:
- Intent → aggregation / ranking / trend
- Metric → revenue / sales / quantity
- Dimension → region / product / date
- Time filters → last month / this year
Then constructs a valid SQL query using predefined mappings.
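Because every query is assembled from predefined mappings rather than free-form model output, this approach favors: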
- Accuracy
- Control
- Predictable behavior
- Enterprise reliability
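
As a rough sketch of what "predefined mappings" could look like, the function below assembles a query only from whitelisted fragments. The table name `SALES`, the column names, the alias scheme, and `build_sql` itself are all hypothetical (the real names would come from `schema.sql`), and the time filters assume Snowflake-style `DATE_TRUNC` and `DATEADD`.

```python
# Hypothetical schema: one SALES table. Real table and column names may differ.
TABLE = "SALES"
METRIC_COLUMN = {"sales": "AMOUNT", "revenue": "AMOUNT", "quantity": "QUANTITY"}
DIMENSION_COLUMN = {"region": "REGION", "product": "PRODUCT", "date": "ORDER_DATE"}
# Assumption: Snowflake-style date functions.
TIME_FILTER_SQL = {
    "this year": "ORDER_DATE >= DATE_TRUNC('year', CURRENT_DATE)",
    "last month": (
        "ORDER_DATE >= DATEADD(month, -1, DATE_TRUNC('month', CURRENT_DATE)) "
        "AND ORDER_DATE < DATE_TRUNC('month', CURRENT_DATE)"
    ),
}


def build_sql(intent, metric, dimension=None, time_filter=None, limit=5):
    """Assemble a SELECT statement from whitelisted fragments only."""
    if metric not in METRIC_COLUMN:
        raise ValueError(f"unsupported metric: {metric!r}")
    if dimension is not None and dimension not in DIMENSION_COLUMN:
        raise ValueError(f"unsupported dimension: {dimension!r}")
    if time_filter is not None and time_filter not in TIME_FILTER_SQL:
        raise ValueError(f"unsupported time filter: {time_filter!r}")

    agg = "AVG" if intent == "average" else "SUM"
    alias = f"{agg}_{metric.upper()}"
    measure = f"{agg}({METRIC_COLUMN[metric]}) AS {alias}"
    column = DIMENSION_COLUMN.get(dimension)

    clauses = [
        f"SELECT {column}, {measure}" if column else f"SELECT {measure}",
        f"FROM {TABLE}",
    ]
    if time_filter:
        clauses.append(f"WHERE {TIME_FILTER_SQL[time_filter]}")
    if column:
        clauses.append(f"GROUP BY {column}")
        if intent == "ranking":
            clauses.append(f"ORDER BY {alias} DESC LIMIT {int(limit)}")
        elif intent == "trend":
            clauses.append(f"ORDER BY {column}")
    return " ".join(clauses)
```

Since user text never reaches the SQL string directly (only dictionary lookups do), an unrecognized metric or dimension raises an error instead of producing a malformed query.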
User Question:
How many orders has each customer placed?
Generated SQL:

    SELECT
        C.CUSTOMERNAME,
        COUNT(O.ORDERID) AS NumberOfOrders
    FROM CUSTOMERS AS C
    LEFT JOIN ORDERS AS O
        ON C.CUSTOMERID = O.CUSTOMERID
    GROUP BY
        C.CUSTOMERNAME
    ORDER BY
        C.CUSTOMERNAME;

The query validator blocks destructive operations such as:
- DELETE
- DROP
- UPDATE
- ALTER
- INSERT
Only safe SELECT queries are executed.
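
A minimal sketch of such a check follows, assuming a keyword blocklist plus a "single SELECT statement" rule. The function name `is_safe_query` and the extra rules (rejecting comments and multiple statements) are illustrative assumptions, not the actual contents of `validator.py`.

```python
import re

# Keywords listed above as destructive operations.
BLOCKED_KEYWORDS = ("DELETE", "DROP", "UPDATE", "ALTER", "INSERT")
_BLOCKED_RE = re.compile(r"\b(" + "|".join(BLOCKED_KEYWORDS) + r")\b", re.IGNORECASE)
# Matches single-quoted SQL string literals, including '' escapes.
_STRING_LITERAL_RE = re.compile(r"'(?:[^']|'')*'")


def is_safe_query(sql: str) -> bool:
    """Return True only for a single SELECT statement with no blocked keyword."""
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        return False
    # Blank out string literals so words inside quoted values are not flagged.
    stripped = _STRING_LITERAL_RE.sub("''", statement)
    if ";" in stripped:
        return False  # more than one statement
    if "--" in stripped or "/*" in stripped:
        return False  # assumption: comments are rejected outright
    if not re.match(r"select\b", stripped, re.IGNORECASE):
        return False
    return _BLOCKED_RE.search(stripped) is None
```

The word-boundary match means a column such as `UPDATED_AT` is not mistaken for `UPDATE`. This sketch is deliberately strict (for instance, it rejects queries that start with `WITH`), and a keyword filter is best paired with a read-only database role rather than relied on alone.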
The system automatically selects chart type based on result structure:
| Result Type | Visualization |
|---|---|
| Category + Value | Bar Chart |
| Date + Value | Line Chart |
| Single Metric | KPI Display |
No manual plotting required.
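
The mapping in the table could be implemented along these lines. This sketch inspects plain Python rows (a list of dicts) rather than a DataFrame to stay dependency-free; `choose_chart` and the `"table"` fallback for unrecognized shapes are assumptions, not the logic in `visualization.py`.

```python
from datetime import date
from numbers import Number


def choose_chart(rows):
    """Pick a chart type from the shape of the result rows (a list of dicts)."""
    if not rows:
        return "table"  # assumption: nothing to plot, fall back to a table
    values = list(rows[0].values())
    # Single metric: one row with one numeric column.
    if len(rows) == 1 and len(values) == 1 and isinstance(values[0], Number):
        return "kpi"
    if len(values) == 2 and isinstance(values[1], Number):
        # datetime is a subclass of date, so both are covered here.
        if isinstance(values[0], date):
            return "line"  # Date + Value
        if isinstance(values[0], str):
            return "bar"  # Category + Value
    return "table"
```

The returned label would then be dispatched to the matching plotting call, with the first column used as the axis and the second as the value.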
The project is built with:

- Python
- Streamlit
- Pandas
- Plotly
- Snowflake Connector
Install the dependencies and start the app:

    pip install -r requirements.txt
    streamlit run app.py

Example questions to try:

- total sales
- sales by region
- top regions by revenue
- average order value
- sales trend over time
This project enables:
- Self-service analytics
- Faster insight generation
- Reduced analyst dependency
- Real-time decision-making
It functions like a lightweight BI tool powered by natural language input.
- Multi-table joins
- CSV upload support
- Role-based access control
- Natural language explanation of results
- Session memory for saved insights
Natural Language SQL Query Engine demonstrates how NLP, SQL generation, validation logic, and visualization can be combined to create a secure and intelligent analytics assistant for business users.