Building a Text-to-SQL Query Application Using Generative AI


 In this blog post, we will create an exciting end-to-end project: an application that converts natural language text into SQL queries and retrieves data from a SQL database. We'll be using a powerful Generative AI model to achieve this. The project will be implemented from scratch, and by the end, you'll have a functional application capable of understanding and executing SQL queries based on user input.

Project Overview

  1. Setting Up the Environment
  2. Creating the Requirements File
  3. Writing the Code Step-by-Step
  4. Additional Improvements and Considerations


Imagine having a student database and asking the application to retrieve all students' names or filter students based on specific criteria, such as marks greater than 85. The application will convert these natural language queries into SQL queries, execute them on the database, and return the results.

Setting Up the Environment

First, we need to set up a virtual environment to manage our project dependencies.

conda create -n text_to_sql_env python=3.10
conda activate text_to_sql_env

Next, create a .env file to store your API key:

GOOGLE_API_KEY=your_api_key_here

Creating the Requirements File

Create a requirements.txt file with the following content:

streamlit
google-generativeai
python-dotenv
sqlite3

Install the dependencies:

pip install -r requirements.txt

Writing the Code Step-by-Step

Now, let’s start writing the code. Create a file named app.py and add the following code:

Import Libraries and Load Environment Variables

import streamlit as st
from dotenv import load_dotenv
import os
import sqlite3
import google.generativeai as genai

# Load environment variables from .env file
load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

Function to Load the Generative AI Model and Get Responses

def get_gemini_response(question, prompt):
    model = genai.GenerativeModel('gemini-pro')
    response = model.generate_content([prompt[0], question])
    return response.text

Function to Retrieve Query from the Database

def read_sql_query(sql, db):
    conn = sqlite3.connect(db)
    cur = conn.cursor()
    cur.execute(sql)
    rows = cur.fetchall()
    conn.commit()
    conn.close()
    for row in rows:
        print(row)
    return rows

Define Your Prompt

This prompt will instruct the AI on how to behave and convert natural language questions into SQL queries.

prompt = [
    """
    You are an expert in converting English questions to SQL query!
    The SQL database has the name STUDENT and has the following columns - NAME, CLASS, 
    SECTION, and MARKS. For example,
    Example 1 - How many entries of records are present?, 
    the SQL command will be something like this: SELECT COUNT(*) FROM STUDENT;
    Example 2 - Tell me all the students studying in Data Science class?, 
    the SQL command will be something like this: SELECT * FROM STUDENT WHERE CLASS="Data Science";
    The SQL code should not have ``` in the beginning or end and SQL word in the output.
    """
]

Streamlit App Setup

# Initialize Streamlit app
st.set_page_config(page_title="I can Retrieve Any SQL Query")
st.header("Gemini App To Retrieve SQL Data")

# User input and submit button
question = st.text_input("Input: ", key="input")
submit = st.button("Ask the question")

# If submit is clicked
if submit:
    response = get_gemini_response(question, prompt)
    print(response)
    response = read_sql_query(response, "student.db")
    st.subheader("The Response is")
    for row in response:
        print(row)
        st.header(row)

Setting Up the Database

Create another file named setup_db.py to set up the SQLite database and insert some records.

import sqlite3

# Connect to SQLite
connection = sqlite3.connect("student.db")

# Create a cursor object to insert records and create table
cursor = connection.cursor()

# Create the table
table_info = """
CREATE TABLE STUDENT(NAME VARCHAR(25), CLASS VARCHAR(25), SECTION VARCHAR(25), MARKS INT);
"""
cursor.execute(table_info)

# Insert some records
cursor.execute('''INSERT INTO STUDENT VALUES('Krish', 'Data Science', 'A', 90)''')
cursor.execute('''INSERT INTO STUDENT VALUES('Sudhanshu', 'Data Science', 'B', 100)''')
cursor.execute('''INSERT INTO STUDENT VALUES('Darius', 'Data Science', 'A', 86)''')
cursor.execute('''INSERT INTO STUDENT VALUES('Vikash', 'DEVOPS', 'A', 50)''')
cursor.execute('''INSERT INTO STUDENT VALUES('Dipesh', 'DEVOPS', 'A', 35)''')

# Display all the records
print("The inserted records are")
data = cursor.execute('''SELECT * FROM STUDENT''')
for row in data:
    print(row)

# Commit your changes in the database
connection.commit()
connection.close()

Run the setup_db.py file to set up your database:

python setup_db.py

Running the Application

To start the application, run the following command:

streamlit run app.py

This will open a web interface where you can input natural language queries, and the application will convert them into SQL queries, execute them, and display the results.

Additional Improvements and Considerations

While the basic functionality is now in place, there are several improvements and additional features you could consider:

  1. Error Handling: Add more robust error handling to manage issues like unsupported file types or missing API keys.
  2. Advanced Query Capabilities: Enhance the app to handle more complex queries and different SQL operations.
  3. UI/UX Enhancements: Improve the user interface for better user experience, perhaps by adding more descriptive labels or progress indicators.
  4. Security Measures: Implement security measures to prevent SQL injection and other potential vulnerabilities.

Conclusion

Congratulations! You've built a complete end-to-end text-to-SQL query application using Generative AI. This project showcases how powerful AI can be in automating the translation of natural language into executable SQL queries, making it a valuable tool for interacting with databases.

If you have any questions or suggestions, feel free to leave a comment. Happy coding!

Comments

Popular posts from this blog

Exploring the Features of ChatLLM Teams by Abacus.AI

Step-by-Step Crew AI: Turn YouTube Videos into Blog Gems

How to Create Free API Keys for Google Gemini Models