Creating an End-to-End PDF Chat Application Using Google Generative AI

Hello everyone! Welcome to my blog, where we build an end-to-end project: a chat application that answers questions about multiple PDF documents using Google Generative AI. The application uses LangChain to tie the pipeline together and FAISS, the vector similarity search library developed by Facebook (now Meta) AI, to store and search the embeddings. Let's build it step by step from scratch.


Project Overview

The application lets users chat with multiple PDF documents. Users upload one or more PDFs; the application extracts their text, splits it into chunks, and converts the chunks into vector embeddings. The embeddings are stored locally in a FAISS index (a vector database would also work). When a user asks a question, the application retrieves the most relevant chunks and uses them to generate a detailed answer.
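
To make the retrieval idea concrete, here is a toy sketch of "embed, compare, pick the closest chunk". It is not what the application runs: the `embed` function below is a hypothetical bag-of-words stand-in for the Google embedding model, and the linear scan stands in for the FAISS index. Only the overall shape (vectors plus a similarity score) carries over.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "the invoice total is 420 euros",
    "shipping takes five business days",
    "refunds are processed within two weeks",
]
best = top_k("how long does shipping take", chunks, k=1)
print(best)
```

A real embedding model captures meaning rather than exact word overlap, which is why the application uses one instead of word counts.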

Tools and Libraries

  1. Streamlit: For the web interface.
  2. Google Generative AI: For generating embeddings and text responses.
  3. LangChain: For splitting text, building the prompt, and running the question-answering chain.
  4. FAISS: For storing and retrieving vector embeddings.
  5. PyPDF2: For reading PDF content.
  6. dotenv: For handling environment variables.

Step-by-Step Implementation

1. Setting Up the Environment

First, create a conda environment and activate it:

conda create -n pdf_chat_env python=3.10
conda activate pdf_chat_env

Then, create a requirements.txt file with the following content:

streamlit
google-generativeai
python-dotenv
langchain
PyPDF2
chromadb
faiss-cpu
langchain_google_genai

Install the dependencies:

pip install -r requirements.txt
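
The script in the next step reads the API key from the GOOGLE_API_KEY environment variable after calling load_dotenv(), so it expects either a .env file in the project folder containing a GOOGLE_API_KEY=... line or the variable set in your shell. A small startup check like the one below can make a missing key fail with a readable message. This is a sketch, and `require_env` is a hypothetical helper name, not part of any of the libraries above.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add a line like {name}=<your key> to a .env file"
        )
    return value
```

In the application you could call require_env("GOOGLE_API_KEY") right after load_dotenv() and pass the result to genai.configure().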

2. Setting Up the Project Structure

Create a file named multiple_chatpdf.py and add the following code:

import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
import os
from langchain_google_genai import GoogleGenerativeAIEmbeddings
import google.generativeai as genai
from langchain.vectorstores import FAISS
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains.question_answering import load_qa_chain
from langchain.prompts import PromptTemplate
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Function to extract text from PDFs
def get_pdf_text(pdf_docs):
    text = ""
    for pdf in pdf_docs:
        pdf_reader = PdfReader(pdf)
        for page in pdf_reader.pages:
            # Pages without extractable text (e.g. scanned images) may yield nothing
            text += page.extract_text() or ""
    return text

# Function to split text into chunks
def get_text_chunks(text):
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=1000)
    chunks = text_splitter.split_text(text)
    return chunks

# Function to generate vector store
def get_vector_store(text_chunks):
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    vector_store = FAISS.from_texts(text_chunks, embedding=embeddings)
    vector_store.save_local("faiss_index")

# Function to create conversational chain
def get_conversational_chain():
    prompt_template = """
    Answer the question in as much detail as possible using only the provided context. If the answer is not in
    the provided context, just say "answer is not available in the context"; do not provide a wrong answer.\n\n
    Context:\n{context}\n
    Question:\n{question}\n

    Answer:
    """

    model = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.3)
    prompt = PromptTemplate(template=prompt_template, input_variables=["context", "question"])
    chain = load_qa_chain(model, chain_type="stuff", prompt=prompt)

    return chain

# Function to handle user input
def user_input(user_question):
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
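    # Recent LangChain releases refuse to unpickle a saved index unless you opt in.
    # Only load index files you created yourself; drop the argument on older
    # releases that do not accept it.
    new_db = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)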
    docs = new_db.similarity_search(user_question)
    chain = get_conversational_chain()
    response = chain({"input_documents": docs, "question": user_question}, return_only_outputs=True)
    st.write("Reply: ", response["output_text"])

# Main function to create Streamlit app
def main():
    st.set_page_config("Chat PDF")
    st.header("Chat with PDF using Gemini💁")

    user_question = st.text_input("Ask a Question from the PDF Files")
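    if user_question:
        # The index folder only exists after at least one PDF has been processed
        if os.path.isdir("faiss_index"):
            user_input(user_question)
        else:
            st.warning("Upload and process at least one PDF first.")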

    with st.sidebar:
        st.title("Menu:")
        pdf_docs = st.file_uploader("Upload your PDF Files and Click on the Submit & Process Button", type="pdf", accept_multiple_files=True)
        if st.button("Submit & Process"):
            if not pdf_docs:
                # Building an index from zero chunks would fail
                st.warning("Please upload at least one PDF.")
            else:
                with st.spinner("Processing..."):
                    raw_text = get_pdf_text(pdf_docs)
                    text_chunks = get_text_chunks(raw_text)
                    get_vector_store(text_chunks)
                    st.success("Done")

if __name__ == "__main__":
    main()
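
A note on chain_type="stuff" in get_conversational_chain(): with this setting, the retrieved chunks are joined into a single context string and "stuffed" into one prompt, which is then sent to the model in a single call. The sketch below imitates that step in plain Python so you can see what the model receives. It is a simplified illustration, not LangChain's implementation, and `PROMPT` and `build_stuffed_prompt` are hypothetical names.

```python
PROMPT = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question:\n{question}\n\n"
    "Answer:"
)


def build_stuffed_prompt(chunks: list[str], question: str) -> str:
    """Join retrieved chunks into one context string and fill the template."""
    context = "\n\n".join(chunks)
    return PROMPT.format(context=context, question=question)


prompt = build_stuffed_prompt(
    ["Chunk one text.", "Chunk two text."],
    "What does chunk two say?",
)
print(prompt)
```

Because everything goes into one prompt, the retrieved chunks together must fit within the model's context window, which is worth keeping in mind when choosing the chunk size.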

Explanation of the Code

  1. Environment Setup:

    • Load environment variables using load_dotenv().
    • Configure the Google Generative AI client with the API key read from GOOGLE_API_KEY.
  2. PDF Text Extraction:

    • get_pdf_text(pdf_docs): Reads the content from PDF files.
  3. Text Chunking:

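    • get_text_chunks(text): Splits the extracted text into chunks of up to 10,000 characters with a 1,000-character overlap, so that content near a chunk boundary still appears with some surrounding context.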
  4. Vector Store:

    • get_vector_store(text_chunks): Converts text chunks into vector embeddings and stores them locally using FAISS.
  5. Conversational Chain:

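    • get_conversational_chain(): Sets up the Gemini chat model (temperature 0.3) and a prompt template that tells the model to answer only from the supplied context, then combines them into a "stuff" question-answering chain.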
  6. User Input Handling:

    • user_input(user_question): Loads the saved FAISS index, retrieves the chunks most similar to the question (similarity_search returns four by default), passes them to the chain, and displays the generated answer.
  7. Streamlit App:

    • main(): Defines the Streamlit interface for uploading PDFs, processing them, and asking questions.
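
The chunking step in item 3 can be pictured as a sliding window over the text. The sketch below is a deliberately simplified version with fixed-width windows. RecursiveCharacterTextSplitter is smarter: it prefers to break at paragraph, line, and word boundaries, so its chunks will not match this output exactly. `split_with_overlap` is a hypothetical helper written only for illustration.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Fixed-width sliding window; each chunk repeats the tail of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Notice that "d" and "g" each appear in two chunks. That repeated tail is the overlap, and it is what keeps a sentence that straddles a boundary from being cut off from its context.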

Running the Application

To start the application, run the following command:

streamlit run multiple_chatpdf.py

This will open a web interface where you can upload PDF files, process them, and interact with them by asking questions.

Conclusion

Congratulations! You've built a complete end-to-end application that allows users to chat with multiple PDF documents using Google Generative AI. This project showcases the power of combining various tools and libraries to create a functional and interactive application.

If you have any questions or suggestions, feel free to leave a comment. Happy coding!
