Crew AI & Google Gemini: Real-World AI Solutions at Work

 

Hello everyone! In this blog post, we'll implement multiple AI agents for a real-world use case using Crew AI and the Google Gemini model. Many examples rely on paid models; here we'll use Google Gemini, which offers a free tier that is sufficient for our needs.

Why Google Gemini?

By default, Crew AI agents expect an OpenAI API key, which is a paid service. For those who prefer a free option, Google Gemini is a good alternative. At the time of writing, Gemini's free tier allowed 60 queries per minute, which is enough for many use cases.

Components of Crew AI

Crew AI consists of three main components:

  1. Agents: These are AI workers with specific roles and expertise, such as a data scientist, content writer, or researcher.
  2. Tasks: Each agent is assigned specific tasks related to their role.
  3. Tools: Tools are third-party APIs or internal tools that agents use to complete their tasks efficiently.
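
To make the relationship between these three components concrete, here is a minimal conceptual sketch in plain Python. The SketchTool, SketchAgent, and SketchTask classes are hypothetical stand-ins invented for illustration; they are not Crew AI's own classes, which carry many more options.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SketchTool:
    """A named capability an agent can call (for example, a web search)."""
    name: str
    run: Callable[[str], str]


@dataclass
class SketchAgent:
    """A role with a goal and the tools it is allowed to use."""
    role: str
    goal: str
    tools: List[SketchTool] = field(default_factory=list)


@dataclass
class SketchTask:
    """A unit of work assigned to exactly one agent."""
    description: str
    agent: SketchAgent


# A fake search tool that only echoes the query; no network access.
search = SketchTool(name="search", run=lambda query: f"results for {query}")
researcher = SketchAgent(role="Researcher", goal="Find trends", tools=[search])
task = SketchTask(description="Research AI trends", agent=researcher)

print(task.agent.role, "uses", [t.name for t in task.agent.tools])
```

The point is only the shape: a task points at an agent, and an agent carries the tools it may use.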

Setting Up the Environment

First, let's set up our coding environment. We'll create a virtual environment and install the necessary libraries.

Step 1: Setting Up the Virtual Environment


Before diving into the code, it's crucial to set up a virtual environment. This ensures that all dependencies are managed correctly and do not interfere with other projects.

1.1 Installing Anaconda

If you haven't installed Anaconda yet, download and install it from Anaconda's official website.

1.2 Creating and Activating the Virtual Environment

  1. Activate Conda: Open your terminal or command prompt and ensure Conda is activated.
  2. Create a Virtual Environment:
    conda create -p venv python=3.10
  3. Activate the Virtual Environment (the -p flag creates it at a path, so activate it by path):
    conda activate ./venv


Step 2: Installing Dependencies

Create a requirements.txt file with the following content:

crewai
langchain_google_genai
python-dotenv
crewai_tools

Then, install these dependencies:

pip install -r requirements.txt

Step 3: Setting Up Environment Variables

Create an .env file to store your API keys:

GOOGLE_API_KEY=your_google_api_key
SERPER_API_KEY=your_serper_api_key

Implementing the AI Agents

We'll create two main components: agents and tasks. The agents will be responsible for specific roles, and tasks will define what each agent needs to do.


Step 4: Creating Agents

Create a file named agents.py and define our agents:

import os

from crewai import Agent
from dotenv import load_dotenv
from langchain_google_genai import ChatGoogleGenerativeAI

from tools import tool

# Load variables from the .env file before reading the API key
load_dotenv()

# Securely loading API key from environment variables
google_api_key = os.getenv("GOOGLE_API_KEY")

# Initializing the Google Gemini LLM with bespoke parameters
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    verbose=True,
    temperature=0.5,
    google_api_key=google_api_key
)

# Crafting a Senior Researcher agent with a rich backstory and sharp focus
news_researcher = Agent(
    role="Senior Researcher",
    goal='Uncover groundbreaking technologies in {topic}',
    verbose=True,
    memory=True,
    backstory=(
        "As a beacon of knowledge, this agent's zeal for discovery knows no bounds."
        " With a keen eye for innovation, it scours the horizon for technological marvels,"
        " ready to illuminate the world with its findings."
    ),
    tools=[tool],
    llm=llm,
    allow_delegation=True
)

# Bringing to life a Writer agent, a storyteller at heart with a mission
news_writer = Agent(
    role='Writer',
    goal='Narrate compelling tech stories about {topic}',
    verbose=True,
    memory=True,
    backstory=(
        "Armed with a pen and a passion for the intricate dance of words, this agent"
        " weaves tales from the threads of complexity. It's on a quest to demystify the tech realm,"
        " turning jargon into journeys, and insights into narratives."
    ),
    tools=[tool],
    llm=llm,
    allow_delegation=False
)

Explanation:

Here's a breakdown of each part of the code:

  1. Importing Necessary Modules: The code begins by importing the required modules. Agent is imported from crewai, the framework for creating and managing AI agents. tool is imported from our own tools module (defined in Step 5) and is the search tool the agents will use. load_dotenv is a function from the python-dotenv package that loads environment variables from a .env file, a standard way to keep configuration and secrets out of source code. ChatGoogleGenerativeAI is imported from langchain_google_genai, the LangChain integration for Google's Gemini models.

  2. Loading Environment Variables: The load_dotenv() function call loads the environment variables, which in this case include the GOOGLE_API_KEY. This key is necessary to authenticate and interact with Google's API for the Gemini language model.

  3. Initializing the Language Model: The ChatGoogleGenerativeAI class is instantiated with specific parameters such as the model type (gemini-1.5-flash), verbosity, temperature (which controls the randomness of the generated text), and the GOOGLE_API_KEY.

  4. Creating the Senior Researcher Agent: An instance of Agent is created with the role of "Senior Researcher". This agent's goal is to uncover groundbreaking technologies in a given topic. It is set to be verbose (meaning it prints detailed logs of what it is doing), has memory enabled (which allows it to retain information from earlier interactions), and has a backstory that describes its purpose and motivation. The agent is also given tools and the Google Gemini language model (llm) to work with, and is allowed to delegate tasks (allow_delegation=True), so it can hand work to other agents in the crew.

  5. Creating the Writer Agent: Another Agent instance is created with the role of "Writer". This agent's goal is to narrate compelling tech stories about a given topic. Similar to the researcher, this agent is verbose, has memory, and has a backstory that emphasizes its narrative and educational focus. The writer is also equipped with tools and the llm, but it is not allowed to delegate tasks (allow_delegation=False), so it works independently.


Overall, the code sets up two AI agents with distinct roles and capabilities, leveraging the Google Gemini language model to perform tasks related to content creation, such as researching and writing. The agents can work together or separately, depending on the use case, to automate the process of turning research on a topic into a written article or blog post.
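
Both goals contain a {topic} placeholder that is filled in later, when the crew is started with an inputs dictionary. The sketch below illustrates the idea with Python's built-in str.format; the fill_placeholders helper is a hypothetical name, and Crew AI performs its own substitution internally, so treat this as an analogy rather than the library's implementation.

```python
def fill_placeholders(template: str, inputs: dict) -> str:
    """Replace {name} placeholders in the template with values from inputs."""
    return template.format(**inputs)


goal = "Uncover groundbreaking technologies in {topic}"
filled = fill_placeholders(goal, {"topic": "ChatGPT and Generative AI"})
print(filled)
# Uncover groundbreaking technologies in ChatGPT and Generative AI
```

Because the topic is supplied at run time, the same agent definitions can be reused for any subject.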

Step 5: Defining Tools

Create a file named tools.py and define the necessary tools:

# Import the load_dotenv function from the dotenv package.
# This function will be used to load environment variables from a .env file.
from dotenv import load_dotenv

# Call the load_dotenv function, which looks for a .env file in the current directory
# and loads the environment variables found there.
load_dotenv()

# Import the os module, which provides a way to interact with the operating system.
import os

# Set the 'SERPER_API_KEY' environment variable.
# Its value comes from the .env file loaded earlier,
# so the key never has to be hard-coded in the source.
os.environ['SERPER_API_KEY'] = os.getenv('SERPER_API_KEY')

# Import the SerperDevTool class from the crewai_tools module.
# This class performs internet searches
# using the Serper.dev API service.
from crewai_tools import SerperDevTool

# Create an instance of the SerperDevTool.
# This tool will be used by the AI agents to perform internet searches
# and gather information as part of their tasks.
tool = SerperDevTool()

Explanation:

This code snippet sets up the search tool used in this project. The tool is SerperDevTool, which lets AI agents perform internet searches through the Serper.dev API. The key steps are:

  1. Loading Environment Variables: The load_dotenv() function is called to load the environment variables from a .env file. .env files are used to store configuration settings and secrets, such as API keys, which should not be hard-coded into the source code for security reasons.

  2. Setting the 'SERPER_API_KEY': The SERPER_API_KEY environment variable is explicitly set in the os.environ dictionary using the value obtained from the .env file. This is done to ensure that the Serper API key is available in the environment for the SerperDevTool to use.

  3. Initializing the SerperDevTool: The SerperDevTool is initialized and stored in the variable tool. SerperDevTool is a class provided by the crewai_tools package that performs internet searches, and it reads SERPER_API_KEY from the environment to authenticate requests to the Serper.dev API.

In the context of Crew AI agents, this tool would then be passed to the agents so they can use it to perform internet searches as part of their tasks, such as researching content or gathering data for analysis.
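
One caveat with assigning os.getenv(...) straight into os.environ: if the key is missing from the .env file, os.getenv returns None and the assignment raises a TypeError, because os.environ only accepts strings. A small guard gives a clearer error message. The require_env helper and the DEMO_API_KEY variable below are hypothetical names used only for this sketch.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Demo with a throwaway variable name so no real key is needed.
os.environ["DEMO_API_KEY"] = "abc123"
print(require_env("DEMO_API_KEY"))
```

In tools.py, the same pattern could be applied to SERPER_API_KEY after load_dotenv() has run.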


Step 6: Setting Up Tasks

Create a file named tasks.py and define the tasks for each agent:

# Import the Task class from the crewai module, which is used to define tasks for the agents.
from crewai import Task
# Import the tool that was previously set up, which the agents will use to perform their tasks.
from tools import tool
# Import the news_researcher and news_writer agents that were previously created.
from agents import news_researcher, news_writer

# Define a research task for the news_researcher agent.
# The task involves identifying trends, analyzing pros and cons, and providing a detailed report.
research_task = Task(
  description=(
    "Identify the next big trend in {topic}. "
    "Focus on identifying pros and cons and the overall narrative. "
    "Your final report should clearly articulate the key points, "
    "its market opportunities, and potential risks."
  ),
  # Specify the expected output format for the task - a comprehensive report consisting of three paragraphs.
  expected_output='A comprehensive 3 paragraphs long report on the latest AI trends.',
  # Assign the previously initialized tool to be used for this task.
  tools=[tool],
  # Assign the news_researcher agent to carry out this task.
  agent=news_researcher,
)

# Define a writing task for the news_writer agent.
# The task involves composing an article that is insightful, easy to understand, and engaging.
write_task = Task(
  description=(
    "Compose an insightful article on {topic}. "
    "Focus on the latest trends and how it's impacting the industry. "
    "This article should be easy to understand, engaging, and positive."
  ),
  # Specify the expected output format for the task - a four-paragraph article formatted in markdown.
  expected_output='A 4 paragraph article on {topic} advancements formatted as markdown.',
  # Assign the same tool to be used for this writing task.
  tools=[tool],
  # Assign the news_writer agent to carry out this task.
  agent=news_writer,
  # Run this task synchronously (False is also the default).
  async_execution=False,
  # Specify the filename where the output of the task will be saved.
  output_file='new-blog-post.md'
)

Explanation:

This code snippet is designed to define tasks for two AI agents – a researcher and a writer – within the Crew AI framework. These tasks are encapsulated within Task objects, which contain details about the task description, the expected output, the tools to be used, and the agent responsible for the task.

  1. Research Task: A Task is created for the news_researcher agent, specifying that it should identify and analyze a new trend within a given topic. The task description outlines the focus areas, including the trend's pros and cons, market opportunities, and potential risks. The expected output is a structured three-paragraph report on the identified trends.

  2. Writing Task: Another Task is created for the news_writer agent, focusing on composing an engaging and informative article on a specified topic. The article should cover the latest trends and their impact on the industry, formatted in markdown as a four-paragraph piece. This task is set to execute synchronously, meaning it will run in the foreground and wait for completion before moving on to other tasks. The output of this task will be saved to a file named new-blog-post.md.

By defining these tasks, each agent has a clear objective and output goal, which allows them to autonomously execute their functions while leveraging the specified tools and resources. The structured approach helps in managing the workflow of content creation, from research to writing, within the Crew AI ecosystem.
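
A detail worth knowing about the multi-line descriptions: Python joins adjacent string literals with no separator, so any space between sentences has to be written inside one of the fragments. A minimal sketch of the pitfall, using shortened example sentences:

```python
# Adjacent literals are concatenated exactly as written.
joined = (
    "Identify the next big trend."
    "Focus on pros and cons."
)
print(joined)
# Identify the next big trend.Focus on pros and cons.

# A trailing space inside the first fragment keeps the sentences apart.
spaced = (
    "Identify the next big trend. "
    "Focus on pros and cons."
)
print(spaced)
# Identify the next big trend. Focus on pros and cons.
```

The model will usually cope with run-together sentences, but a cleanly spaced prompt is easier to read in the verbose logs.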


Step 7: Executing the Workflow

Create a file named crew.py to execute the workflow:

# Import Crew and Process classes from the crewai module.
# Crew is used to form a group of agents working together, and Process defines the task execution flow.
from crewai import Crew, Process

# Import the previously defined tasks for research and writing.
from tasks import research_task, write_task

# Import the previously created agent instances for research and writing.
from agents import news_researcher, news_writer

# Create a Crew instance with the research and writer agents.
# This Crew will focus on tech-related tasks and is configured to execute tasks sequentially.
crew = Crew(
    # Assign the list of agents to the Crew; these agents will collaborate on assigned tasks.
    agents=[news_researcher, news_writer],
    # Assign the list of tasks to the Crew; these are the tasks the agents will work on.
    tasks=[research_task, write_task],
    # Set the task execution process to be sequential, meaning one task will be completed before the next starts.
    process=Process.sequential,
)

# Start the process of the Crew with a specific input for the topic.
# In this case, the topic is 'ChatGPT and Generative AI', which will be the focus of the research and writing tasks.
result = crew.kickoff(inputs={'topic': 'ChatGPT and Generative AI'})

# Print the result of the task execution process to review the outcomes.
print(result)

Explanation:

In this code snippet, we're bringing together the components of a Crew AI system to execute specific tasks related to 'ChatGPT and Generative AI'. The Crew class is used to define a group of agents that will work together, and the Process class determines how the tasks will be executed within the crew.

  1. Creating the Crew: A Crew instance is formed by grouping the news_researcher and news_writer agents together. The Crew is also assigned a set of tasks (research_task and write_task) that the agents need to complete. The process is set to Process.sequential, which means that the tasks will be executed one after the other, in the order they were added to the crew.

  2. Executing the Tasks: The kickoff method is called on the crew instance to start the execution of the tasks. The method is provided with an inputs dictionary that specifies the topic for the tasks, which in this case is 'ChatGPT and Generative AI'. This input fills the {topic} placeholder in the agents' goals and the task descriptions, and guides their research and writing.

  3. Output and Feedback: The result of the kickoff method, which contains the outcomes of the executed tasks, is stored in the result variable and then printed to the console. This allows for immediate feedback on the work done by the crew, including the quality and content of the research report and the written article.

Overall, this code demonstrates how to orchestrate the work of multiple AI agents within the Crew AI framework to efficiently complete a set of related tasks with a common focus.
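
The idea behind a sequential process can be sketched in a few lines of plain Python: each step receives the previous step's output. This is a conceptual model only; run_sequential, research, and write are hypothetical stand-ins and do not reflect how Crew AI implements Process.sequential internally.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], topic: str) -> str:
    """Run steps in order, passing each step's output to the next one."""
    context = topic
    for step in steps:
        context = step(context)
    return context


def research(topic: str) -> str:
    # Stand-in for the research task: returns canned notes.
    return f"notes on {topic}"


def write(notes: str) -> str:
    # Stand-in for the writing task: wraps the notes in an article.
    return f"article based on {notes}"


final = run_sequential([research, write], "ChatGPT and Generative AI")
print(final)
# article based on notes on ChatGPT and Generative AI
```

The order of the list matters here just as the order of tasks matters in the crew: the writer can only build on research that has already finished.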


Step 8: Running the Implementation

Execute the crew.py script to see the agents in action:

python crew.py

The script initializes the agents, assigns tasks, and uses the Google Gemini model to perform the required actions. The sequential process ensures that the researcher's findings are passed to the writer, who then compiles a comprehensive blog post.
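
Once the run finishes, the article should be in new-blog-post.md, the output_file set on the writing task. A small helper for a quick look at the result might look like the sketch below; the preview function is a hypothetical convenience, not part of Crew AI.

```python
from pathlib import Path


def preview(path: str, max_lines: int = 5) -> str:
    """Return the first few lines of a text file, or a notice if it is missing."""
    file = Path(path)
    if not file.exists():
        return f"{path} not found; did the run finish?"
    lines = file.read_text(encoding="utf-8").splitlines()
    return "\n".join(lines[:max_lines])


print(preview("new-blog-post.md"))
```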


Conclusion

With Crew AI and Google Gemini, we can automate complex tasks involving multiple AI agents. This approach saves time and helps keep the generated content consistent, although the output should still be reviewed for accuracy. Whether you're creating blog posts, researching the latest trends, or generating insightful articles, Crew AI and Google Gemini make the process more efficient.

Stay tuned for more tutorials and advanced use cases using Crew AI and various AI models. Happy coding!

