How to Build an AI Avatar Morning Assistant (Full Pipecat Tutorial)

Link to the full video tutorial: https://youtu.be/9nQJTDeY6pQ?si=HKAaiCEUlv7Dvenn

Imagine waking up not to a buzzing alarm, but to a friendly face: a personalized AI assistant that greets you, reads out your calendar, checks your emails, and sends a summary to your phone via WhatsApp.

In this tutorial, we are going to take a basic voice agent and upgrade it into the ultimate morning assistant. We aren’t just building a chatbot; we are building an agent with function calling capabilities and a real-time video avatar.

If you want to jump straight into the code, you can find the full repository here: https://github.com/jb-akp/Pipecat-Twilio

The Tech Stack & Architecture

Before we code, it helps to understand the "pipeline" of data. We are using Pipecat, an open-source framework for building voice agents. Here is the flow of data:

Transport/RTVI: Captures your voice audio.

Deepgram: Transcribes your speech into text.

OpenAI (LLM): The "brain" that decides what to say or what tool to use.

Functions: Custom Python code (Calendar, Gmail, Twilio).

Cartesia: Generates high-quality text-to-speech audio (Sonic 3 model).

Tavus: Syncs that audio to a video avatar in real time.

Step 1: Project Setup

First, we need to clone the Pipecat quickstart and set up our environment. Open your terminal and run:

```shell
git clone https://github.com/pipecat-ai/pipecat-quickstart.git
cd pipecat-quickstart
```

We will be using uv to manage our dependencies. Run `uv sync` to install the base packages, then add the client libraries this tutorial needs with `uv add google-api-python-client google-auth-oauthlib twilio`.

You will need to create a `.env` file with API keys for each of the services we are using:

Deepgram (Transcription)

OpenAI (Intelligence)

Cartesia (Voice)

Tavus (Video Avatar)

Twilio (WhatsApp)

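For reference, the `.env` file ends up looking something like this (the Deepgram/OpenAI/Cartesia key names follow the Pipecat quickstart convention; double-check yours against the quickstart's example file):

```
DEEPGRAM_API_KEY=...
OPENAI_API_KEY=...
CARTESIA_API_KEY=...
TAVUS_API_KEY=...
TAVUS_REPLICA_ID=...
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
```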

Step 2: Integrating Google Calendar and Gmail

To make this assistant actually helpful, it needs access to our personal data. We will create a file called functions.py to handle all our external tools.

Google Cloud Setup

You’ll need to go to the Google Cloud Console, enable the Google Calendar API and Gmail API, and download your credentials.json file. Place this file in your project folder.

The Code

In functions.py, we write functions to authenticate with Google and fetch data. Here is the logic for fetching calendar events:

```python
import datetime
import json

from googleapiclient.discovery import build


async def get_calendar_events(params):
    # Pipecat passes the function-call details (arguments, callbacks) via `params`
    # Authenticate and build the service
    creds = get_google_credentials()
    service = build('calendar', 'v3', credentials=creds)

    # Get start and end of day
    now = datetime.datetime.utcnow()
    # ... (date formatting logic) ...

    # Call the API
    events_result = service.events().list(
        calendarId='primary', timeMin=start_of_day,
        timeMax=end_of_day, singleEvents=True,
        orderBy='startTime'
    ).execute()

    # Filter and return JSON
    events = events_result.get('items', [])
    # ... (filtering logic) ...
    return json.dumps(filtered_events)
```

We do the same for Gmail, fetching the last few emails to check for anything urgent.
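The raw Gmail API response is verbose, so it helps to trim each message down to the fields the LLM actually needs before returning it. A minimal, self-contained sketch (the `summarize_messages` helper and the sample data are illustrative, though the `payload.headers`/`snippet` shape matches what the Gmail API returns):

```python
import json


def summarize_messages(messages):
    """Reduce raw Gmail API message objects to LLM-friendly summaries."""
    summaries = []
    for msg in messages:
        # Gmail returns headers as a list of {"name": ..., "value": ...} pairs
        headers = {h["name"]: h["value"]
                   for h in msg.get("payload", {}).get("headers", [])}
        summaries.append({
            "from": headers.get("From", "unknown"),
            "subject": headers.get("Subject", "(no subject)"),
            "snippet": msg.get("snippet", ""),
        })
    return json.dumps(summaries)


# Example with a hand-written sample message
sample = [{
    "snippet": "Your invoice is attached...",
    "payload": {"headers": [
        {"name": "From", "value": "billing@example.com"},
        {"name": "Subject", "value": "Invoice #1042"},
    ]},
}]
print(summarize_messages(sample))
```

Returning a compact JSON string keeps the tool result small, which matters when the model has to read it back aloud.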

Step 3: Registering Tools in bot.py

Now we need to tell our OpenAI bot that these tools exist. In bot.py, we import our functions and create tool definitions that tell the model when it should call each tool.

```python
# In bot.py
from functions import get_calendar_events, get_gmail_emails

# Define the tools for the LLM
calendar_tool_definition = {
    "type": "function",
    "function": {
        "name": "get_calendar_events",
        "description": "Get calendar events for the current day. "
                       "Use this when the user asks about their schedule.",
        "parameters": {"type": "object", "properties": {}},
    },
}
# (define a matching tool definition for get_gmail_emails)

# Register the handlers with the LLM service
llm.register_function("get_calendar_events", get_calendar_events)
llm.register_function("get_gmail_emails", get_gmail_emails)
```

We also update the system prompt to ensure the AI knows its role: "You are a friendly AI assistant. Your goal is to manage the user's morning by checking their schedule and emails."
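In code, that system prompt is simply the first message in the LLM context; a minimal sketch (the exact wording is up to you):

```python
# The system message that seeds the conversation context
messages = [
    {
        "role": "system",
        "content": (
            "You are a friendly AI assistant. Your goal is to manage the "
            "user's morning by checking their schedule and emails."
        ),
    }
]
```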

Step 4: Adding WhatsApp Integration

It's great to hear your schedule, but you’ll want a written record for the day. We use Twilio to send a WhatsApp summary.

Sign up for Twilio and open the WhatsApp Sandbox.

Send the join code from your phone to the Twilio number.

Add your TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN to your .env file.

In functions.py, we add the function to send the message:

```python
import json
import os

from twilio.rest import Client

account_sid = os.getenv("TWILIO_ACCOUNT_SID")
auth_token = os.getenv("TWILIO_AUTH_TOKEN")


async def send_whatsapp_reminder(reminder_text):
    client = Client(account_sid, auth_token)

    # Twilio WhatsApp addresses use the "whatsapp:+1..." format
    message = client.messages.create(
        from_=twilio_whatsapp_number,
        body=reminder_text,
        to=recipient_number,
    )
    return json.dumps({"status": "sent", "sid": message.sid})
```

Now, when you ask the bot, "Send me a summary," it will trigger this function and text you immediately.
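The LLM composes the summary text itself before calling the tool, but if you want deterministic formatting you can build the message in code instead. A sketch with a hypothetical `build_morning_summary` helper (not part of the repo):

```python
def build_morning_summary(events, emails):
    """Format calendar events and an email count into a WhatsApp-ready message."""
    lines = ["Good morning! Here's your day:"]
    for event in events:
        lines.append(f"- {event['time']}: {event['title']}")
    if emails:
        lines.append(f"You also have {len(emails)} new email(s) to review.")
    return "\n".join(lines)


# Example with hand-written data matching the demo conversation
summary = build_morning_summary(
    events=[
        {"time": "10:00", "title": "Stand-up"},
        {"time": "13:00", "title": "Lunch with Sam"},
    ],
    emails=[{"subject": "Invoice #1042"}],
)
print(summary)
```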

Step 5: The Face of the Assistant (Tavus Video)

Finally, to take this from a voice bot to a video assistant, we integrate Tavus.

In bot.py, we switch the pipeline to support video output. We select a "Replica" (I used the 'Gloria' avatar) and configure the TavusVideoService.

```python
# In bot.py
tavus = TavusVideoService(
    api_key=os.getenv("TAVUS_API_KEY"),
    replica_id=os.getenv("TAVUS_REPLICA_ID"),
    session=session,
)

# Add Tavus to the pipeline after TTS
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator,
    llm,
    tts,
    tavus,  # The video generation layer
    transport.output(),
])
```

The Final Result

When we run uv run bot.py and connect via the browser, we get a fully interactive experience.

You: "What's on my calendar?"

AI (Checking Google): "You have a stand-up at 10 AM and lunch with Sam at 1 PM."

You: "Send me a summary."

AI (Using Twilio): "Sent! Check your WhatsApp."

This project demonstrates the power of combining LLMs with external APIs and real-time media generation.

If you enjoyed this tutorial, be sure to check out the video for the live demo and star the repository on GitHub if you found it useful!

Link to GitHub Repository - https://github.com/jb-akp/Pipecat-Twilio