The Anam LiveKit plugin enables you to add avatars to your LiveKit agent applications. Combine Anam’s avatar technology with any STT, LLM or TTS—including OpenAI Realtime, Gemini Live, or your own custom models—to create engaging AI experiences.

Demo

See the Anam + LiveKit integration in action with our onboarding assistant demo:
Anam LiveKit Demo - AI Onboarding Assistant

View Demo Source Code

Full source code for the onboarding assistant demo with Gemini vision and screen share analysis.

Installation

pip install livekit-plugins-anam

Quick Start

The simplest way to add an Anam avatar to your LiveKit agent:
import os
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import openai, anam

async def entrypoint(ctx: JobContext):
    await ctx.connect()
    
    # Create agent session with OpenAI Realtime
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(voice="alloy"),
    )

    # Configure Anam avatar
    avatar = anam.AvatarSession(
        persona_config=anam.PersonaConfig(
            name="Cara",
            avatarId="a49abb10-9a29-4099-b950-e68534742fb2",
        ),
        api_key=os.getenv("ANAM_API_KEY"),
    )
    
    # Start the avatar and agent
    await avatar.start(session, room=ctx.room)
    await session.start(
        agent=Agent(instructions="You are a helpful assistant."),
        room=ctx.room,
    )
    
    # Generate initial greeting
    session.generate_reply(instructions="Say hello to the user")

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

Architecture

The Anam LiveKit plugin acts as a visual layer for your AI agent:
User Input (Voice/Video)
  ↓
LiveKit Room (Real-time Communication)
  ↓
Your LLM (OpenAI, Gemini, Claude, etc.)
  ↓
Text Response → Anam Avatar (TTS + Video)
  ↓
User sees and hears the avatar

Bring Your Own LLM: Anam handles only the visual avatar. You choose the ears, intelligence, and voice—whether that’s Deepgram, ElevenLabs, Cartesia, OpenAI, Gemini, Claude, or a custom model trained on your data.

Configuration

Environment Variables

1. Get your API credentials

You’ll need credentials from at least three services:
Service            Where to get it
Anam               Anam Dashboard
LiveKit            LiveKit Cloud or self-hosted
Other providers    Deepgram, ElevenLabs, OpenAI, Google AI Studio, etc.
2. Set environment variables

Create a .env file with your credentials:
.env
# Anam credentials
ANAM_API_KEY=your_anam_api_key
ANAM_AVATAR_ID=your_avatar_id

# LiveKit credentials
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret

# Other credentials (choose your providers)
OPENAI_API_KEY=your_openai_api_key
# or
GEMINI_API_KEY=your_gemini_api_key
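
A missing credential usually surfaces later as an opaque connection error, so it can help to fail fast at startup. A minimal sketch using only the standard library; `missing_env` is an illustrative helper, and the variable list mirrors the .env above (adjust it to the providers you actually use):

```python
import os

# Variables the examples on this page rely on; extend with your
# LLM/STT/TTS provider keys (OPENAI_API_KEY, GEMINI_API_KEY, ...).
REQUIRED_VARS = ("ANAM_API_KEY", "LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")

def missing_env(required=REQUIRED_VARS):
    """Return the names of required environment variables that are unset or empty."""
    return [name for name in required if not os.getenv(name)]
```

You might call `missing_env()` at the top of your entrypoint and raise if it returns a non-empty list, rather than letting the agent start with incomplete credentials.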

PersonaConfig Options

Configure your avatar’s identity:
persona_config = anam.PersonaConfig(
    name="Maya",           # Display name for the avatar
    avatarId="uuid-here",  # Avatar appearance ID
)
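
Avatar IDs are UUIDs (as in the Quick Start example above), so a truncated or mis-pasted ID can be caught before the session starts. A small standard-library guard; `is_valid_avatar_id` is an illustrative helper, not part of the plugin:

```python
import uuid

def is_valid_avatar_id(avatar_id) -> bool:
    """True if avatar_id parses as a UUID, the format Anam avatar IDs use."""
    try:
        uuid.UUID(avatar_id)
        return True
    except (ValueError, TypeError, AttributeError):
        return False
```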


Advanced Examples

Using Gemini with Vision

This example shows how to use Gemini Live for multimodal conversations with screen share analysis:
import os
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.agents.voice import VoiceActivityVideoSampler, room_io
from livekit.plugins import anam, google

async def entrypoint(ctx: JobContext):
    await ctx.connect()
    
    # Gemini Live model with vision capabilities
    llm = google.realtime.RealtimeModel(
        model="gemini-2.0-flash-exp",
        api_key=os.getenv("GEMINI_API_KEY"),
        voice="Aoede",
        instructions="You are a helpful assistant that can see the user's screen.",
    )
    
    # Anam avatar
    avatar = anam.AvatarSession(
        persona_config=anam.PersonaConfig(
            name="Maya",
            avatarId=os.getenv("ANAM_AVATAR_ID"),
        ),
        api_key=os.getenv("ANAM_API_KEY"),
    )
    
    # Agent session with video sampling for screen analysis
    session = AgentSession(
        llm=llm,
        video_sampler=VoiceActivityVideoSampler(
            speaking_fps=0.2,  # 1 frame every 5 sec when speaking
            silent_fps=0.1,    # 1 frame every 10 sec when silent
        ),
    )
    
    await avatar.start(session, room=ctx.room)
    await session.start(
        agent=Agent(instructions="Help the user with what you see on their screen."),
        room=ctx.room,
        room_input_options=room_io.RoomInputOptions(video_enabled=True),
    )

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
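
The sampler rates above trade responsiveness for token cost: `fps` is frames per second, so the interval between sampled frames is its reciprocal. A quick sanity check (`frame_interval_seconds` is an illustrative helper, not a plugin function):

```python
def frame_interval_seconds(fps: float) -> float:
    """Seconds between sampled video frames for a given fps setting."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps

# speaking_fps=0.2 -> one frame every 5 seconds
# silent_fps=0.1  -> one frame every 10 seconds
```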

Adding Function Tools

Extend your agent with custom tools that can take actions:
from livekit.agents import function_tool

@function_tool
async def fill_form_field(field_name: str, value: str) -> str:
    """Fill in a form field on the user's screen.
    
    Args:
        field_name: The name of the field to fill
        value: The value to enter
    
    Returns:
        Confirmation message
    """
    # Your implementation here
    await send_command_to_frontend("fill_field", {"field": field_name, "value": value})
    return "Field filled successfully"

# Register the tool on the agent (tools are passed to Agent, not AgentSession)
session = AgentSession(llm=llm)
agent = Agent(
    instructions="You can fill form fields for the user.",
    tools=[fill_form_field],
)
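
The `send_command_to_frontend` call above is left to your implementation; one common pattern is to serialize the command as JSON and publish it to the room over LiveKit's data channel. A minimal sketch of the payload side (`build_command` and the `"agent-commands"` topic are assumptions for illustration, not plugin conventions):

```python
import json

def build_command(command: str, payload: dict) -> bytes:
    """Encode a frontend command as UTF-8 JSON, ready to publish to the room."""
    return json.dumps({"type": command, "payload": payload}).encode("utf-8")

# Inside the tool you might then send it over the room's data channel, e.g.:
#   await ctx.room.local_participant.publish_data(
#       build_command("fill_field", {"field": field_name, "value": value}),
#       topic="agent-commands",  # assumed topic name; match your frontend
#   )
```

Your frontend would subscribe to data messages, decode the JSON, and dispatch on the `type` field.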

Running Your Agent

Use the LiveKit CLI for local development:
python agent.py dev
This connects to your LiveKit server and automatically joins rooms when participants connect. For production, use the start subcommand, which runs without hot reload:
python agent.py start

Use Cases

The Anam + LiveKit combination is ideal for scenarios requiring voice interaction with visual presence:
  • Guide new hires through forms and processes with screen share analysis. The AI sees what they see and provides contextual help.
  • Help students with homework by seeing their work. The avatar can point out errors and explain concepts visually.
  • See customer screens and provide step-by-step guidance with a friendly visual presence.
  • Assist patients filling out medical forms with a calm, reassuring avatar presence.
  • Guide users through account opening, KYC processes, and complex financial forms.

Troubleshooting

LiveKit connection issues
  • Verify LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET are correct
  • Check that your LiveKit server is accessible
  • Ensure WebSocket connections aren’t blocked by a firewall
  • Test connectivity at meet.livekit.io

Avatar not appearing
  • Verify your ANAM_API_KEY is valid
  • Check that ANAM_AVATAR_ID matches an existing avatar
  • Review agent logs for Anam connection errors
  • Ensure the avatar session starts before the agent session

Agent not responding
  • Check your LLM API key is valid (OpenAI, Gemini, etc.)
  • Verify microphone permissions in the browser
  • Look for API errors in the agent logs
  • Confirm the agent is receiving audio tracks

Screen share not detected
  • Ensure you’re sharing the correct tab or window
  • Check browser permissions for screen sharing
  • Look for “Screen share track detected” in agent logs
  • Verify video_enabled=True in room input options

High latency
  • Check your network connection stability
  • Consider using LiveKit Cloud for optimized routing
  • Reduce video sampling frequency if CPU-bound
  • Monitor your LLM API response times
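
For connection problems, a first check is that LIVEKIT_URL is actually a WebSocket URL (wss:// for LiveKit Cloud). A small standard-library sanity check; `check_livekit_url` is an illustrative helper:

```python
from urllib.parse import urlparse

def check_livekit_url(url):
    """Return a list of problems found with a LiveKit server URL (empty if it looks OK)."""
    problems = []
    parsed = urlparse(url or "")
    if parsed.scheme not in ("ws", "wss"):
        problems.append(f"scheme should be ws:// or wss://, got {parsed.scheme!r}")
    if not parsed.hostname:
        problems.append("URL has no hostname")
    return problems
```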

API Reference

AvatarSession

The main class for integrating Anam avatars with LiveKit agents.
avatar = anam.AvatarSession(
    persona_config=anam.PersonaConfig(...),
    api_key="your_api_key",
    api_url="https://api.anam.ai",  # Optional: defaults to production
)
persona_config (PersonaConfig, required)
Configuration for the avatar’s identity and appearance.

api_key (string, required)
Your Anam API key from the dashboard.

api_url (string, default: "https://api.anam.ai")
Anam API endpoint. Override for staging or self-hosted deployments.

PersonaConfig

name (string, required)
Display name for the avatar. Used in logs and debugging.

avatarId (string, required)
UUID of the avatar to use. Get this from the Avatar Gallery or Anam Lab.

Methods

start()

Starts the avatar session and connects it to the LiveKit room.
await avatar.start(session, room=ctx.room)
session (AgentSession, required)
The LiveKit agent session to connect the avatar to.

room (rtc.Room, required)
The LiveKit room instance from the job context.

Resources

Support