Connect ElevenLabs Conversational AI agents to Anam avatars with a server-side integration. The Anam engine connects directly to ElevenLabs—your client only needs the Anam SDK. Key benefits:
  • Simpler client code (no audio bridging, microphone management, or speaker muting)
  • Reduced latency through server-to-server audio flow
  • Session recordings and transcripts available in Anam Lab
Looking for the client-side approach where you manage the audio pipeline in the browser? See Custom TTS (client-side).

Architecture

Client ──WebRTC──▶ Anam Engine ◀──WebSocket──▶ ElevenLabs
   ▲                    │                   (STT → LLM → TTS)
   │              Face generation
   │                    │
   └──── Video + Audio ─┘
1. Server fetches signed URL and session token. Your API route fetches an ElevenLabs signed URL using your API key, then requests an Anam session token with elevenLabsAgentSettings attached.
2. Engine connects to ElevenLabs. The Anam engine uses the signed URL to open a WebSocket to ElevenLabs and manages the full voice pipeline: speech-to-text, LLM reasoning, and text-to-speech.
3. Client streams avatar. The client creates an AnamClient with the session token and calls streamToVideoElement(). Mic audio goes to the engine over WebRTC; the avatar video and speech audio come back over the same connection.
4. No ElevenLabs SDK on the client. The only client dependency is @anam-ai/js-sdk.

Prerequisites

  • Node.js 18+
  • Anam account and API key
  • ElevenLabs account with a Conversational AI agent configured
ElevenLabs API key permissions: As of March 2026, after creating an API key in the ElevenLabs dashboard you need to edit it and explicitly grant write access to Conversational AI (ElevenLabs Agents). This permission is not enabled by default due to a bug in the ElevenLabs UI.

Best Practices for ElevenLabs Agent Configuration

Before writing code, configure your ElevenLabs agent for good performance with Anam:

Voice Settings

  • Use V3 Conversational as the TTS model for better expressivity
  • Enable Expressive mode on V3 voices
  • Add audio tags to system prompts for effects like laughter

Audio Configuration

  • Set user input audio format to PCM 16000Hz (other formats are not supported with Anam)
  • Enable Filter Background Speech in Advanced settings if background noise is problematic

Response Optimization

  • As of writing, Qwen3-30B-A3B performs well for low latency — check the ElevenLabs agent UI for current LLM options and their latency characteristics
  • Avoid reasoning models unless using high-throughput providers
  • Set Eagerness to “Eager” in the Advanced menu for the quickest responses
  • Configure soft timeouts (2 seconds) in Advanced settings with filler-phrase generation if responses lag

Server-Side Implementation

Environment Variables

.env
ANAM_API_KEY=your_anam_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
You’ll also need two IDs per persona:
| Variable | What it is | Where to find it |
| --- | --- | --- |
| agentId | Your ElevenLabs Agent ID | ElevenLabs dashboard → Agents → select your agent → copy the Agent ID |
| avatarId | An Anam avatar face ID (not a persona ID) | Avatar Gallery, or in Anam Lab click the three-dot menu on an avatar and then click the copy button |
The avatarId is specifically the face model ID, not an overall persona or agent ID. You’re pairing an Anam face with an ElevenLabs agent—the voice, LLM, and STT all come from ElevenLabs.
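If your app exposes several personas, it can help to keep each Anam face ID and ElevenLabs agent ID together in one place. This is an illustrative sketch only: the registry, persona names, and placeholder IDs below are hypothetical, not part of either API.

```typescript
// Hypothetical persona registry: each entry pairs an Anam avatar face ID
// with an ElevenLabs agent ID. Replace the placeholder IDs with your own.
type PersonaPair = {
  avatarId: string; // Anam avatar face ID (from the Avatar Gallery)
  agentId: string;  // ElevenLabs Agent ID (from the ElevenLabs dashboard)
};

const personas: Record<string, PersonaPair> = {
  support: { avatarId: "anam-avatar-id-1", agentId: "elevenlabs-agent-id-1" },
  sales: { avatarId: "anam-avatar-id-2", agentId: "elevenlabs-agent-id-2" },
};

// Look up the pair to send to your session-token route.
function getPersona(name: string): PersonaPair {
  const pair = personas[name];
  if (!pair) throw new Error(`Unknown persona: ${name}`);
  return pair;
}
```

The client would then post `getPersona("support")` to the API route shown below.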

API Route

Create a Next.js API route (or equivalent server endpoint) that fetches the ElevenLabs signed URL and creates an Anam session token:
app/api/anam-session/route.ts
import { NextResponse } from "next/server";

export async function POST(request: Request) {
  const anamApiKey = process.env.ANAM_API_KEY;
  if (!anamApiKey) {
    return NextResponse.json(
      { error: "ANAM_API_KEY must be set" },
      { status: 500 }
    );
  }

  const elevenLabsApiKey = process.env.ELEVENLABS_API_KEY;
  if (!elevenLabsApiKey) {
    return NextResponse.json(
      { error: "ELEVENLABS_API_KEY must be set" },
      { status: 500 }
    );
  }

  const body = await request.json().catch(() => ({}));
  const { avatarId, agentId } = body;

  if (!avatarId) {
    return NextResponse.json(
      { error: "avatarId is required" },
      { status: 400 }
    );
  }
  if (!agentId) {
    return NextResponse.json(
      { error: "agentId is required" },
      { status: 400 }
    );
  }

  // 1. Get a signed URL from ElevenLabs
  const elRes = await fetch(
    `https://api.elevenlabs.io/v1/convai/conversation/get-signed-url?agent_id=${agentId}`,
    {
      headers: { "xi-api-key": elevenLabsApiKey },
    }
  );

  if (!elRes.ok) {
    const text = await elRes.text();
    return NextResponse.json(
      { error: `ElevenLabs API error: ${elRes.status} ${text}` },
      { status: elRes.status }
    );
  }

  const { signed_url: signedUrl } = await elRes.json();

  // 2. Create an Anam session token with the ElevenLabs agent settings
  const anamRes = await fetch("https://api.anam.ai/v1/auth/session-token", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${anamApiKey}`,
    },
    body: JSON.stringify({
      personaConfig: { avatarId },
      environment: {
        elevenLabsAgentSettings: {
          signedUrl,
          agentId,
        },
      },
    }),
  });

  if (!anamRes.ok) {
    const text = await anamRes.text();
    return NextResponse.json(
      { error: `Anam API error: ${anamRes.status} ${text}` },
      { status: anamRes.status }
    );
  }

  const data = await anamRes.json();
  return NextResponse.json({ sessionToken: data.sessionToken });
}
The environment.elevenLabsAgentSettings field tells the Anam engine to connect to ElevenLabs instead of running Anam’s built-in STT/LLM/TTS pipeline.
Signed URLs expire in approximately 15 minutes. Create tokens immediately before client use rather than pre-fetching them.

Per-Session Customization

The Anam session token API accepts additional fields in elevenLabsAgentSettings that are passed through to ElevenLabs:
body: JSON.stringify({
  personaConfig: { avatarId },
  environment: {
    elevenLabsAgentSettings: {
      signedUrl,
      agentId,
      dynamicVariables: { ... },          // optional
      conversationConfigOverride: { ... }, // optional
      userId: "...",                       // optional
      customLlmExtraBody: { ... },        // optional
    },
  },
})
The dynamicVariables, conversationConfigOverride, and customLlmExtraBody fields each have a 10KB size limit.
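Since these three fields are each capped at 10KB, a server-side guard can reject oversized payloads before calling the Anam API. This is a sketch under an assumption: it treats the limit as 10 × 1024 bytes of serialized JSON, which may not match Anam's exact accounting.

```typescript
const MAX_FIELD_BYTES = 10 * 1024; // assumed interpretation of the 10KB limit

// Returns the names of any size-limited elevenLabsAgentSettings fields whose
// serialized JSON exceeds the cap, so the caller can fail fast with a 400.
function oversizedFields(settings: {
  dynamicVariables?: unknown;
  conversationConfigOverride?: unknown;
  customLlmExtraBody?: unknown;
}): string[] {
  const limited = [
    "dynamicVariables",
    "conversationConfigOverride",
    "customLlmExtraBody",
  ] as const;
  return limited.filter((key) => {
    const value = settings[key];
    if (value === undefined) return false;
    return Buffer.byteLength(JSON.stringify(value), "utf8") > MAX_FIELD_BYTES;
  });
}
```

In the API route above, a non-empty result could map to a 400 response naming the offending fields.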

Dynamic Variables

Define placeholders like {{user_name}} in your ElevenLabs agent system prompt, then populate them at runtime:
dynamicVariables: {
  user_name: "Alice",
  account_type: "premium",
}
See ElevenLabs dynamic variables documentation for complete syntax details.

Configuration Overrides

Modify per-conversation settings like first message, language, system prompt, or TTS voice:
conversationConfigOverride: {
  agent: {
    prompt: {
      prompt: "You are a helpful assistant. Always respond in Spanish.",
    },
    firstMessage: "¡Hola! ¿En qué puedo ayudarte hoy?",
    language: "es",
  },
}
Refer to ElevenLabs overrides documentation for the complete list of overridable fields.

User Identification

Pass userId for analytics tracking in ElevenLabs:
userId: "user_abc123"

Custom LLM Parameters

If your ElevenLabs agent uses a custom LLM backend, pass additional parameters:
customLlmExtraBody: {
  session_context: { region: "eu", tier: "enterprise" },
}

Client-Side Implementation

The client code is minimal—just fetch a session token and stream:
import { createClient } from "@anam-ai/js-sdk";

const res = await fetch("/api/anam-session", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ avatarId, agentId }),
});
const { sessionToken } = await res.json();

const client = createClient(sessionToken);
await client.streamToVideoElement("avatar-video");

React Component Example

"use client";

import { useRef, useState, useCallback } from "react";
import {
  AnamEvent,
  createClient,
  type AnamClient,
  type MessageStreamEvent,
} from "@anam-ai/js-sdk";

type Message = {
  id: string;
  role: "user" | "persona";
  content: string;
  interrupted?: boolean;
};

export default function AvatarChat({
  avatarId,
  agentId,
}: {
  avatarId: string;
  agentId: string;
}) {
  const clientRef = useRef<AnamClient | null>(null);
  const [status, setStatus] = useState<"idle" | "connecting" | "connected">("idle");
  const [messages, setMessages] = useState<Message[]>([]);

  const start = useCallback(async () => {
    setStatus("connecting");
    setMessages([]);

    const res = await fetch("/api/anam-session", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ avatarId, agentId }),
    });
    const { sessionToken } = await res.json();

    const anamClient = createClient(sessionToken);
    clientRef.current = anamClient;

    // Accumulate transcript chunks by message ID
    anamClient.addListener(
      AnamEvent.MESSAGE_STREAM_EVENT_RECEIVED,
      (evt: MessageStreamEvent) => {
        setMessages((prev) => {
          const idx = prev.findIndex((m) => m.id === evt.id);
          if (idx >= 0) {
            const next = [...prev];
            next[idx] = {
              ...next[idx],
              content: next[idx].content + evt.content,
              interrupted: evt.interrupted,
            };
            return next;
          }
          return [
            ...prev,
            {
              id: evt.id,
              role: evt.role as "user" | "persona",
              content: evt.content,
              interrupted: evt.interrupted,
            },
          ];
        });
      }
    );

    anamClient.addListener(AnamEvent.CONNECTION_CLOSED, () => {
      setStatus("idle");
    });

    await anamClient.streamToVideoElement("avatar-video");
    setStatus("connected");
  }, [avatarId, agentId]);

  const stop = useCallback(async () => {
    await clientRef.current?.stopStreaming();
    clientRef.current = null;
    setStatus("idle");
  }, []);

  return (
    <div>
      <video id="avatar-video" autoPlay playsInline />
      <button
        disabled={status === "connecting"}
        onClick={status === "connected" ? stop : start}
      >
        {status === "connecting" ? "Connecting..." : status === "connected" ? "Stop" : "Start"}
      </button>
    </div>
  );
}
The MESSAGE_STREAM_EVENT_RECEIVED event fires for each text chunk from both the user and the agent. Accumulate chunks by message ID to construct full transcripts. Each event also includes an endOfSpeech flag indicating when a message is complete.
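The same accumulation logic can live outside React. Here is a framework-free sketch of a transcript accumulator keyed by message ID; the chunk shape mirrors the MessageStreamEvent fields used above, with endOfSpeech marking completion (the class and field names here are illustrative, not SDK types).

```typescript
type StreamChunk = {
  id: string;
  role: "user" | "persona";
  content: string;
  endOfSpeech?: boolean;
  interrupted?: boolean;
};

type TranscriptEntry = {
  role: "user" | "persona";
  content: string;
  complete: boolean;
};

// Accumulates streamed text chunks into full per-message transcripts.
class TranscriptAccumulator {
  private entries = new Map<string, TranscriptEntry>();

  push(chunk: StreamChunk): void {
    const existing = this.entries.get(chunk.id);
    if (existing) {
      // Append this chunk and latch completion once endOfSpeech arrives.
      existing.content += chunk.content;
      existing.complete = existing.complete || !!chunk.endOfSpeech;
    } else {
      this.entries.set(chunk.id, {
        role: chunk.role,
        content: chunk.content,
        complete: !!chunk.endOfSpeech,
      });
    }
  }

  // Only messages whose endOfSpeech chunk has arrived.
  completed(): TranscriptEntry[] {
    return [...this.entries.values()].filter((e) => e.complete);
  }
}
```

Wiring it up would look like `anamClient.addListener(AnamEvent.MESSAGE_STREAM_EVENT_RECEIVED, (evt) => acc.push(evt))`.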

Feature Support

Works with server-side integration:
  • Voice intelligence (STT, LLM, TTS)
  • Expressive V3 voices
  • Interruption handling
  • Custom knowledge bases
  • Server-side tools (webhooks)
  • Conversation history
  • Session recordings and transcripts in Anam Lab
Not yet supported:

Troubleshooting

  • After creating an API key in the ElevenLabs dashboard, you must edit the key and grant write access to Conversational AI (ElevenLabs Agents). This permission is not enabled by default.
  • Go to ElevenLabs → API Keys → click the key → enable the Conversational AI permission → save.
  • Verify ELEVENLABS_API_KEY is valid and has Conversational AI permissions (see above)
  • Confirm the agentId exists in your ElevenLabs dashboard
  • Verify your ElevenLabs plan includes Conversational AI access
  • Signed URL may have expired—create token immediately before client needs it
  • Verify ElevenLabs agent is active (not paused) in the dashboard
  • Validate avatarId at lab.anam.ai/avatars
  • Ensure elevenLabsAgentSettings includes both signedUrl and agentId
  • The server-side integration handles audio format matching automatically
  • Check ElevenLabs agent configuration for supported voice models
  • Recordings generate after session ends—allow several minutes for processing
  • Verify session completed cleanly (client called stopStreaming() or normal connection closure)

Resources