Changelog - Anam

Meetings, API, Lab

Invite personas to Google Meet, Zoom, and Microsoft Teams calls

Spotlight: Invite personas to your meetings

Personas can now join Google Meet, Zoom, and Microsoft Teams calls as participants. Create an invite with a meeting URL and a persona via the new Meetings API, and the persona joins the call, either immediately or at a scheduled time up to 7 days ahead. Scheduling at least 10 minutes in advance reserves guaranteed capacity for the join, which is the most reliable way to get a persona into a planned meeting.

In group calls the persona joins silently and only responds when addressed by its display name; for 1:1 calls it can greet on join and respond to everything. You choose the region the persona joins from (eu, us-east, or us-west), with a strict mode for data-residency requirements. Once the persona is in the call, the invite links to a regular session, so transcripts and session details work through the existing Sessions API. Meeting participants always see that the persona is an AI: display names carry an “(AI)” suffix and the persona’s video includes a persistent AI-disclosure treatment.

Read the Meetings guide to get started.
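As a rough sketch of the scheduling rules above, the helper below builds a request body for POST /v1/meetings/invites and reports whether the lead time is long enough to reserve capacity. The field names (meetingUrl, personaId, region, scheduledAt) and the helper itself are assumptions for illustration, not the documented schema; check the API reference for the real request shape.

```python
from datetime import datetime, timedelta, timezone

MAX_LEAD = timedelta(days=7)             # invites can be scheduled up to 7 days ahead
RESERVED_LEAD = timedelta(minutes=10)    # 10+ minutes ahead reserves capacity
REGIONS = {"eu", "us-east", "us-west"}   # regions named in the release note


def build_invite(meeting_url, persona_id, region, now, scheduled_at=None):
    """Return (payload, capacity_reserved) for a hypothetical invite request."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    # Field names are illustrative assumptions, not the documented schema.
    payload = {"meetingUrl": meeting_url, "personaId": persona_id, "region": region}
    if scheduled_at is None:
        return payload, False  # immediate join: nothing is reserved in advance
    lead = scheduled_at - now
    if lead < timedelta(0) or lead > MAX_LEAD:
        raise ValueError("scheduled_at must fall within the next 7 days")
    payload["scheduledAt"] = scheduled_at.isoformat()
    return payload, lead >= RESERVED_LEAD
```

Passing `now` explicitly keeps the helper deterministic; in real use it would be `datetime.now(timezone.utc)`.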

Lab Changes

Improvements

Invite from the Build page: Published personas have a new Invite action that adds them to a meeting directly from the Lab.

SDK/API Changes

Improvements

Meetings API: New /v1/meetings/invites endpoints to create, list, get, and cancel meeting invites, gated by a new meetings API key permission scope. See the API reference.

Lab, Persona, SDK, API

Faster persona session connections, clearer billing, team management, and broader API controls

Spotlight: Faster time to first frame for persona sessions

Persona sessions now connect much faster, with major improvements to time to first frame. We reduced work on the session-start path, improved WebRTC/TURN connection handling, warmed connection paths earlier, and made startup more resilient to transient failures.

The improvement is already visible in production: US persona sessions are now seeing p50 time to first frame below 1 second, with p95 below 3 seconds.

Lab Changes

Improvements

Team management: Added a dedicated Team page for invites, member lists, roles, removals, and pending invitations.
Billing visibility: Billing is now a persistent nav item, with plan/usage status in the Lab and upcoming-invoice previews on the subscription page.
Build and session performance: Build page data now loads in parallel, session-token and session-start paths do less blocking work, and connection warmups/regional routing reduce time to playable sessions.

Fixes

Knowledge Library downloads: Restored open and download actions for files in the Knowledge Library modal.
Microphone controls: Fixed mic discovery before first session and added microphone selection to public share-link sessions.
Safari and iOS compatibility: Fixed Safari voice-clone recording previews, iOS fullscreen errors, and older Safari microphone-permission checks.
Security hardening: Tightened server-side URL fetching, resource ownership checks, share-link protections, upload quota validation, pagination limits, anonymous metrics limits, and auth session handling.

Persona Changes

Improvements

Mid-session config updates: Live sessions can now receive language, voice generation, and voice detection updates over the data channel without reconnecting.
Connection reliability: Cloudflare TURN is now the default path with fallback, WebRTC forwarding has safer drain/flow handling, and regional Deepgram URLs improve transcription routing.

Fixes

Turn-taking stability: Fixed rapid-interrupt edge cases, duplicate/missing LiveKit end-of-turn signals, and avatar freezes during interrupt transitions.
Silence and end-call behavior: Silence-breaker prompts now stay in the conversation language and no longer accidentally trigger the end_call tool.
Tool-call resilience: Personas recover more gracefully from malformed model streams or hallucinated tool calls before failing over.
Session recordings: Recording now waits for the primary WebRTC connection before starting, reducing missing recordings caused by early recorder connection attempts.

SDK/API Changes

Improvements

RTC configuration control: JavaScript SDK v4.15.0 adds rtcConfiguration, including support for forcing TURN relay with iceTransportPolicy: "relay".
Session start retries: JavaScript SDK v4.14.0 retries startSession on transient failures.
Direct session start: POST /v1/engine/session can now accept a raw API key with session config in the request body for server-side SDKs and backend integrations.
Widget config updates: PUT /v1/personas/{id} can update embed widget settings such as call-to-action text and allowed origins.
Developer guidance: Docs now cover mid-session updates, rtcConfiguration, TURN-relay forcing, and registering event/tool handlers before starting a session.
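Backend integrations that call POST /v1/engine/session directly may want a similar retry around their own request. The sketch below shows one common pattern, exponential backoff on transient failures. The TransientError type, the attempt count, and the delays are assumptions and do not describe the SDK's actual retry policy.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx, ...)."""


def with_retries(start, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call start() until it succeeds, backing off between transient failures."""
    for attempt in range(attempts):
        try:
            return start()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, then 1s, then 2s, ...
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting.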

Fixes

LiveKit compatibility: Restored field-based persona config inference so existing LiveKit plugin payloads continue to create sessions.
API-managed personas: API-created and API-updated personas are no longer cleaned up by stale Lab draft cleanup.
Persona ownership validation: API create/update and draft/publish paths now enforce organization scoping for referenced avatars, voices, and LLMs.
Webhook and external URL safety: Webhook tool URLs, avatar image URLs, custom LLM URLs, and avatar-source fetches now use stricter public-URL validation.

Lab, Persona, API

A refreshed Lab experience, better session controls, and cleaner voice/session workflows

Spotlight: Refreshed Lab and better session controls

The Lab has a refreshed interface across Build, Personas, Sessions, Dashboard, and API Keys, with cleaner navigation, updated controls, improved tables, and a more consistent design system throughout the product.

Builders also have more predictable control over session behavior. Silence prompts and automatic session endings can now be disabled by setting their timeout values to 0, and the confirmed end-call flow records clearer end reasons in session reports.
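For example, a persona config that turns both behaviors off might include the fragment below. The two field names, silenceBeforeSkipTurnSeconds and silenceBeforeSessionEndSeconds, come from this changelog, and voiceDetectionOptions is where the Configurable Persona Responsiveness entry places them; the helper around them is only a sketch.

```python
def silence_options(prompt_after=None, end_after=None):
    """Build a voiceDetectionOptions fragment; 0 disables the behavior."""
    options = {}
    if prompt_after is not None:
        options["silenceBeforeSkipTurnSeconds"] = prompt_after
    if end_after is not None:
        options["silenceBeforeSessionEndSeconds"] = end_after
    if any(value < 0 for value in options.values()):
        raise ValueError("timeouts must be 0 (disabled) or positive")
    return {"voiceDetectionOptions": options}


# Disable both silence prompts and automatic session endings.
config_fragment = silence_options(prompt_after=0, end_after=0)
```

Leaving an argument as `None` omits the field, so the platform default would apply.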

Lab Changes

Improvements

Lab refresh: Updated the main Lab experience across Build, Personas, Sessions, Dashboard, and API Keys with refreshed navigation, controls, tables, and styling.
Custom LLM editing: The Build page now supports full custom LLM editing, including URL, format, model/deployment/API version, reasoning settings, description, and safe API key rotation.
Sessions History: Added server-side search, API key/date filters, filtered pagination, and more reliable CSV exports.
Voice discovery: Voice language filters now reflect supported Cartesia Sonic 3.5 and ElevenLabs Flash v2.5 languages, with improved locale/accent matching and neutral localized sample text.

Fixes

Persona autosave: Added timeouts and retries so personas no longer get stuck in “Saving” and block publishing indefinitely.
Avatar player layout: The avatar player now responds better to limited vertical space so video and config controls remain visible.
Knowledge Library names: Long folder and file names now truncate more reliably so counts and action buttons stay reachable.

Persona Changes

Improvements

Session controls: silenceBeforeSkipTurnSeconds: 0 and silenceBeforeSessionEndSeconds: 0 now disable silence prompts and automatic session endings.
End-call flow: Added a confirmed end_call flow with standardized close messages and end_reason / end_message in session reports.
Multilingual transcription: Deepgram Flux now defaults to the multilingual model with supported language hints.
Conversation context: Increased the default agentic LLM message history from 8 to 20 messages for more coherent longer conversations.

Fixes

End-call reliability: Added guards against repeated end_call loops and incorrect turn-finished events.
Reasoning text safety: Leaked <think> blocks are scrubbed from spoken output and message history while preserved for reasoning aggregation.
ElevenLabs transcripts: Fixed missing spaces when rebuilding assistant transcripts from ElevenLabs alignment chunks.
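To illustrate the idea behind the reasoning-text fix, here is one way to separate <think> blocks from the text that should be spoken. This is a sketch using a simple regular expression, not Anam's implementation, and it assumes well-formed, non-nested tags.

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (spoken_text, reasoning_chunks) for a model response."""
    reasoning = [chunk.strip() for chunk in THINK_BLOCK.findall(text)]
    spoken = THINK_BLOCK.sub(" ", text)
    spoken = " ".join(spoken.split())  # collapse whitespace left by removed blocks
    return spoken, reasoning
```

The spoken text would go to TTS and message history, while the reasoning chunks stay available for aggregation.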

SDK/API Changes

Improvements

Voice detection options: API validation and Swagger now document the new 0 disable behavior and expanded timeout ranges for silence controls.
Avatar API reference: Clarified avatar media fields, including signed idling preview URLs and the current public avatar model mapping.

Fixes

Validation consistency: Session-token and Lab persona validation now agree on silence timeout ranges and descriptions.
SDK logging: JavaScript SDK log wording has been cleaned up for clearer developer diagnostics.

Cara 4, Lab, Persona, API

Cara 4 early access, higher-quality voice cloning, and smoother Knowledge/API workflows

Spotlight: Cara 4 early access

Cara 4 is now available in early access for enabled organizations. It brings higher-resolution avatar output, stronger expressivity, and improved custom avatar creation for teams testing the next generation of Anam avatars.

Once access is enabled for an organization, builders can select Cara 4 (Latest) in the Lab or set avatarModel: "cara-4-latest" when creating a persona or session token via the API.
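A request body that opts in to Cara 4 might look like the sketch below. Only avatarModel: "cara-4-latest" comes from this release; the personaConfig wrapper, the other field names, and the ID values are placeholders assumed for illustration.

```python
import json


def persona_config(avatar_id, voice_id, avatar_model=None):
    """Assemble a persona config dict; avatarModel is included only if set."""
    config = {"avatarId": avatar_id, "voiceId": voice_id}  # placeholder fields
    if avatar_model is not None:
        config["avatarModel"] = avatar_model
    return config


# Serialized body for a hypothetical session-token request.
body = json.dumps(
    {"personaConfig": persona_config("AVATAR_ID", "VOICE_ID", "cara-4-latest")}
)
```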

Lab Changes

Improvements

Cara 4 early access: Enabled organizations can now try Cara 4 from the Build page model selector and use the new early-access setup guide.
Knowledge Library: Redesigned the Knowledge Library, upload, folder, and batch-upload dialogs with cleaner layouts, clearer file states, and a smoother upload flow.
Voice cloning quality: Cartesia voice clones now use Sonic 3.5, improving clone quality and expressiveness for generated voices.
Configurable first messages: The Lab now supports custom persona first messages, making it easier to control how a session opens.

Fixes

Knowledge deletion: Knowledge document and folder deletion now returns immediately instead of blocking while vector and file cleanup finishes in the background.
Knowledge cleanup reliability: Fixed cleanup deadlocks around large legacy documents so deleted Knowledge files are purged more reliably.
Long Knowledge names: Long folder and file names now truncate correctly, keep actions reachable, and validate upload/rename limits before they hit database errors.
Lab Home templates: Fixed Lab Home persona templates that could recite their own system prompts instead of staying in character.
Persona list refresh: Persona lists now refresh correctly after draft autosave and publish changes, reducing stale Build and Personas page states.
Tool interruption setting: Fixed the Tools UI so the interruption-control setting is passed through correctly.

Persona Changes

Improvements

Cara 4 streaming: Tuned Cara 4 frame buffering and bitrate behavior for smoother high-quality playback during early-access sessions.
Audio preprocessing: Updated speech enhancement and VAD handling with newer ai-coustics/Voice Focus support.
Cartesia pronunciation support: Enterprise customers can now request custom Cartesia pronunciation rules for specific words or brand terms.

Fixes

Interrupted greetings: Interrupted first messages are now recorded accurately in conversation history, so personas do not retain text they never actually spoke.
Audio latency: Fixed an audio pipeline issue that could add latency in some sessions.
Turkish turn-taking: Disabled eager end-of-turn behavior for Turkish to reduce premature interruptions.
Audio passthrough avatars: Fixed audio passthrough sessions so the selected avatarModel is passed through correctly.
LLM message tracking: Added safeguards for missing LLM part IDs to reduce message-history edge cases.

SDK/API Changes

Improvements

Cara 4 via API: Enabled organizations can request Cara 4 with avatarModel: "cara-4-latest" when creating session tokens or personas.
OpenAPI accuracy: Fixed OpenAPI/Swagger generation issues, including missing fields and tool-update schema coverage.
Pipecat startup: Updated pipecat-anam alpha releases with a non-blocking startup flow that reduces time to first bot speech, plus improved interrupt handling.

Fixes

Clearer ID errors: Passing an avatar, persona, or voice ID into the wrong field now returns a helpful 400/404 response instead of a generic server error.
Validation status codes: Session-token validation errors now surface as validation failures instead of misleading capacity errors.
Persona API state: Fixed persona API responses that could return draft persona data instead of the latest published persona.
Deleted Knowledge filtering: Added an internal safety check so deleted Knowledge documents are filtered out of RAG results while vector cleanup catches up.

Lab, Persona, API

More predictable session openings, high-quality starts, and cleaner avatar refinement

⚡ More predictable session openings

This release gives builders more control over how sessions begin, especially when a tool-driven turn needs to run cleanly without being interrupted partway through. That makes longer or multi-step tool flows feel more predictable for both builders and end users.

On the media side, you can now pin a session to start in high video quality using sessionOptions.videoQuality, which helps sessions reach their intended bitrate faster. We also tightened one-shot avatar refinement so flat or near-solid backgrounds are preserved more reliably in both the Lab and /v1 avatar creation flow.
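As a small illustration of the videoQuality setting, the helper below validates the value before building the sessionOptions fragment. The accepted values, high and auto, are the ones named in this release; treating auto as the default and the helper name are assumptions.

```python
VIDEO_QUALITIES = ("auto", "high")  # values named in this release


def session_options(video_quality="auto"):
    """Return a sessionOptions fragment with a validated videoQuality."""
    if video_quality not in VIDEO_QUALITIES:
        raise ValueError(f"videoQuality must be one of {VIDEO_QUALITIES}")
    return {"sessionOptions": {"videoQuality": video_quality}}


# Pin the session to start at the maximum video bitrate.
pinned = session_options("high")
```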

Lab Changes

Improvements

Better default model: New personas and built-in agent templates now default to GPT OSS 120B instead of GPT OSS 20B, improving reasoning quality and tool use out of the box.

Fixes

Cleaner avatar refinement: Fixed a Gemini refinement issue that could replace plain or near-solid avatar backgrounds with invented scenery, textures, or objects during one-shot avatar creation.

Persona Changes

Improvements

Protected tool turns: Tool-driven turns can now optionally suppress interruptions while your app is still handling the action, making longer or multi-step tool flows more predictable.

Fixes

Protected-turn cleanup: Interrupt protection is now released cleanly when a greeting or tool turn finishes without spoken output, reducing the chance of sessions getting stuck in a protected state.

SDK/API Changes

Improvements

Initial video quality control: sessionOptions.videoQuality now accepts high or auto, letting you pin a session to start at the maximum video bitrate instead of ramping up from the default profile.

Fixes

Avatar API refinement backgrounds: The same background-preservation fix now applies to the /v1 avatar creation flow, so refined API-created avatars are less likely to pick up hallucinated scenery.

Docs, Lab, API

A major docs overhaul and better tool visibility across session views

📚 The Anam docs have been overhauled

We redesigned the docs to make it much easier to find the right starting point and drill into the part of the platform you care about. Navigation is now organized around Overview, Embed, JavaScript SDK, Python SDK, Integrations, API Reference, and Changelog, with a rewritten overview page and clearer Learn / Embed / Build entry points.

This overhaul also adds dedicated Python SDK and LiveKit documentation, plus more focused guides for avatars, voices, LLMs, tools, session options, and network configuration.

Docs Changes

Improvements

New navigation: The docs now use clearer top-level tabs and reorganized sections so it is faster to jump between concepts, embedding, SDKs, integrations, and API reference.
New SDK and integration guides: Added dedicated Python SDK documentation and a full LiveKit integration section, including overview, quickstart, and configuration guides.
Focused concept pages: Split key setup topics into dedicated pages for available LLMs, creating custom avatars, session controls, voice configuration, and network requirements.

Fixes

Docs redirects: Added redirects for renamed and legacy docs URLs so older links and indexed API-reference pages are less likely to land on 404s.
Navigation polish: Improved overview labeling, changelog labeling, and navbar behavior across the docs experience.

Lab Changes

Improvements

Sessions page: Tool calls now appear across session Analytics, Overview, Transcript, and export views, including status, arguments, results, errors, and execution time.

Persona Changes

Improvements

Client tool round-trips: Personas can now continue once your application returns a client tool result, making client-side actions easier to chain into a conversation.
Webhook tracing: Webhook tool requests now include session and correlation IDs, making it easier to trace tool calls across your own backend systems.

Fixes

Audio preprocessing resilience: Sessions now fail open if speech-enhancement preprocessing is unavailable, instead of ending unexpectedly.
Session startup reliability: Improved startup and media-timeout handling so transient processing issues are less likely to interrupt an active turn.

SDK/API Changes

Improvements

Client tool results: The JavaScript SDK now sends client tool results and errors back to the engine over the data channel, with session-scoped safeguards.
Avatar creation API: POST /v1/avatars now accepts an optional avatarModel field during avatar creation.

Lab, Tools, API

A simpler tool builder and clearer API/runtime error handling

🛠️ Tool setup got much easier in the Lab

We redesigned the tool editor so webhook tools can be configured with form-based builders for headers, query params, and body params instead of raw JSON. That makes it much easier to set up tools correctly, especially for non-technical builders or teams collaborating across product and engineering.

This release also includes a few practical fixes around upload limits, session behavior, and API error handling so the platform behaves more clearly when something goes wrong.

Lab Changes

Improvements

Tool editor: Rebuilt webhook tool configuration with form-based builders for headers, query params, and body params, so you no longer need to edit raw JSON for common setups.

Fixes

Connection errors: Improved LLM URL normalization and connection error messages when custom model endpoints are misconfigured.
Avatar uploads: Reduced the avatar image upload limit to match the real platform file limit and avoid failed uploads.
Session cleanup: Fixed a bug where active sessions could keep running after the player unmounted during tab switches.

SDK/API Changes

Improvements

Capacity signaling: When session capacity is exhausted, the API now returns a clearer 429 response instead of a generic failure.
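Client code can branch on the status code to decide what to do next. The sketch below uses a hypothetical classify_status helper with illustrative action names; only the 429 capacity response is described in this changelog, and the handling of other codes is general HTTP practice rather than documented Anam behavior.

```python
def classify_status(status):
    """Map an HTTP status from a session request to a suggested client action."""
    if 200 <= status < 300:
        return "ok"
    if status == 429:
        return "retry_later"   # capacity exhausted: back off and try again
    if 400 <= status < 500:
        return "fix_request"   # client error: retrying unchanged will not help
    if status >= 500:
        return "retry"         # server error: usually worth a short retry
    return "unexpected"        # 1xx/3xx are not expected here
```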

Fixes

Knowledge auth: Fixed knowledge-upload auth and header handling for API callers.

SDK, Lab, Persona

Context injection, speech detection events & voice cloning for all plans

🎯 Client-side context injection

You can now inject context into a conversation without triggering a persona response. Call addContext() in the JavaScript SDK to silently append information, such as CRM data, page navigation events, or real-time application state, to the conversation history. The persona won’t respond immediately, but will have that context available the next time the user speaks.

This is useful for building context-aware agents that adapt to what the user is doing in your application without interrupting the conversation flow.

🎙️ User speech detection events

The SDK now emits userSpeechStarted and userSpeechEnded events the moment voice activity is detected, before any transcription is available. Use these to build responsive “listening” indicators and other UI feedback that reacts instantly when the user begins or stops speaking.

Lab Changes

Improvements

Voice cloning for all plans: Custom voice cloning is available across plans.
Share and embed redesign: Share links and embed widgets have been consolidated into a simpler 1-to-1 model with a cleaner management interface.
Persona tools via API: The PUT persona endpoint now accepts a tool field, allowing you to attach tools to personas programmatically.

Fixes

Fixed one-shot avatar refinement timing out by making Gemini refinement non-fatal with a 35-second timeout.
Fixed knowledge upload endpoints not accepting Bearer API key authentication.
Fixed end-session race conditions by making the endpoint idempotent and using atomic updates.

Persona Changes

Improvements

Conversation context accuracy: A new message history system tracks which text was actually spoken versus interrupted, and records tool call arguments and results. The persona now maintains accurate context after interruptions, leading to more coherent multi-turn conversations.
Audio passthrough stability: Late-arriving audio in BYO TTS sessions no longer causes unintended interruptions. Audio is buffered and played back in order, improving reliability for Pipecat and other audio passthrough integrations.

Fixes

Fixed stale video frames occasionally appearing after a response completes.

SDK/API Changes

Improvements

Context injection: New addContext() method lets you inject context into the conversation history without triggering a response (JS SDK v4.11.0).
Speech detection events: userSpeechStarted and userSpeechEnded events fire at the VAD level for instant speech detection (JS SDK v4.12.0).

Privacy, Lab, Persona

Adaptive bitrate streaming, zero data retention & system tools

📡 Adaptive bitrate streaming

Anam now dynamically adjusts video quality based on network conditions. When bandwidth drops, the stream adapts in real time to maintain smooth, uninterrupted video rather than freezing or dropping frames. When conditions improve, quality scales back up automatically. This is a significant improvement for users on mobile networks, VPNs, or connections with variable bandwidth.

🔒 Zero Data Retention mode

Enterprise customers can now enable Zero Data Retention on any persona. When enabled, no session data (recordings, transcripts, or conversation logs) is stored after a session ends. This applies across the full pipeline, including voice and LLM data.

Toggle it on from persona settings in the Lab, or set it via the API.

Lab Changes

Improvements

System tools: Personas can now use built-in system tools. change_language switches speech recognition to a different language mid-conversation, and skip_turn pauses the persona from responding when the user needs a moment to think. Enable them from the Tools tab in Build.
Tool validation: Auto-deduplication of tool names with clearer validation error messages.
Share link management: Migrated share links to a 1-to-1 primary model with a simpler toggle interface.

Fixes

Fixed reasoning model responses getting stuck in “thinking…” state.
Fixed soft-deleted knowledge folders not restoring on document upload.
Fixed LiveKit session type classification for snake_case environment payloads.

Persona Changes

Improvements

Agora AV1 support: Agora integration now supports the AV1 video codec for better compression and quality at lower bitrates.
Multi-agent LiveKit: Audio routing now works correctly in multi-agent LiveKit rooms with multiple Anam avatars.

Fixes

Fixed tool enum type validation.

Integrations, Lab, Persona

Four new integrations, Build page redesign & knowledge base overhaul

🔌 New integrations

Four new ways to use Anam avatars in your stack:

Pipecat
The pipecat-anam package brings Anam avatars to Pipecat, the open-source framework for voice and multimodal AI agents. Run pip install pipecat-anam, add AnamVideoService to your pipeline, and you’re streaming. Use audio passthrough for full control over your own orchestration, or let Anam handle the pipeline end-to-end. See the GitHub repo.

ElevenLabs server-side agents
Put a face on any agent you’ve built in ElevenLabs. Pass in your ElevenLabs agent ID and session token when starting a session, and Anam handles the rest, no changes to your existing ElevenLabs setup needed. See the cookbook.

VideoSDK
Anam is now officially supported on VideoSDK, a WebRTC platform similar to LiveKit. Built on top of the Python SDK.

Framer
The Anam Avatar plugin is now on the Framer Marketplace. Drop an avatar into any Framer site without writing code.

📐 Metaxy: sample-level versioning for ML pipelines

We wrote up a deep dive on Metaxy, our open-source metadata versioning framework for multimodal data pipelines. It tracks partial data updates at the field level so teams only reprocess what actually changed. Works with orchestrators like Dagster, agnostic to compute (Ray, DuckDB, etc.). GitHub.

Lab Changes

Improvements

Build page redesign: Everything lives in Build now. Avatars, Voices, LLMs, Tools, and Knowledge are tabs within a single page. Create custom avatars, clone voices, add LLMs, and upload knowledge files without leaving the page. Knowledge is a file drop on the Prompt tab: upload a document and it’s automatically turned into a RAG tool.
Smart voice matching: One-shot avatars now auto-select a voice matching the avatar’s detected gender.
Mobile improvements: Tables replaced with cards and lists. Bottom tab bar instead of hamburger menu. Long-press context menus on persona tiles. Touch-friendly tooltips.
Knowledge base improvements: Non-blocking document deletion with pending state and rollback on error. PDF uploads restored. Stuck documents are auto-detected with retry from the UI.

Fixes

Fixed typo in thinking duration display.
Fixed sticky hover states on touch devices.

Persona Changes

Improvements

Video stability: New TWCC-based frame-drop pacer with GCC congestion control. Smoother video on constrained or variable-bandwidth connections.
Network connectivity: TURN over TLS for ICE, improving session establishment behind corporate firewalls and VPNs.

Fixes

Fixed ElevenLabs pronunciation issues with certain text patterns.
Fixed text sanitization causing incorrect punctuation in TTS output.
Fixed silent responses not being detected correctly.

SDK/API Changes

Improvements

Tool call event handlers: onToolCallStarted, onToolCallCompleted, and onToolCallFailed handlers for tracking tool execution on the client.
Documents accessed: ToolCallCompletedPayload now includes a documentsAccessed field for Knowledge Base tool calls.

Fixes

Fixed duplicate tool call completion events.

Python SDK, Persona, Lab

Anam Python SDK & ICE recovery

🐍 Anam Python SDK

Anam now has a Python SDK. It handles WebRTC streaming, audio/video frame delivery, and session management.

What’s in the box:

Media handling — The SDK manages WebRTC connections and signalling. Connect, and you get synchronized audio and video frames back.
Multiple integration modes — Use the full pipeline (STT, LLM, TTS, Face) or bring your own TTS via audio passthrough.
Live transcriptions — User speech and persona responses stream in as partial transcripts, useful for captions or logging conversations.
Async-first — Built on Python’s async/await. Process media frames with async iterators or hook into events with decorators.

People are already building with it: rendering ASCII avatars in the terminal, processing frames with OpenCV, piping audio to custom pipelines. Check the GitHub repo to get started.

Lab Changes

Improvements

Visual refresh: Updated Lab UI with new brand styling, including new typography (Figtree), refreshed color tokens, and consistent component styles across all pages.

Persona Changes

Improvements

ICE recovery grace period: WebRTC sessions now survive brief network disconnections instead of terminating immediately. The engine detects ICE connection drops and holds the session open, allowing the client to reconnect without losing conversation state.
Language configuration: You can now set a language code on your persona, ensuring the STT pipeline uses the correct language from session start.
Voice generation options: Added configurable voice generation parameters for more control over TTS output.
ElevenLabs streaming: Removed input buffering for ElevenLabs TTS, reducing time-to-first-audio for all sessions using ElevenLabs voices.

Sessions, One-Shot, API

Session recordings & two-pass avatar refinement

🎬 Session recordings

By default, every session is now recorded and saved for 30 days. Watch back any conversation in the Lab (lab.anam.ai/sessions) to see exactly how users interact with your personas, including the full video stream and conversation flow.

Recordings and transcripts are also available via API. Use GET /v1/sessions/{id}/transcript to fetch the full conversation programmatically for analytics, QA, or archival. For privacy-sensitive applications, you can disable recording in your persona config.

One-shot avatar creation now refines images in two passes. Upload an image, and the system generates an initial avatar, then refines it for better likeness and expression. Available to all users.
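The transcript endpoint mentioned above, GET /v1/sessions/{id}/transcript, can be called from any HTTP client. The sketch below only builds the request with Python's standard library; the base URL and the Bearer authorization header are assumptions, so confirm both against the API reference before relying on them.

```python
from urllib.parse import quote
from urllib.request import Request

API_BASE = "https://api.anam.ai"  # assumed base URL


def transcript_request(session_id, api_key):
    """Build (but do not send) a GET request for a session transcript."""
    url = f"{API_BASE}/v1/sessions/{quote(session_id, safe='')}/transcript"
    # Bearer auth is an assumption here; check the API reference.
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body.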

Lab Changes

Improvements

Added speechEnhancementLevel (0-1) to voiceDetectionOptions for control over how aggressively background noise is filtered from user audio
Support for ephemeral tool IDs, so you can configure tools dynamically per session
Added delete account and organization buttons

Fixes

Fixed terminology on tools tab
Fixed RAG default parameters not being passed
Fixed custom LLM default settings

Persona Changes

Improvements

Support for Gemini thinking/reasoning models
The speechEnhancementLevel parameter now passes through via voiceDetectionOptions
Engine optimizations for lower latency under load

Fixes

Fixed GPT-5 tool calls returning errors
Fixed audio frame padding that could cause playback issues
Fixed repeated silence messages
Fixed silence breaker not responding to typed messages

Audio, LLM, SDK

Speech Enhancement & Reasoning Models

🎧 User Speech Enhancement

We’ve integrated ai-coustics as a preprocessing layer in our user audio pipeline. It enhances audio quality before it reaches speech detection, cleaning up background noise and improving signal clarity in real-world conditions. This reduces false transcriptions from ambient sounds and improves endpointing accuracy, especially in noisy environments like cafes, offices, or outdoor settings.

🎛️ Configurable Persona Responsiveness

Control how quickly your persona responds with voiceDetectionOptions in the persona config:

endOfSpeechSensitivity (0-1): How eager the persona is to jump in. 0 waits until it’s confident you’re done talking, 1 responds sooner.
silenceBeforeSkipTurnSeconds: How long before the persona prompts a quiet user.
silenceBeforeSessionEndSeconds: How long a silence lasts before the session ends.
silenceBeforeAutoEndTurnSeconds: How long a mid-sentence pause waits before the persona responds.
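Since the sensitivity setting is bounded to the 0-1 range, it can help to validate values before sending them. The sketch below does that for endOfSpeechSensitivity and for speechEnhancementLevel, which the Session recordings entry describes with the same range; the helper name and its error handling are assumptions.

```python
def voice_detection_options(end_of_speech_sensitivity=None, speech_enhancement_level=None):
    """Validate the 0-1 settings and return a voiceDetectionOptions dict."""
    options = {}
    candidates = (
        ("endOfSpeechSensitivity", end_of_speech_sensitivity),
        ("speechEnhancementLevel", speech_enhancement_level),
    )
    for name, value in candidates:
        if value is None:
            continue  # leave unset so the platform default applies
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
        options[name] = value
    return options


# A persona that waits a little longer before jumping in.
patient = voice_detection_options(end_of_speech_sensitivity=0.2)
```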

🧠 Reasoning Model Support

Added support for OpenAI reasoning models and custom Groq LLMs. Reasoning models can think through complex scenarios before responding, while Groq’s high-throughput infrastructure lets these typically slower models respond with latencies suitable for real-time conversation. Add your reasoning model in the Lab: https://lab.anam.ai/llms.

Persona Changes

Fixes

Fixed Knowledge Base (RAG) tool calling with proper default query parameters
Fixed panic crashes when sessions error during startup

Lab Changes

Fixes

Fixed Powered by Anam text visibility when watermark removal is enabled
Updated API responses for GET/UPDATE persona endpoints

SDK/API Changes

Improvements

Introduced agent audio input streaming for BYO audio workflows, allowing you to integrate with arbitrary voice agents, e.g. ElevenLabs agents (see the ElevenLabs server-side agents recipe for integration steps).
Added WebRTC reasoning event handlers for reasoning model support

Avatar Model, Compliance, Integrations

Cara 3 Avatar Model & SOC-2 Compliance

🎭 Introducing Cara 3: our most expressive model yet

The culmination of over 6 months of research, Cara 3 is now available. This new model delivers significantly more expressive avatars featuring realistic eye movement, more dynamic head motion, smoother transitions in and out of idling, and improved lip sync.

You can opt in to the new model in your persona config using avatarModel: 'cara-3' or by selecting it in the Lab UI. Note that all new custom avatars will use Cara 3 exclusively, while existing personas will continue to use the Cara 2 model by default unless explicitly updated.

🛡️ SOC-2 Type II compliance

Anam has achieved SOC-2 Type II compliance. This milestone validates that our security, availability, and data protection controls have been independently audited and proven over time.

For customers building across learning, enablement, or live production use cases, this provides formal assurance regarding how we handle security, access, and reliability.
Visit the Trust Center

🔌 Integrations

Model Context Protocol (MCP) server
Manage your personas and avatars directly within Claude Desktop, Cursor, and other MCP-compatible clients. Use your favorite LLM-assisted tools to interact with the Anam API.

Anam x ElevenLabs agents
Turn any ElevenLabs conversational AI agent into a visual avatar using Anam’s audio passthrough.
Watch the demo

Lab Changes

Improvements

UI overhaul: A redesigned Homepage and Build page make persona creation more intuitive. You can now preview voices/avatars without starting a chat and create custom assets directly within the Build flow. Sidebar and Pricing pages have also been refreshed.
Performance: Implemented TanStack caching to significantly improve Lab responsiveness.

Fixes

Fixed a bug where client tool events were not appearing in the Build page chat.
Resolved an issue where tool calls and RAG were not passing parameters correctly.

Persona Changes

Improvements

More voices: Added ~100 new Cartesia voices (Sonic-3) and ~180 new ElevenLabs voices (Flash v2.5), covering languages and accents from all over the world.
New default LLM: kimi-k2-instruct-0905 is now available. This state-of-the-art open-source model offers high intelligence and excellent conversational abilities. (Note: standard kimi-k2 remains recommended for heavy tool-use scenarios.)
Configurable greetings: Added skip_greeting parameter, allowing you to configure whether the persona initiates the conversation or waits for the user.
Latency reductions:
- STT optimization: We are now self-hosting Deepgram for Speech-to-Text, resulting in a ~30ms (p50) and ~170ms (p90) latency improvement.
- Frame buffering: Optimized output frame buffer, shaving off an additional ~40ms of latency per response.

Fixes

Corrected header handling to ensure reliable data center failover.
Fixed a visual artifact where Cara 3 video frames occasionally displayed random noise.
Resolved a freeze-frame issue affecting ~1% of sessions (Incident Report).

SDK/API Changes

Improvements

API gateway guide: Added documentation and an example repository for routing Anam SDK traffic through your own API Gateway server. View on GitHub.

LiveKit, Latency

LiveKit out of Beta and new latency record

🎥 LiveKit out of Beta and new latency record

LiveKit integration is now generally available: drop Anam’s expressive real-time avatars into any LiveKit Agents app so your AI can join LiveKit rooms as synchronised voice + video participants.
It turns voice-only agents into face-and-voice experiences for calls, livestreams, and collaborative WebRTC spaces, with LiveKit handling infra and Anam handling the human layer. Docs

⚡ Record-breaking latency: 330 ms decrease in latency for all customers

Server-side optimisations cut average end-to-end latency by 330 ms for all customers, thanks to cumulative engine optimisations across transcription, frame generation, and frame writing, plus upgraded Deepgram Flux endpointing for faster, best-in-class turn-taking without regressions in voice quality or TTS.

Lab Changes

Improvements

Overhaul to avatar video upload and management system
Upgraded default Cartesia voices to Sonic 3
Standardised voice model selection across the platform

Fixes

Enhanced share-link management capabilities
Corrected LiveKit persona type identification logic

Persona Changes

Improvements

Server-side optimisations to our frame buffering reduce response latency by ~250ms for all personas.

Fixes

Changed timeout behavior to never time out based on heartbeats; sessions only time out when the websocket has been disconnected for 10 seconds or more.
Fixed an intermittent issue where the persona stopped responding.
Set pix_fmt for video output, moving from yuvj420p (JPEG) to yuv420p to avoid incorrect encoding/output.
Added a timeout in our silence-breaking logic to prevent hangs.
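The timeout change above can be pictured with a small state tracker. This is only a sketch of the described behavior, not Anam's server code: the class name DisconnectWatchdog and its methods are hypothetical, and timestamps are passed in explicitly so the logic is easy to test.

```python
from typing import Optional

DISCONNECT_TIMEOUT_SECONDS = 10.0  # threshold stated in the changelog entry


class DisconnectWatchdog:
    """Tracks how long a websocket has been disconnected (illustrative only)."""

    def __init__(self, timeout: float = DISCONNECT_TIMEOUT_SECONDS) -> None:
        self.timeout = timeout
        self.disconnected_at: Optional[float] = None

    def on_disconnect(self, now: float) -> None:
        # Record only the first moment of the current outage.
        if self.disconnected_at is None:
            self.disconnected_at = now

    def on_reconnect(self) -> None:
        # A reconnect clears the outage and resets the clock.
        self.disconnected_at = None

    def on_missed_heartbeat(self) -> None:
        # Deliberately a no-op: heartbeats never trigger a timeout.
        pass

    def should_time_out(self, now: float) -> bool:
        if self.disconnected_at is None:
            return False
        return now - self.disconnected_at >= self.timeout
```

Keying the timeout on the connection state rather than on heartbeats means a briefly stalled client is not dropped as long as its websocket stays open.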

Agents

Introducing Anam Agents

🚀 Introducing Anam Agents

Build and deploy AI agents in Anam that can engage alongside you.

With Anam Agents, your Personas can now interact with your applications, access your knowledge, and trigger workflows directly through natural conversation. This marks Anam’s evolution from conversational Personas to agentic Personas that think, decide, and execute.

Knowledge Tools

Give your Personas access to your company’s knowledge. Upload docs to the Lab, and they’ll use semantic retrieval to integrate the right info.
Docs for Knowledge Base

Client Tools

Personas can control your interface in real time: open checkout, display modals, navigate the UI, and update state by voice.
Docs for Client Tools

Webhook Tools

Connect your Personas to external APIs and services. Create tickets, fetch status, update records, or fetch live data.
Docs for Webhook Tools

Intelligent Tool Selection

Each Persona’s LLM chooses tools based on intent, not scripts.

You can create and manage tools on the Tools page in the Lab and attach them to any Persona from Build.

Anam Agents are available in beta for all users: https://lab.anam.ai/login

Lab Changes

Improvements

Cartesia Sonic-3 voices: the most expressive TTS model.
Voice modal expanded: 70+ languages, voice samples, Cartesia TTS now default.
Session reports work for custom LLMs.

Fixes

Prevented auto-logout when switching contexts.
Fixed race conditions in cookie handling.
Resolved legacy session token issues.
Removed problematic voices.
Corrected player/stream aspect ratios on mobile.

Persona Changes

Improvements

Deepgram Flux support for turn-taking (Deepgram Flux Details)
Server-side optimization: reduced GIL contention and latency, faster connections.

Fixes

Bug-fix for dangling LiveKit connections.

Research

Improvements

Our first open-source library!
Metaxy, a metadata layer for ML/data pipelines:
Read more | GitHub

Trust Centre

Anam is now HIPAA compliant

🛡️ Anam is now HIPAA compliant

A big milestone for our customers and partners. Anam now meets HIPAA requirements for handling protected health information.

Learn more at the Anam Trust Center

Lab Changes

Improvements

Enhanced voice selection: search by use case/conversational style, 70+ languages.
Product tour update.
Streamlined One-Shot avatar creation.
Auto-generated Persona names based on selected avatar.
Session start now 1.1s faster.

Fixes

Share links: fixed extra concurrency slot usage.

Persona Changes

Improvements

Improved TTS pronunciation via smarter text chunking.
Traceability and monitoring for session IDs.
Increased internal audio sampling rate to 24kHz.
Increased max websocket size to 16Mb.

Fixes

Concurrency calculation now only considers sessions from last 2 hours.
Less freezing for slower LLMs.
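To make the concurrency fix above concrete, here is a minimal sketch of a windowed count. It assumes sessions are represented by their start times and that "from last 2 hours" means sessions that started within that window; both are assumptions, as is the function name count_concurrent.

```python
from datetime import datetime, timedelta
from typing import Iterable

CONCURRENCY_WINDOW = timedelta(hours=2)  # window stated in the changelog


def count_concurrent(active_session_starts: Iterable[datetime], now: datetime) -> int:
    """Count sessions still marked active that started within the window."""
    cutoff = now - CONCURRENCY_WINDOW
    # Sessions older than the cutoff are ignored, so a stale record that
    # never closed cannot hold a concurrency slot indefinitely.
    return sum(1 for started in active_session_starts if started >= cutoff)
```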

Analytics, Lab

Session Analytics

📊 Session Analytics

Once a conversation ends, how do you review what happened? To help you understand and improve your Persona’s performance, we’re launching Session Analytics in the Lab. Now you can access a detailed report for every conversation, complete with a full transcript, performance metrics, and AI-powered analysis.

Full Conversation Transcripts. Review every turn of a conversation with a complete, time-stamped transcript. See what the user said and how your Persona responded, making it easy to diagnose issues and identify successful interaction patterns.
Detailed Analytics & Timeline. Alongside the transcript, a new Analytics tab provides key metrics grouped into “Transcript Metrics” (word count, turns) and “Processing Metrics” (e.g., LLM latency). A visual timeline charts the entire conversation, showing who spoke when and highlighting any technical warnings.
AI-Powered Insights. For a deeper analysis, you can generate an AI-powered summary and review key insights. This feature, currently powered by gpt-5-mini, evaluates the conversation for highlights, adherence to the system prompt, and user interruption rates.

You can find your session history on the Sessions page in the Lab. Click on any past session to explore the new analytics report. This is available today for all session types, except for LiveKit sessions. For privacy-sensitive applications, session logging can be disabled via the SDK.

Lab Changes

Improvements

Improved Voice Discovery: The Voices page has been updated to be more searchable, allowing you to preview voices with a single click, and view new details like gender, TTS-model and language.

Fixes

Fixed share-link session bug: Share-link sessions no longer take an extra concurrency slot.

Persona Changes

Improvements

Small improvement to connection time: Tweaks to how we perform WebRTC signalling allow for slightly faster connection times (~900ms faster for p95 connection time).
Improvement to output audio quality for poor connections: Enabled Opus in-band FEC to improve audio quality under packet loss.
Small reduction in network latency: Optimisations have been made to our outbound media streams to reduce A/V jitter (and hence jitter buffer delay). Expected latency improvement is modest (<50ms).
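For context on the Opus item above: in WebRTC, in-band FEC is negotiated through the useinbandfec=1 format parameter on the Opus fmtp line of the SDP (RFC 7587). The sketch below shows one way to set that parameter on an SDP string. It illustrates the mechanism only; how Anam enables FEC server-side is not described here, and the function name is hypothetical.

```python
import re


def enable_opus_inband_fec(sdp: str) -> str:
    """Add useinbandfec=1 to the fmtp line of every Opus payload type."""
    lines = sdp.split("\r\n")
    # Find payload types mapped to Opus, e.g. "a=rtpmap:111 opus/48000/2".
    opus_pts = {
        m.group(1)
        for line in lines
        if (m := re.match(r"a=rtpmap:(\d+) opus/", line, re.IGNORECASE))
    }
    out = []
    for line in lines:
        m = re.match(r"a=fmtp:(\d+) (.*)", line)
        # Leave the line alone if FEC is already configured either way.
        if m and m.group(1) in opus_pts and "useinbandfec=" not in m.group(2):
            line = f"{line};useinbandfec=1"
        out.append(line)
    return "\r\n".join(out)
```

This sketch only edits existing fmtp lines; an offer with no fmtp line for Opus is returned unchanged.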

Fixes

Fix for LiveKit sessions with slow TTS audio: Stabilizes LiveKit streaming by pacing output and duplicating frames during slowdowns to prevent underflow.
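The frame-duplication idea in the fix above can be sketched as follows. This models output as a series of fixed ticks rather than using a real clock, and the function name pace_frames and the blank fallback frame are assumptions for illustration, not the actual streaming code.

```python
from typing import Iterable, Iterator, Optional


def pace_frames(slots: Iterable[Optional[bytes]], blank: bytes = b"") -> Iterator[bytes]:
    """Yield exactly one frame per output tick.

    Each item in slots is the frame that arrived in time for that tick, or
    None if the upstream was late. Late ticks repeat the previous frame, so
    the consumer never underflows.
    """
    last = blank
    for frame in slots:
        if frame is not None:
            last = frame
        yield last
```

For example, list(pace_frames([b"a", None, None, b"b"])) repeats the first frame twice while the upstream is stalled.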

Performance, LLM

Intelligent LLM Routing for Faster Responses

⚡ Intelligent LLM Routing for Faster Responses

The performance of LLM endpoints can be highly variable, with time-to-first-token latencies sometimes fluctuating by as much as 500ms from one day to the next depending on regional load. To solve this and ensure your personas respond as quickly and reliably as possible, we’ve rolled out a new intelligent routing system for LLM requests. This is active for both our turnkey customers and for customers using their own server-side Custom LLMs if they deploy multiple endpoints.

This new system constantly monitors the health and performance of all configured LLM endpoints by sending lightweight probes at regular intervals. Using a time-aware moving average, it builds a real-time picture of network latency and processing speed for each endpoint. When a request is made, the system uses this data to calculate the optimal route, automatically shedding load from any overloaded or slow endpoints within a region.
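One way to picture the routing described above is an exponentially decayed average of probe latencies, where older measurements lose weight as time passes. The sketch below is an assumption about how such a scheme could work, not Anam's implementation: the class names, the 30-second half-life, and the "pick the lowest average" rule are all illustrative.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EndpointStats:
    avg_latency_ms: Optional[float] = None
    last_probe_at: float = 0.0


class LatencyRouter:
    """Picks the endpoint with the lowest time-decayed average probe latency."""

    def __init__(self, half_life_seconds: float = 30.0) -> None:
        # Convert the half-life into an exponential time constant.
        self.tau = half_life_seconds / math.log(2)
        self.stats: Dict[str, EndpointStats] = {}

    def record_probe(self, endpoint: str, latency_ms: float, now: float) -> None:
        s = self.stats.setdefault(endpoint, EndpointStats())
        if s.avg_latency_ms is None:
            s.avg_latency_ms = latency_ms
        else:
            # The old average counts for less the longer ago it was updated.
            weight_old = math.exp(-(now - s.last_probe_at) / self.tau)
            s.avg_latency_ms = weight_old * s.avg_latency_ms + (1 - weight_old) * latency_ms
        s.last_probe_at = now

    def choose(self) -> str:
        candidates = {
            e: s.avg_latency_ms
            for e, s in self.stats.items()
            if s.avg_latency_ms is not None
        }
        if not candidates:
            raise RuntimeError("no probed endpoints")
        return min(candidates, key=candidates.get)
```

Weighting by elapsed time rather than by probe count means an endpoint that has not been probed recently is re-evaluated almost from scratch on its next probe, which suits latencies that drift with regional load.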

Lab Changes

Improvements

Generate one-shot avatars from text prompts: You can now generate one-shot avatars from text prompts within the Lab, powered by Gemini’s new Nano Banana model. The one-shot creation flow has been redesigned for speed and ease of use. Text-to-Avatar and Image Upload Avatars are available across plans, subject to each plan’s custom avatar allowance.
Improved management of published embed widgets: Published embed widgets can now be configured and monitored from the Lab at https://lab.anam.ai/personas/published.

Persona Changes

Improvements

Automatic failover to backup data centres: To ensure maximum uptime and reliability for our personas, we’ve implemented automatic failover to backup data centres.

Fixes

Prevent session crash on long user speech: Previously, unbroken user speech exceeding 30 seconds would trigger a transcription error and crash the session. We now automatically truncate continuous speech to 30 seconds, preventing sessions from failing in these rare cases.
Allow configurable session lengths for Growth and above: We had a bug where longer sessions could hit an outdated timeout. This has now been fixed.
Resolved slow connection times caused by incorrect database region selection: An undocumented issue with our database provider led to incorrect region selection for our databases. Simply refreshing our credentials resolved the problem, resulting in a ~1s improvement in median connection times and ~3s faster p95 times. While our provider works on a permanent fix, we’re actively monitoring for any recurrence.
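The speech-length safeguard above amounts to capping an audio buffer at 30 seconds. A minimal sketch, assuming 16-bit mono PCM at 16 kHz and that the first 30 seconds are the part kept; the audio format, the choice of which end to keep, and the function name are all assumptions, since the entry only states the 30-second limit.

```python
MAX_UTTERANCE_SECONDS = 30  # limit stated in the changelog


def truncate_utterance(pcm: bytes, sample_rate: int = 16000, bytes_per_sample: int = 2) -> bytes:
    """Cap a mono PCM buffer at MAX_UTTERANCE_SECONDS of audio."""
    max_bytes = MAX_UTTERANCE_SECONDS * sample_rate * bytes_per_sample
    # Slicing leaves shorter buffers untouched.
    return pcm[:max_bytes]
```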

Embed, Lab

Embed Widget

Embed personas directly into your website with our new widget. On the Lab’s Build page, click Publish, then generate your unique HTML snippet. This snippet works in most common website builders, e.g. WordPress.org or Squarespace.

For added security, we recommend adding a whitelist with your domain URL. This locks the persona down so it only works on your website. You can also cap the number of sessions or give the widget an expiration period.
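The domain whitelist can be thought of as an exact-match check on the host of the page embedding the widget. The sketch below illustrates that idea only; the function name origin_allowed and the exact-match rule are assumptions, and the real check may differ (for example, in how subdomains are treated).

```python
from typing import Iterable
from urllib.parse import urlsplit


def origin_allowed(origin: str, allowed_domains: Iterable[str]) -> bool:
    """Return True if the origin's host exactly matches an allowed domain."""
    host = (urlsplit(origin).hostname or "").lower()
    # Exact matching avoids lookalike hosts such as example.com.evil.net.
    return host in {d.lower() for d in allowed_domains}
```

With allowed_domains set to ["www.example.com"], a request from https://www.example.com passes, while one from https://www.example.com.evil.net is rejected.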

Lab Changes

Improvements

One-shot avatars available via API: Accounts can now create one-shot avatars via API, subject to each plan’s custom avatar allowance. Docs here.
Spend caps: It’s now possible to set a spend cap on your account. Available in profile settings.

Persona Changes

Fixes

Prevent Cartesia from timing out when using slow custom LLMs: We’ve added a safeguard to prevent Cartesia contexts from unexpectedly closing during pauses in text streaming. With slower LLMs, or if there’s a break or slow-down in the text being sent, your connection will now stay alive, ensuring smoother, uninterrupted interactions.

