Behind the Scenes: How Synthesia’s AI Avatars Are Made



Ever wondered how Synthesia’s AI avatars come to life?
In a Synthesia Behind the Scenes session, we sat down with Tosin Oshinyemi (Lead Avatar Producer) and Josh Baker-Mendoza (Technical Supervisor) to uncover the creative and technical magic behind avatar production.
From actor selection and performance coaching to AI rendering and motion tracking, we dove deep into the process that powers Synthesia’s cutting-edge avatars.
The art and science behind AI avatars
AI avatars may be driven by technology, but as Tosin and Josh emphasized, they remain deeply human at their core. Every detail—expressions, gestures, and nuances—comes from real performances, captured through meticulous production techniques.
“We’re making tools that help people connect, teach, and inspire in ways they never could before.” — Tosin Oshinyemi
Casting the right talent: ensuring diversity and versatility
Creating an avatar library that serves a broad range of use cases requires careful actor selection. Tosin explained how Synthesia ensures diversity and usability by considering:
- Distinct yet versatile appearances – Avatars must reflect a variety of backgrounds, ages, and styles to resonate with different audiences.
- Performance range – Actors are chosen based on their ability to express natural emotions and gestures that can suit multiple content types.
- Industry relevance – Some avatars are designed specifically for corporate training, while others are tailored for marketing, education, or healthcare content.
- User feedback – The team regularly assesses requests from customers and adjusts casting decisions accordingly.
Synthesia’s approach involves a mix of agency partnerships, open casting calls, and street casting to ensure a well-rounded library of avatars. Sometimes, actors are scouted based on their unique ability to bring personality and warmth to an AI-driven experience.
Filming and studio setup: the technology behind the avatars
Once actors are selected, they enter a carefully controlled filming environment to capture the footage that becomes their AI avatar. Josh, broadcasting live from Synthesia’s London studio, walked us through the technical setup that makes these avatars possible:
- Lighting: 3-point lighting with soft diffusion creates balanced visuals and reduces harsh shadows.
- Camera setup: high-resolution 4K RAW capture preserves fine detail, ensuring that facial expressions translate accurately.
- Backgrounds: For Express-1, footage must be recorded against a green or blue screen (green is preferred). For Personal Avatars, a minimalist background maximizes an avatar's range of use cases, though the best choice depends on the avatar's main intended purpose; highly detailed, dense backgrounds don't play well with the technology.
Josh emphasized that the studio setup isn’t just about aesthetics—it directly affects the realism and adaptability of the final avatars. The goal is to produce avatars that fit seamlessly into any digital environment, whether it’s a corporate training video, a marketing campaign, or an educational module.
Performance best practices for AI avatar creation
Tosin shared some key insights into how to achieve the best performance when filming footage for AI avatars:
- Speak naturally, not mechanically – The most compelling avatars feel like real people, not robotic readers.
- Engage with an imagined audience – Instead of just reading from a script, actors should picture speaking to a real person.
- Use microexpressions and body language – Even subtle nods, head tilts, and natural facial movements enhance realism.
- Avoid exaggerated movements – While expression is essential, overly large gestures can appear unnatural in AI avatars.
- Do multiple takes – The best performances often emerge after a few practice rounds.
“Your avatar should feel like you, not a stiff, robotic version of you. Be expressive but stay natural.” — Tosin Oshinyemi
Bringing avatars to life with AI
Once the footage is captured, Synthesia’s AI technology analyzes facial expressions, movements, and speech patterns, mapping them onto digital avatars. Josh highlighted the critical role of optical flow algorithms and speech-to-expression mapping, which allow avatars to maintain fluid, lifelike animations.
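Synthesia's actual optical-flow pipeline isn't public, but the core idea — estimating how pixels move between consecutive frames — can be illustrated with a toy block-matching motion estimator. Everything below (the frame size, search radius, and sum-of-squared-differences cost) is an illustrative assumption, not Synthesia's implementation:

```python
def block_match(frame_a, frame_b, max_shift=3):
    """Toy optical-flow-style motion estimate: find the (dy, dx) shift
    that best aligns frame_b with frame_a, minimizing the mean squared
    difference over the overlapping region."""
    h, w = len(frame_a), len(frame_a[0])
    best, best_shift = float("inf"), (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err, count = 0.0, 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        err += (frame_a[y][x] - frame_b[yy][xx]) ** 2
                        count += 1
            if count and err / count < best:
                best, best_shift = err / count, (dy, dx)
    return best_shift

# Two 10x10 "frames": a single bright pixel that moves 2 pixels right.
frame_a = [[1 if (y, x) == (4, 3) else 0 for x in range(10)] for y in range(10)]
frame_b = [[1 if (y, x) == (4, 5) else 0 for x in range(10)] for y in range(10)]
motion = block_match(frame_a, frame_b)  # recovers the shift (0, 2)
```

Real systems use dense per-pixel flow rather than a single global shift, but the principle — matching image content across frames to recover motion — is the same.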
Josh also detailed the technical infrastructure behind avatar creation:
- 2D-Based Capture: While volumetric (4D) capture exists, Synthesia’s avatars rely on high-resolution single-camera, 2D-based capture, making production scalable, efficient, and accessible.
- Speech-to-Expression Mapping: AI interprets speech input and generates subtle microexpressions to enhance realism.
- Intentional Lighting: The studio setup includes soft light diffusion, minimizing harsh shadows and ensuring avatars integrate seamlessly into various backgrounds.
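Speech-to-expression mapping is proprietary, but the general shape of the idea — deriving smoothly animated facial parameters from audio features — can be sketched in toy form. The frame rate, the energy-to-"jaw open" mapping, and the smoothing constant below are all illustrative assumptions, not Synthesia's method:

```python
import math

def speech_to_mouth_curve(audio, sr=16000, fps=25, smoothing=0.6):
    """Toy speech-to-expression mapping: per-video-frame RMS energy of
    the audio, normalized to [0, 1] and smoothed with an exponential
    moving average so the 'jaw open' value animates fluidly."""
    spf = sr // fps                       # audio samples per video frame
    n_frames = len(audio) // spf
    rms = []
    for i in range(n_frames):
        frame = audio[i * spf:(i + 1) * spf]
        rms.append(math.sqrt(sum(s * s for s in frame) / spf))
    peak = max(rms) or 1.0                # avoid divide-by-zero on silence
    curve, prev = [], 0.0
    for value in (r / peak for r in rms):
        prev = smoothing * prev + (1 - smoothing) * value
        curve.append(prev)
    return curve

# One second of fake 16 kHz audio: silence, then a 200 Hz "voiced" tone.
audio = [math.sin(2 * math.pi * 200 * (i / 16000)) if i >= 8000 else 0.0
         for i in range(16000)]
mouth = speech_to_mouth_curve(audio)  # stays at 0 during silence, rises with speech
```

Production systems map audio to many facial parameters at once, typically with learned models rather than hand-written rules; the smoothing step mirrors the article's point about keeping animation fluid rather than jittery.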
The future of AI avatars is also evolving rapidly. Josh teased upcoming advancements, including full-body motion tracking and adaptive AI-driven gestures, making avatars even more dynamic and responsive.
“It’s easy to get caught up in the tech, but at the end of the day, what we do is about people. AI is just a tool that lets us communicate better.” — Josh Baker-Mendoza
A passionate community at the heart of innovation
Feedback from Synthesia creators plays a critical role in shaping Synthesia’s future development. Every new feature and improvement is informed by real needs, ensuring that AI avatars continue to feel natural, engaging, and truly human.
During the live interview, members of Synthesia’s AI Video Creator Community were eager to share which avatars they use most frequently, highlighting how they match different avatars to specific content types—whether for well-being topics, leadership training, or technical instruction.
Others shared their enthusiasm for creating personal avatars, emphasizing how having a digital representation of themselves enhances engagement and personalization in workplace training and communication.
Key takeaways
- AI avatars start with real human performances: the tech only enhances what’s already there.
- Lighting and camera quality are crucial: a high-quality recording results in a more realistic avatar.
- Performance direction matters: the most engaging avatars feel natural, not robotic.
- Exciting updates are on the way: Josh hinted at Express-2 avatars, which will feature even more natural movement and speech synchronization.
About the author
Kevin Alster
Strategic Advisor
Kevin Alster heads up the learning team at Synthesia. He is focused on building Synthesia Academy and helping people figure out how to use generative AI videos in enterprise.