What Should an Avatar Look Like in Healthcare Training?
People keep asking a reasonable question. Should healthcare training avatars look more realistic?
The research answer is messier than the sales answer.
More realism can help. It can also backfire. The better question isn’t “how close can we get to a real person?” It’s “what level of realism helps the learner take the encounter seriously, speak naturally, practice safely, and improve?”
That answer depends on the task.
The “almost real, but off” problem
A 2025 network meta-analysis in Frontiers in Psychology compared low, medium, and high-realism avatars on attractiveness, trust, and eeriness.
High-realism avatars won on attractiveness and trust. But medium-realism avatars were rated the eeriest, which is the zone people mean when they say an avatar looks “almost real, but off” [1].
That zone has a name. In 1970, Masahiro Mori called it the uncanny valley [2]. When an avatar looks nearly human, people expect human-level gaze, timing, voice, and emotion. If any of those are wrong, the user notices.
The problem isn’t realism itself. It’s unfinished realism. A photorealistic face that doesn’t move or react quite right promises something the system can’t deliver, and the learner feels it.
A 2025 systematic review of conversational avatars (the ones closer to virtual patients) screened nearly 22,000 papers and ended with 29 included studies. The authors proposed a design checklist covering appearance, verbal behavior, nonverbal behavior, and social and cultural norms [3]. The point: avatar design isn’t one decision. It’s a promise about what the avatar will do once the conversation starts.
A cartoon avatar promises, “I’m artificial.” A photoreal avatar promises, “I’m close to human.”
A healthcare simulation platform has to pick a promise it can keep.
Realism is only one part of fidelity
The field has been moving away from the idea that more realistic equals better learning.
Hamstra and colleagues argued in 2014 that fidelity isn’t a single high-versus-low scale. They proposed dropping the term in favor of two underlying ideas, physical resemblance and functional task alignment, and pointed to transfer of learning, engagement, and suspension of disbelief as the educational concepts that really drive outcomes [4]. A 2026 JMIR Medical Education viewpoint goes further, treating fidelity as physical, emotional, and contextual dimensions, plus qualitative and quantitative considerations [5].
That’s the right lens for avatars.
A more realistic face may raise the seriousness of the encounter. It also raises expectations. If the avatar doesn’t move, listen, react, and emote at the level its appearance implies, visual fidelity becomes a liability.
The real question is whether the avatar has the right kind of fidelity for the clinical task.
A dermatology counseling case may need visible skin findings. A psychiatric risk assessment may need silence, shame, and guarded disclosure. An opioid use disorder conversation may need a patient who can get defensive, mistrustful, or relieved depending on how the clinician speaks. A procedural orientation may not need much realism at all.
Match the realism to the behavior you’re training.
That table is shorthand. The reasoning behind each row is below.
Cartoon avatars
Cartoons sidestep the uncanny valley because they don’t ask to be judged as nearly human. Users forgive imperfect mouth movement, simplified emotion, and limited facial range.
That’s useful in early learning, wellness, pediatrics, low-stakes orientation, and anywhere psychological safety matters more than clinical gravity.
The tradeoff is obvious. A cartoon patient is harder to take seriously in opioid use disorder screening, end-of-life conversations, breaking bad news, or trauma-informed care. The learner may engage. They may not engage with the same emotional weight.
Cartoons aren’t wrong. They’re task-limited.
Semi-realistic and game-quality avatars
For most clinical training cases, the most defensible default is probably a polished semi-realistic or game-quality avatar.
It looks like a credible person without pretending to be one. That gives learners enough presence to speak naturally, ask sensitive questions, and respond to emotion. It also keeps enough distance for learners to make mistakes without feeling watched.
That distance may be part of the learning value. Older work from USC’s Institute for Creative Technologies found people disclosed more to a virtual human when they thought it was computer-controlled rather than human-operated. The likely reason: less fear of being judged [6].
Learners need that room too. They need to try, stumble, revise, and try again.
The danger isn’t semi-realism itself. It’s low-quality semi-realism. A polished semi-realistic avatar can work well. A stiff, plastic, almost-human avatar slides into the uncanny zone.
A small BMC Medical Education pilot in 2025 with 11 students using VR virtual patients found comfort, communication confidence, and decision-making confidence all rose after two sessions [7]. The study is small and self-reported. It doesn’t compare avatar styles or prove transfer to real practice. But it shows something useful: learners’ first impressions of avatars aren’t stable. They change once the avatar starts being useful.
That’s why demo reactions are a weak evaluation method. A demo tests visual preference. It doesn’t test learning value.
Photorealistic avatars
Realism can build credibility. A 2025 study by Baake and colleagues found that higher-realism avatars were rated as more trustworthy than cartoon ones, with no clear uncanny effect in their setting [8].
That fits the broader pattern. When the interaction is short and informational, high realism helps if the rendering is strong.
But clinical training isn’t information delivery. It’s an interaction. Once the learner starts speaking, appearance is one part of the experience. Turn-taking, voice quality, emotional response, and latency start to matter more.
A 2025 mixed-reality study of 14 coworkers using HoloLens 2 over two weeks compared realistic and cartoon faces. Realistic faces created higher expectations and more mood-perception errors. Participants said words, tone of voice, and movement were the most useful cues for reading mood, regardless of avatar style [9].
That should give educators pause. If a realistic face can’t reliably carry emotion, it may be worse than a less realistic face with clearer behavioral cues.
Hyperrealistic and digital twin avatars
These can work, but the use case has to be right.
The strongest recent example is a 2025 Journal of Clinical Medicine pilot from Mayo Clinic. Thirty plastic surgery patients used a hyperrealistic AI avatar of their own surgeon for postoperative education. The avatar answered 297 of 300 queries correctly (99%). Usability scored 87.7. All participants found it trustworthy. Eeriness was low at 1.57 out of 5 [10].
The most interesting finding wasn’t the score. It was the disclosure result. Being upfront that the avatar was AI made patients trust it more, not less [10].
That’s important, and it shouldn’t be overgeneralized. This was a digital version of a known surgeon, backed by Mayo, used for controlled education, with no open-ended clinical improvisation. Patients were told it was AI. Content scope was narrow.
The lesson isn’t “hyperrealistic is best.” It’s that hyperrealistic physician avatars can be accepted when trust already exists, the institution is credible, the content is controlled, and the AI is disclosed.
That’s different from learner-facing virtual patients, where the avatar is a practice partner.
What about AI virtual patients?
Virtual patients aren’t new. Cook and colleagues published a systematic review and meta-analysis in 2010 [11], and Kononowicz and colleagues published a larger one in 2019 [12]. What’s new is the ability to make virtual patients conversational, adaptive, and feedback-driven using large language models.
The literature is moving fast, but the field is still early. A 2025 JMIR scoping review of LLM-based virtual patients found rapid growth but wide variation in design, evaluation, and outcomes [13].
A 2025 multicenter randomized crossover study compared an AI-driven virtual patient simulator with actor-based training across two UK medical schools. Both improved self-rated communication skills. The AI version produced a smaller improvement and lower satisfaction. It also cost less per student. The authors framed it as a cost-effective complement, not a replacement [14].
A 2025 study by Cook and colleagues looked specifically at LLM-powered virtual patients. They could simulate dialogues, represent patient preferences, and give personalized feedback at low cost [15]. The same study found real limitations: verbosity, unusual vocabulary, excessive agreeableness, weak pushback on poor clinician performance, and feedback that is sometimes too positive or simply inaccurate [15].
The failure mode isn’t only how the patient looks. It’s whether the patient behaves like a patient.
The goal isn’t imitation. It’s fit. Use AI virtual patients where they’re strongest: repeated practice, safe failure, immediate feedback, transcript review, and scalable assessment. Use human simulation where live emotional nuance matters most.
What appearance can’t carry on its own
A static portrait tells you almost nothing about whether an avatar will work. Demo screenshots are the wrong unit of analysis. What matters is what happens once the avatar moves, speaks, listens, and reacts.
Three behavioral layers do the real work.
Body language that matches the words. A patient describing chest pain who sits perfectly still reads as artificial. The same patient who leans forward, guards their torso, and shifts weight reads as someone the learner has to attend to.
Facial emotion that’s reliable. Patients minimize pain. They smile when afraid. They go flat when depressed. They get defensive when they feel judged. A learner who can’t read those signals is rehearsing the wrong skill. The 2025 mixed-reality study found that realistic faces, when they failed to carry emotion reliably, produced more mood-perception errors than cartoon faces, because users kept trying to read information that wasn’t there [9].
Face, voice, and body that belong to the same person. When a patient says “I’ve been more tired than usual” with a generic smile, the learner’s brain registers the mismatch. A 2011 study found that a gap between the realism of a character’s face and voice produces an uncanny response on its own [16].
The 2025 conversational agent review makes the same point at the design level: appearance, verbal behavior, nonverbal behavior, and social norms have to be designed as a single system [3].
This is where the avatar debate goes wrong. A higher-resolution face doesn’t fix mismatched affect. Spending more on visual realism without the behavioral layer makes the uncanny problem worse, not better.
The learning value sits in the range
The strongest virtual patients aren’t the ones with the most realistic skin. They’re the ones with the most useful behavioral range.
A virtual patient who can be guarded and then open up gives the learner something to practice. One who gets defensive, then softens after a nonjudgmental reflection, gives the learner a real communication problem. One who minimizes symptoms until the clinician asks the right follow-up teaches clinical persistence.
That’s where motivational interviewing gets rehearsed. That’s where empathy gets tested. That’s where clinical reasoning becomes observable.
The visual is the entry point. The behavioral range is the product.
This is the standard we apply to the virtual patients on the Xuron platform, and it’s also the standard we’d suggest buyers apply to anything they’re evaluating.
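One way to picture behavioral range is as a small state machine. The sketch below is illustrative only: the stances, learner moves, and transition table are hypothetical and are not taken from any of the cited studies or from a specific platform. It shows a patient who gets defensive, softens after a nonjudgmental reflection, and discloses a minimized symptom only once open and asked the right follow-up.

```python
from dataclasses import dataclass
from enum import Enum


class Stance(Enum):
    GUARDED = "guarded"
    DEFENSIVE = "defensive"
    OPEN = "open"


class Move(Enum):
    JUDGMENTAL_QUESTION = "judgmental question"
    REFLECTION = "nonjudgmental reflection"
    FOLLOW_UP = "targeted follow-up"


# Hypothetical transition table: (current stance, learner move) -> next stance.
# Any pair not listed leaves the stance unchanged.
TRANSITIONS = {
    (Stance.GUARDED, Move.JUDGMENTAL_QUESTION): Stance.DEFENSIVE,
    (Stance.GUARDED, Move.REFLECTION): Stance.OPEN,
    (Stance.DEFENSIVE, Move.REFLECTION): Stance.GUARDED,  # softens, not yet open
    (Stance.OPEN, Move.JUDGMENTAL_QUESTION): Stance.GUARDED,
}


@dataclass
class VirtualPatient:
    stance: Stance = Stance.GUARDED

    def respond_to(self, move: Move) -> bool:
        """Apply one learner move; return True if the patient discloses
        the symptom they had been minimizing."""
        # Disclosure needs both an open stance and the right follow-up.
        discloses = self.stance is Stance.OPEN and move is Move.FOLLOW_UP
        self.stance = TRANSITIONS.get((self.stance, move), self.stance)
        return discloses


patient = VirtualPatient()
moves = [
    Move.JUDGMENTAL_QUESTION,  # patient becomes defensive
    Move.FOLLOW_UP,            # too early: nothing is disclosed
    Move.REFLECTION,           # defensive -> guarded
    Move.REFLECTION,           # guarded -> open
    Move.FOLLOW_UP,            # now the follow-up lands
]
disclosures = [patient.respond_to(move) for move in moves]
```

A real virtual patient would be far richer than a four-row table, but even this toy version makes the point: the same follow-up question fails or succeeds depending on what the learner did before it.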
Standards push the same way
Healthcare simulation standards point in the same direction. The INACSL Healthcare Simulation Standards of Best Practice are built around purposeful design, prebriefing, psychological safety, assessment, feedback, and debriefing [17]. The Healthcare Simulation Dictionary, developed with AHRQ and SSH, gives the field shared terms [18].
The takeaway for buyers: avatar design isn’t separate from educational design. Better questions than “Does it look real?” include:
- Does the appearance match the seriousness of the case?
- Does the voice match the face?
- Does the face match the emotion?
- Does the body match the words?
- Does the patient’s behavior change based on the learner’s behavior?
- Does the system create enough psychological safety for practice?
- Does the feedback measure the behaviors the course is trying to change?
Those questions matter more than how the screenshot looks.
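For teams that want to log these questions during an evaluation, a minimal sketch follows. The criterion wording and the `unmet_criteria` helper are hypothetical shorthand for the list above, not a validated instrument.

```python
# Short labels for the seven buyer questions above (hypothetical wording).
CHECKLIST = (
    "appearance matches the seriousness of the case",
    "voice matches the face",
    "face matches the emotion",
    "body matches the words",
    "patient behavior changes with learner behavior",
    "enough psychological safety for practice",
    "feedback measures the targeted behaviors",
)


def unmet_criteria(answers: dict[str, bool]) -> list[str]:
    """Return the checklist items answered 'no' or left unanswered."""
    return [item for item in CHECKLIST if not answers.get(item, False)]


# Example: a platform that passes everything except voice-face match.
answers = dict.fromkeys(CHECKLIST, True)
answers["voice matches the face"] = False
gaps = unmet_criteria(answers)
```

Treating an unanswered question as unmet is a deliberate choice here: a criterion nobody checked should not count in the platform’s favor.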
What the research says overall
Realism can raise trust and seriousness in professional settings [1, 8]. Medium or unfinished realism creates discomfort because users expect human behavior and notice mismatches [1, 3]. In interaction-heavy settings, behavior matters more than appearance. Voice, timing, movement, emotional clarity, and feedback aren’t side details. They are the experience [3, 9, 15, 16].
Fidelity isn’t a single ladder. The useful question is whether the simulation has the right physical, emotional, contextual, and functional fidelity for the objective [4, 5]. Miller’s pyramid says the same thing in a different way: knowing, knowing how, showing, and doing are different layers [19]. Most avatar conversations stop at the first two.
A recent visual review of the uncanny valley in medical simulation makes a related point: fully integrated, production-ready realistic virtual humans for medical simulation are still an unsolved design challenge [20].
For most healthcare training, more realistic isn’t automatically better. The right default is a polished semi-realistic or game-quality avatar that feels clinically credible without pretending to be a real person. Use cartoons for psychological safety and onboarding. Use photorealism selectively where role credibility matters and the system can carry the higher expectation. Use hyperrealistic digital twins for narrow, disclosed, institution-backed cases.
The right avatar isn’t the most realistic one. It’s the one whose appearance, voice, timing, emotion, behavior, and feedback are real enough for the task without making a promise the system can’t keep.
Buyers should be asking less about how the avatar looks and more about what happens after the learner starts talking. That’s where the learning shows up. That’s where the next round of evidence is going to come from.
References
1. Tao Z, Liu Y, Qiu J, Li S. Impact of virtual avatar appearance realism on perceptual interaction experience: a network meta-analysis. Front Psychol. 2025;16:1624975. doi:10.3389/fpsyg.2025.1624975
2. Mori M. The uncanny valley. Energy. 1970;7(4):33-35. Translated by MacDorman KF, Kageki N. IEEE Robotics & Automation Magazine. 2012;19(2):98-100. doi:10.1109/MRA.2012.2192811
3. Cihodaru-Ștefanache Ș, Podina IR. The uncanny valley effect in embodied conversational agents: a critical systematic review of attractiveness, anthropomorphism, and uncanniness. Front Psychol. 2025;16:1625984. doi:10.3389/fpsyg.2025.1625984
4. Hamstra SJ, Brydges R, Hatala R, Zendejas B, Cook DA. Reconsidering fidelity in simulation-based training. Acad Med. 2014;89(3):387-392. doi:10.1097/ACM.0000000000000130
5. Pico J, Evain JN, Aron C, Martin G, Cruz-Panesso I, Georgescu LM, Tanoubi I. From realism to learner engagement: rethinking fidelity in simulation-based education. JMIR Med Educ. 2026;12:e84684. doi:10.2196/84684
6. Lucas GM, Gratch J, King A, Morency LP. It’s only a computer: virtual humans increase willingness to disclose. Comput Human Behav. 2014;37:94-100. doi:10.1016/j.chb.2014.04.043
7. Dávidovics A, Dávidovics K, Hillebrand P, Rendeki S, Németh T. Virtual patient simulation to enhance medical students’ clinical communication and decision-making skills: a pilot study. BMC Med Educ. 2025;26(1):171. doi:10.1186/s12909-025-08507-7
8. Baake J, Schmitt JB, Metag J. Balancing realism and trust: AI avatars in science communication. J Sci Commun. 2025;24(2):A03. doi:10.22323/2.24020203
9. Dobre GC, Wilczkowiak M, Gillies M, Pan X, Rintel S. Avatars in mixed-reality meetings: a longitudinal field study of realistic versus cartoon facial likeness effects on communication, task satisfaction, presence, and emotional perception. Int J Hum Comput Stud. 2025;205:103632. doi:10.1016/j.ijhcs.2025.103632
10. Haider SA, Prabha S, Gomez-Cabello CA, et al. Artificial intelligence physician avatars for patient education: a pilot study. J Clin Med. 2025;14(23):8595. doi:10.3390/jcm14238595
11. Cook DA, Erwin PJ, Triola MM. Computerized virtual patients in health professions education: a systematic review and meta-analysis. Acad Med. 2010;85(10):1589-1602. doi:10.1097/ACM.0b013e3181edfe13
12. Kononowicz AA, Woodham LA, Edelbring S, et al. Virtual patient simulations in health professions education: systematic review and meta-analysis by the Digital Health Education Collaboration. J Med Internet Res. 2019;21(7):e14676. doi:10.2196/14676
13. Zeng J, Qi W, Shen S, et al. Embracing the future of medical education with large language model-based virtual patients: scoping review. J Med Internet Res. 2025;27:e79091. doi:10.2196/79091
14. Tyrrell EG, Sandhu SK, Berry K, et al. Web-based AI-driven virtual patient simulator versus actor-based simulation for teaching consultation skills: multicenter randomized crossover study. JMIR Form Res. 2025;9:e71667. doi:10.2196/71667
15. Cook DA, Overgaard J, Pankratz VS, Del Fiol G, Aakre CA. Virtual patients using large language models: scalable, contextualized simulation of clinician-patient dialogue with feedback. J Med Internet Res. 2025;27:e68486. doi:10.2196/68486
16. Mitchell WJ, Szerszen KA, Lu AS, Schermerhorn PW, Scheutz M, MacDorman KF. A mismatch in the human realism of face and voice produces an uncanny valley. i-Perception. 2011;2(1):10-12. doi:10.1068/i0415
17. INACSL Standards Committee. Healthcare Simulation Standards of Best Practice®. Clin Simul Nurs. 2025. Available at https://www.inacsl.org
18. Lioce L, Lopreiato J, Anderson M, et al, eds; Terminology and Concepts Working Group. Healthcare Simulation Dictionary, Third Edition. Rockville, MD: Agency for Healthcare Research and Quality; January 2025. AHRQ Publication No. 24-0077.
19. Miller GE. The assessment of clinical skills/competence/performance. Acad Med. 1990;65(9 Suppl):S63-S67. doi:10.1097/00001888-199009000-00045
20. Grigoriou E, Kamarianakis M, Papagiannakis G. The uncanny valley in medical simulation-based training: a visual summary. arXiv. 2025. arXiv:2512.24240