Seeing the Sound: Multilingual Lip Sync for Real-Time Face-to-Face Translation DOI Creative Commons
Amirkia Rafiei Oskooei, Mehmet S. Aktaş, Mustafa Keleş, et al.

Computers, Journal year: 2024, No. 14(1), p. 7

Published: Dec. 28, 2024

Imagine a future where language is no longer a barrier to real-time conversations, enabling instant and lifelike communication across the globe. As cultural boundaries blur, the demand for seamless multilingual communication has become a critical technological challenge. This paper addresses the lack of robust solutions for face-to-face translation, particularly for low-resource languages, by introducing a comprehensive framework that not only translates but also replicates voice nuances and synchronized facial expressions. Our research tackles the primary challenge of achieving accurate lip synchronization across culturally diverse languages, filling a significant gap in the literature by evaluating the generalizability of lip sync models beyond English. Specifically, we develop a novel evaluation framework combining quantitative lip sync error metrics and qualitative assessments by human observers. This framework is applied to assess two state-of-the-art lip sync models with different architectures for Turkish, Persian, and Arabic using a newly collected dataset. Based on these findings, we propose and implement a modular system that integrates language-agnostic neural networks to deliver a fully functional face-to-face translation experience. Inference Time Analysis shows that this system achieves highly realistic, face-translated talking heads in real time, with a throughput as low as 0.381 s. This transformative framework is primed for deployment in immersive environments such as VR/AR, Metaverse ecosystems, and advanced video conferencing platforms. It offers substantial benefits to developers and businesses aiming to build next-generation systems and applications. While this work focuses on three languages, its design allows scalability to additional languages. However, further testing in broader linguistic contexts is required to confirm its universal applicability, paving the way for a more interconnected and inclusive world where language ceases to hinder connection.

Language: English

Cross-national differences in drivers’ eye contact and traffic violations: An online survey across 20 countries DOI Creative Commons
Joost de Winter, V. Onkhar, Dimitra Dodou, et al.

Transportation Research Part F: Traffic Psychology and Behaviour, Journal year: 2025, No. 109, pp. 711-725

Published: Jan. 8, 2025

Language: English

Cited by: 1

ChatGPT and academic work: new psychological phenomena DOI Creative Commons
Joost de Winter, Peter A. Hancock, Yke Bauke Eisma, et al.

AI & Society, Journal year: 2025, No. unknown

Published: Mar. 17, 2025

Language: English

Cited by: 1

“Foundation Models for Research: a Matter of Trust?” DOI Creative Commons

Koen Bruynseels, Lotte Asveld, Jeroen van den Hoven, et al.

Artificial Intelligence in the Life Sciences, Journal year: 2025, No. unknown, p. 100126

Published: Feb. 1, 2025

Language: English

Cited by: 0

ChatGPT-4o and 4o1 Preview as Dietary Support Tools in a Real-World Medicated Obesity Program: A Prospective Comparative Analysis DOI Open Access
Louis Talay, Leif Lagesen, A. W. C. Yip, et al.

Healthcare, Journal year: 2025, No. 13(6), p. 647

Published: Mar. 16, 2025

Background/Objectives: Clinicians are becoming increasingly interested in the use of large language models (LLMs) in obesity services. While most experts agree that LLM integration would increase access to obesity care and its efficiency, many remain skeptical of their scientific accuracy and capacity to convey human empathy. Recent studies have shown that ChatGPT-3 is capable of emulating human dietitian responses to a range of basic dietary questions. Methods: This study compared the responses of two ChatGPT-4o models with those from human dietitians across 10 complex questions (5 broad; 5 narrow) derived from patient–clinician interactions within a real-world medicated digital weight loss service. Results: Investigators found that neither ChatGPT-4o nor ChatGPT-4o1 preview were statistically outperformed (p < 0.05) by human dietitians on any of the study’s questions. The same finding was made when scores were aggregated across the ten questions on the following four individual criteria: correctness, comprehensibility, empathy/relatability, and actionability. Conclusions: These results provide preliminary evidence that advanced LLMs may be able to play a significant supporting role in medicated obesity services. Research in other contexts is needed before stronger conclusions can be drawn about LLM lifestyle coaching and whether such initiatives increase access to care.

Language: English

Cited by: 0

Big Data Versus Big GPU: Evolving Requirements and Governance Dynamics of AI Training Data DOI
Le Cheng, Xuan Gong, Yun Zhao, et al.

Deleted Journal, Journal year: 2025, No. unknown

Published: Apr. 18, 2025

Pre-trained large language models (LLMs), epitomized by ChatGPT, have leveraged a cornucopia of “big data” to attain substantial leaps in artificial intelligence (AI). Whereas the diminishing returns from pre-training and the depletion of available training data have become evident, the post-training scaling law bolstered by “big GPU” has surfaced as an overriding strategy. Since 2024, post-trained models exemplified by o1 and DeepSeek-R1 have been widely acclaimed for their successes in logic-intensive fields like advanced scientific problem-solving, serving as a bellwether for artificial general intelligence (AGI). Driven by the two cardinal elements of computing power and task-specific datasets, post-training processes exhibit more erratic and uncontrollable tendencies, which may be a menace to core societal domains and precipitate systemic friction vis-à-vis the existing governance derived from pre-trained models. At this watershed moment, this article aims to conduct a comprehensive comparison of the two paradigms and to further develop cogent and favorable responses to mitigate emerging risks. Consequently, security must be established as a prerequisite for AI development, and a blended, lifecycle-based framework can be introduced for the metamorphosis toward “bigger models”.

Language: English

Cited by: 0

Predicting surface roughness in dry machining of AISI H13 steel: a comparison of machine learning and GPT-based models with ceramic cutting tool DOI
Alex Fernandes de Souza, Filipe Alves Neto Verri, Paulo Henrique da Silva Campos, et al.

The International Journal of Advanced Manufacturing Technology, Journal year: 2025, No. unknown

Published: May 10, 2025

Language: English

Cited by: 0

Evaluating the Performance of Reasoning Large Language Models on Japanese Radiology Board Examination Questions DOI
Takeshi Nakaura, Hiroto Takamure, Naoki Kobayashi, et al.

Academic Radiology, Journal year: 2025, No. unknown

Published: May 1, 2025

Language: English

Cited by: 0
