Published: Jan. 1, 2024
Language: English
Published: Jan. 1, 2024
Language: English
ACM Computing Surveys, Year: 2025, Issue: unknown
Published: Jan. 18, 2025
The multimodal interplay of the five fundamental senses (sight, hearing, smell, taste, and touch) provides humans with superior environmental perception and learning skills. Adapted from the human perceptual system, multimodal machine learning tries to incorporate different forms of input, such as image, audio, and text, and to determine their connections through joint modeling. As multimodal learning is one of the future development directions of artificial intelligence, it is necessary to summarize its progress. In this paper, we start from the form of multimodal combination and provide a comprehensive survey of the emerging subject of multimodal learning, covering representative research approaches, the most recent advancements, and applications. Specifically, this paper analyzes the relationship between modalities in detail and sorts out the key issues and application scenarios. Besides, we have thoroughly reviewed the state-of-the-art methods and datasets covered in multimodal learning research. We then identify the substantial challenges and potential developing directions in the field. Finally, given the comprehensive nature of this survey, both modality-specific and task-specific researchers can benefit from it and advance the field.
Language: English
Cited by: 1
Expert Systems with Applications, Year: 2025, Issue: unknown, P. 127360 - 127360
Published: Mar. 1, 2025
Language: English
Cited by: 0
Published: Jan. 1, 2024
Language: English
Cited by: 0
Published: Jan. 1, 2024
Language: English
Cited by: 0