Multimodal Neural Machine Translation: A Survey of the State of the Art

Yi Feng, Chuanyi Li, Jiatong He, Zhenyu Hou, and Vincent Ng.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025.


Abstract

Multimodal neural machine translation (MNMT) has received increasing attention due to its widespread applications in fields such as cross-border e-commerce and cross-border social media platforms. The task aims to integrate other modalities, such as the visual modality, with textual data to enhance translation performance. We survey the major milestones in MNMT research, discussing key challenges and promising research directions for advancing MNMT.

BibTeX entry

@InProceedings{Feng+etal:25b,
  author = {Yi Feng and Chuanyi Li and Jiatong He and Zhenyu Hou and Vincent Ng},
  title = {Multimodal Neural Machine Translation: A Survey of the State of the Art},
  booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  year = 2025}