Abstract: The goal of image-text Multimodal Named Entity Recognition (MNER) is to identify and classify entities such as individuals, organisations, and places across text and image ...