Talk: Multimodal Learning: Integrating Vision and Language
What is Multimodal Machine Learning and how far are we in bringing vision and language together?

Multimodal Machine Learning develops models for different input modalities, e.g. different sensory inputs such as images, text, speech, humidity, and pressure.
In this talk, we will introduce the concept of multimodality and then focus on two modalities, vision and language, presenting state-of-the-art deep neural models for processing multimodal vision and language inputs.
Info
Day: 2020-11-19
Start: 19:45
Duration: 00:30
Room: Clotilde Tambroni
Track: Computational Linguistics
Language: en
Speakers
Letitia Parcalabescu