Video question answering with limited supervision

Defense type
Thesis
Starting date
End date
Location
IRISA Rennes
Room
Aurigny
Speaker
Deniz ENGIN (Linkmedia)
Theme

Video content has grown significantly in volume and diversity in the digital era, and this growth has highlighted the need for advanced video understanding technologies. Driven by this need, this thesis explores the semantic understanding of videos, leveraging multiple perceptual modalities, as human cognition does, and learning efficiently with limited supervision, as humans can. The thesis focuses specifically on video question answering, one of the main video understanding tasks. Our first contribution addresses long-range video question answering, which requires understanding extended video content. While recent approaches rely on human-generated external sources, we process raw data to generate video summaries. Our second contribution explores zero-shot and few-shot video question answering, aiming to learn efficiently from limited data. We leverage the knowledge of existing large-scale models by addressing the challenges of adapting pre-trained models to limited data. We demonstrate that these contributions significantly enhance the capabilities of multimodal video question-answering systems, particularly when human-annotated labeled data is limited or unavailable.

[ATTENTION, under the VIGIPIRATE plan: public access to this defense requires mandatory prior registration with lydie [*] mabilatinria [*] fr (aurelie[dot]patier[at]inria[dot]fr). Access will not be granted without prior registration. In addition, visitors must not carry luggage or bags.]

Composition of the jury
Yannis AVRITHIS - Principal Investigator, IARAI - Thesis co-supervisor
Luce MORIN - Professor, INSA Rennes - Chair
Josef SIVIC - Distinguished Researcher, Czech Technical University - Examiner
Karteek ALAHARI - Research Director, Inria Grenoble - Examiner
Ivan LAPTEV - Visiting Professor, MBZUAI - Reviewer
Matthieu CORD - Professor, Sorbonne University - Reviewer