A. Sophia Koepke

research

∙ 09/07/2023

Text-to-feature diffusion for audio-visual few-shot learning

Training deep learning models for video classification from audio-visual...

0 Otniel-Bogdan Mercea, et al. ∙

research

∙ 08/21/2023

Image-free Classifier Injection for Zero-Shot Classification

Zero-shot learning models achieve remarkable results on image classifica...

0 Anders Christensen, et al. ∙

research

∙ 07/20/2023

Addressing caveats of neural persistence with deep graph persistence

Neural Persistence is a prominent measure for quantifying neural network...

0 Leander Girrbach, et al. ∙

research

∙ 06/12/2023

Waffling around for Performance: Visual Classification with Random Words and Broad Concepts

The visual classification performance of vision-language models such as ...

0 Karsten Roth, et al. ∙

research

∙ 04/06/2023

Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval

Cross-modal retrieval methods are the preferred tool to search databases...

0 Jae Myung Kim, et al. ∙

research

∙ 10/25/2022

PlanT: Explainable Planning Transformers via Object-Level Representations

Planning an optimal route in a complex environment requires efficient re...

0 Katrin Renz, et al. ∙

research

∙ 07/20/2022

Temporal and cross-modal attention for audio-visual zero-shot learning

Audio-visual generalised zero-shot learning for video classification req...

5 Otniel-Bogdan Mercea, et al. ∙

research

∙ 04/05/2022

CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations

Providing explanations in the context of Visual Question Answering (VQA)...

21 Leonard Salewski, et al. ∙

research

∙ 03/07/2022

Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language

Learning to classify video data from classes not included in the trainin...

7 Otniel-Bogdan Mercea, et al. ∙

research

∙ 12/17/2021

Audio Retrieval with Natural Language Queries: A Benchmark Study

The objectives of this work are cross-modal text-audio and audio-text re...

0 A. Sophia Koepke, et al. ∙

research

∙ 05/05/2021

Audio Retrieval with Natural Language Queries

We consider the task of retrieving audio using free-form natural languag...

0 Andreea-Maria Oncescu, et al. ∙

research

∙ 05/04/2021

Where and When: Space-Time Attention for Audio-Visual Explanations

Explaining the decision of a multi-modal decision-maker requires to dete...

10 Yanbei Chen, et al. ∙

research

∙ 04/22/2021

Distilling Audio-Visual Knowledge by Compositional Contrastive Learning

Having access to multi-modal cues (e.g. vision and audio) empowers some ...

8 Yanbei Chen, et al. ∙

research

∙ 10/28/2019

Self-supervised learning of class embeddings from video

This work explores how to use self-supervised learning on videos to lear...

41 Olivia Wiles, et al. ∙

research

∙ 08/21/2018

Self-supervised learning of a facial attribute embedding from video

We propose a self-supervised framework for learning facial attributes by...

4 Olivia Wiles, et al. ∙

research

∙ 07/27/2018

X2Face: A network for controlling face generation by using images, audio, and pose codes

The objective of this paper is a neural network model that controls the ...

4 Olivia Wiles, et al. ∙
