Idan Schwartz

research

∙ 05/22/2023

AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation

In recent years, image generation has shown a great leap in performance,...

Guy Yariv, et al.

research

∙ 03/30/2023

Discriminative Class Tokens for Text-to-Image Diffusion Models

Recent advances in text-to-image diffusion models have enabled the gener...

Idan Schwartz, et al.

research

∙ 10/21/2022

Describing Sets of Images with Textual-PCA

We seek to semantically describe a set of images, capturing both the att...

Oded Hupert, et al.

research

∙ 07/22/2022

Zero-Shot Video Captioning with Evolving Pseudo-Tokens

We introduce a zero-shot video captioning method that employs two frozen...

Yoad Tewel, et al.

research

∙ 06/02/2022

Optimizing Relevance Maps of Vision Transformers Improves Robustness

It has been observed that visual classification models often rely mostly...

Hila Chefer, et al.

research

∙ 12/09/2021

Latent Space Explanation by Intervention

The success of deep neural nets heavily relies on their ability to encod...

Itai Gat, et al.

research

∙ 11/29/2021

Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

Recent text-to-image matching models apply contrastive learning to large...

Yoad Tewel, et al.

research

∙ 10/27/2021

Perceptual Score: What Data Modalities Does Your Model Perceive?

Machine learning advances in the last decade have relied significantly o...

Itai Gat, et al.

research

∙ 10/21/2021

Video and Text Matching with Conditioned Embeddings

We present a method for matching a text sentence from a given corpus to ...

Ameen Ali, et al.

research

∙ 08/04/2021

Towards Coherent Visual Storytelling with Ordered Image Attention

We address the problem of visual storytelling, i.e., generating a story ...

Tom Braude, et al.

research

∙ 04/15/2021

Ensemble of MRR and NDCG models for Visual Dialog

Assessing an AI agent that can converse in human language and understand...

Idan Schwartz, et al.

research

∙ 10/21/2020

Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies

Many recent datasets contain a variety of different data modalities, for...

Itai Gat, et al.

research

∙ 04/11/2019

Factor Graph Attention

Dialog is an effective way to exchange information, but subtle details a...

Idan Schwartz, et al.

research

∙ 04/11/2019

A Simple Baseline for Audio-Visual Scene-Aware Dialog

The recently proposed audio-visual scene-aware dialog task paves the way...

Idan Schwartz, et al.

research

∙ 11/12/2017

High-Order Attention Models for Visual Question Answering

The quest for algorithms that enable cognitive abilities is an important...

Idan Schwartz, et al.

