Soravit Changpinyo

research

∙ 05/17/2023

What You See is What You Read? Improving Text-Image Alignment Evaluation

Automatically determining whether a text and a corresponding image are s...

0 Michal Yarom, et al. ∙

research

∙ 02/23/2023

Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?

Large language models have demonstrated an emergent capability in answer...

0 Yang Chen, et al. ∙

research

∙ 02/22/2023

Connecting Vision and Language with Video Localized Narratives

We propose Video Localized Narratives, a new form of multimodal video an...

0 Paul Voigtlaender, et al. ∙

research

∙ 12/19/2022

MetaCLUE: Towards Comprehensive Visual Metaphors Research

Creativity is an indispensable part of human cognition and also an inher...

0 Arjun R Akula, et al. ∙

research

∙ 09/14/2022

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Effective scaling and a flexible task interface enable large language mo...

6 Xi Chen, et al. ∙

research

∙ 09/12/2022

PreSTU: Pre-Training for Scene-Text Understanding

The ability to read and reason about texts in an image is often lacking ...

0 Jihyung Kil, et al. ∙

research

∙ 09/12/2022

Towards Multi-Lingual Visual Question Answering

Visual Question Answering (VQA) has been primarily studied through the l...

0 Soravit Changpinyo, et al. ∙

research

∙ 05/04/2022

All You May Need for VQA are Image Captions

Visual Question Answering (VQA) has benefited from increasingly sophisti...

0 Soravit Changpinyo, et al. ∙

research

∙ 02/17/2021

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

The availability of large-scale image captioning and visual question ans...

0 Soravit Changpinyo, et al. ∙

research

∙ 02/17/2021

A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection

Object frequencies in daily scenes follow a long-tailed distribution. Ma...

14 Cheng Zhang, et al. ∙

research

∙ 02/09/2021

Telling the What while Pointing the Where: Fine-grained Mouse Trace and Language Supervision for Improved Image Retrieval

Existing image retrieval systems use text queries to provide a natural a...

0 Soravit Changpinyo, et al. ∙

research

∙ 09/10/2020

Weakly Supervised Content Selection for Improved Image Captioning

Image captioning involves identifying semantic concepts in the scene and...

3 Khyathi Raghavi Chandu, et al. ∙

research

∙ 12/06/2019

Connecting Vision and Language with Localized Narratives

We propose Localized Narratives, an efficient way to collect image capti...

16 Jordi Pont-Tuset, et al. ∙

research

∙ 09/04/2019

Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering

Object detection plays an important role in current solutions to vision ...

0 Soravit Changpinyo, et al. ∙

research

∙ 12/16/2018

Classifier and Exemplar Synthesis for Zero-Shot Learning

Zero-shot learning (ZSL) enables solving a task without the need to see ...

0 Soravit Changpinyo, et al. ∙

research

∙ 08/13/2018

Multi-Task Learning for Sequence Tagging: An Empirical Study

We study three general multi-task learning (MTL) approaches on 11 sequen...

0 Soravit Changpinyo, et al. ∙

research

∙ 02/21/2017

The Power of Sparsity in Convolutional Neural Networks

Deep convolutional networks are well-known for their high computational ...

0 Soravit Changpinyo, et al. ∙

research

∙ 05/26/2016

Predicting Visual Exemplars of Unseen Classes for Zero-Shot Learning

Leveraging class semantic descriptions and examples of known objects, ze...

0 Soravit Changpinyo, et al. ∙

research

∙ 03/02/2016

Synthesized Classifiers for Zero-Shot Learning

Given semantic descriptions of object classes, zero-shot learning aims t...

0 Soravit Changpinyo, et al. ∙


