Jiabo Ye

Featured Co-authors

Wei Wang
492 publications
Fei Huang
134 publications
Jingren Zhou
90 publications
Ming Yan
79 publications
Ji Zhang
74 publications
Luo Si
67 publications
Liang He
59 publications
Chenliang Li
55 publications
Songfang Huang
45 publications
Haiyang Xu
31 publications
Xuan Wu
26 publications

research

∙ 07/04/2023

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Document understanding refers to automatically extract, analyze and comp...

Jiabo Ye, et al.

research

∙ 06/07/2023

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

To promote the development of Vision-Language Pre-training (VLP) and mul...

Haiyang Xu, et al.

research

∙ 04/27/2023

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality

Large language models (LLMs) have demonstrated impressive zero-shot abil...

Qinghao Ye, et al.

research

∙ 02/01/2023

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

Recent years have witnessed a big convergence of language, vision, and m...

Haiyang Xu, et al.

research

∙ 05/24/2022

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections

Large-scale pretrained foundation models have been an emerging paradigm ...

Chenliang Li, et al.

research

∙ 03/29/2022

Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding

Visual grounding focuses on establishing fine-grained alignment between ...

Jiabo Ye, et al.

research

∙ 10/07/2021

Inferring Substitutable and Complementary Products with Knowledge-Aware Path Reasoning based on Dynamic Policy Network

Inferring the substitutable and complementary products for a given produ...

Zijing Yang, et al.
