Partial Label Learning (PLL) is a type of weakly supervised learning whe...
Medical image analysis using deep learning is often challenged by limite...
Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal cancer in whi...
Liver tumor segmentation and classification are important tasks in compu...
Advances in deep learning have greatly improved structure prediction of
...
A key puzzle in search, ads, and recommendation is that the ranking mode...
The convergence of text, visual, and audio data is a key step towards
hu...
Real-world medical image segmentation has tremendous long-tailed complex...
Code-switching speech refers to a means of expression by mixing two or m...
Partial label learning (PLL) is a typical weakly supervised learning pro...
Knowledge distillation has been widely adopted in a variety of tasks and...
This paper presents a new method for end-to-end Video Question Answering...
Human readers or radiologists routinely perform full-body multi-organ
mu...
Lymph node (LN) metastasis status is one of the most critical prognostic...
Several trade-offs need to be balanced when employing monaural speech
se...
The accurate protein-ligand binding affinity prediction is essential in ...
Recent years have witnessed significant success in Gradient Boosting Dec...
The deep convolutional neural networks (CNNs) using attention mechanism ...
Video question answering (VideoQA) is challenging given its multimodal
c...
Human intelligence is multimodal; we integrate visual, linguistic, and
a...
This technical note describes the recent updates of Graphormer, includin...
This technical note describes the recent updates of Graphormer, includin...
The sparsely-gated Mixture of Experts (MoE) can magnify a network capaci...
Automated visual understanding of our diverse and open world demands com...
The advances in attention-based encoder-decoder (AED) networks have brou...
Video question answering (VideoQA) is challenging given its multimodal
c...
Spoken Language Understanding (SLU) is composed of two subtasks: intent
...
The Transformer model is widely used in natural language processing for
...
Modern Automatic Speech Recognition (ASR) systems can achieve high
perfo...
Recently, universal neural machine translation (NMT) with shared
encoder...
End-to-end (E2E) spoken language understanding (SLU) can infer semantics...
In this study, we try to address the problem of leveraging visual signal...
Cross-lingual Summarization (CLS) aims at producing a summary in the tar...
Span extraction is an essential problem in machine reading comprehension...
Analysis of chemical graphs is a major research topic in computational
m...
As a spontaneous expression of emotion on face, micro-expression is rece...
Given a large number of online services on the Internet, from time to ti...
Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancer...
Accurate and automated tumor segmentation is highly desired since it has...
Utilizing computed tomography (CT) images to quickly estimate the severi...
Modern Automatic Speech Recognition (ASR) systems can achieve high
perfo...
In this work, we propose to study the utility of different meta-graphs, ...
Text-rich heterogeneous information networks (text-rich HINs) are ubiqui...
The automatic recognition of micro-expression has been boosted ever sinc...
As one type of complex networks widely-seen in real-world application,
h...
Heterogeneous information networks (HINs) are ubiquitous in real-world
a...
Recently, integrated optics has gained interest as a hardware platform f...
Heterogeneous information networks (HINs) are ubiquitous in real-world
a...
Gradient boosting using decision trees as base learners, so called Gradi...
Multi-view networks are ubiquitous in real-world applications. In order ...