In this work, we propose a new transformer-based regularization to bette...
We present TransNormerLLM, the first linear attention-based Large Langua...
Length extrapolation has attracted considerable attention recently since...
Relative positional encoding is widely used in vanilla and linear transf...
Cloud providers can greatly benefit from accurate workload prediction. H...
The reconstruction of quantum states from experimental measurements, oft...
Graded labels are ubiquitous in real-world learning-to-rank applications...
The distillation of ranking models has become an important topic in both...
Sequence modeling has important applications in natural language process...
Query expansion is a widely used technique to improve the recall of sear...
We explore a new task for audio-visual-language modeling called fine-gra...
Domain adaptation aims to transfer the knowledge acquired by models trai...
As Learning-to-Rank (LTR) approaches primarily seek to improve ranking q...
Linear transformers aim to reduce the quadratic space-time complexity of...
Vision Transformers have achieved impressive performance in video classi...
Recently, substantial progress has been made in text ranking based on pr...
Retrieval augmentation has shown promising improvements in different tas...
In this paper, we study the problem of recovering a low-rank matrix from...
Recently, numerous efficient Transformers have been proposed to reduce t...
Tensor train decomposition is widely used in machine learning and quantu...
Vision transformers have shown great success on numerous computer vision...
Federated learning (FL) is a promising technical support to the vision o...
State-of-the-art neural models typically encode document-query pairs usi...
Transformer has shown great successes in natural language processing, co...
Multiclass classification (MCC) is a fundamental machine learning proble...
We introduce Born Again neural Rankers (BAR) in the Learning to Rank (LT...
State-of-the-art models in natural language processing rely on separate ...
In the era of pre-trained language models, Transformers are the de facto...
This paper proposes Omnidirectional Representations from Transformers (O...
By pushing computation, cache, and network control to the edge, mobile e...
The need for medical image encryption is increasingly pronounced, for ex...
The LSTM network was proposed to overcome the difficulty in learning lon...
Piecewise Aggregate Approximation (PAA) is a competitive basic dimension...
User information needs vary significantly across different tasks, and th...