Text-to-image generation (TTI) refers to the use of models that can ...
ChatGPT-like models have revolutionized various applications in artifici...
In the complex domain of large language models (LLMs), striking a balanc...
This study examines the impact of optimizing the Stable Diffusion (SD) g...
Post-training quantization (PTQ) has recently been shown to be a promising ...
The field of natural language processing (NLP) has made significant stri...
Improving the deployment efficiency of transformer-based language models...
Recent advances in deep learning models come at the price of formidable ...
Large-scale transformer models have become the de facto architectures fo...
Graph Neural Networks (GNNs) are a promising approach for applications wi...
How to efficiently serve ever-larger trained natural language models in ...
Extreme compression, particularly ultra-low bit precision (binary/ternar...
As the training of giant dense models hits the boundary on the availabil...
We demonstrate that, hidden within one-layer randomly weighted neural networks, ...
Most existing Vision-and-Language (V&L) models rely on pre-trained vis...
Pruning is an effective method to reduce the memory footprint and comput...
The increasing size of neural network models has been critical for impro...
End-to-end neural network models achieve improved performance on various...
As soon as abstract mathematical computations were adapted to computatio...
Pruning is an effective method to reduce the memory footprint and FLOPs ...
Transformer-based models, like BERT and RoBERTa, have achieved state-of-the-art ...
Quantization is one of the key techniques used to make Neural Networks (...
Fully quantized training (FQT), which uses low-bitwidth hardware by quan...
Phrase localization is a task that studies the mapping from textual phra...
Federated learning promises to use the computational power of edge devic...
We introduce AdaHessian, a second order stochastic optimization algorith...
The standard normalization method for neural network (NN) models used in...
Quantization is a promising approach for reducing the inference time and...
We present PyHessian, a new scalable framework that enables fast computa...
Quantization is an effective method for reducing memory footprint and in...
Transformer-based architectures have become de facto models used for a r...
It has been observed that residual networks can be viewed as the explici...
We regard pre-trained residual networks (ResNets) as nonlinear systems a...
It has been demonstrated that very simple attacks can fool highly-sophis...
In stochastic optimization, large batch training can leverage parallel r...
In many applications, it is important to reconstruct a fluid flow field,...
Deep Neural Networks are quite vulnerable to adversarial perturbations. ...
Optimal parameter initialization remains a crucial problem for neural ne...
Increasing the mini-batch size for stochastic gradient descent offers si...
Stochastic Gradient Descent (SGD) methods using randomly selected batche...
Large batch size training of Neural Networks has been shown to incur acc...