There is growing interest in adapting large-scale language models usin...
The recent advance of self-supervised learning associated with the Trans...
While model compression is increasingly important because of large neura...
Even though fine-grained pruning techniques achieve a high compression r...
Various post-training uniform quantization methods have usually been stu...
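To make the idea of post-training uniform quantization concrete, the sketch below applies a single affine scale and zero-point to a pretrained weight tensor and measures the reconstruction error. The 8-bit setting, tensor shape, and function name are illustrative assumptions, not the setup of any particular method discussed above.

```python
import numpy as np

def uniform_quantize(w, num_bits=8):
    """Post-training affine uniform quantization of a weight tensor (num_bits <= 8)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (qmax - qmin) if w_max > w_min else 1.0
    zero_point = int(round(-w_min / scale))
    # Map floats onto the integer grid, then clamp to the representable range.
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.uint8)
    # Dequantize to inspect the error introduced by quantization.
    w_hat = (q.astype(np.float32) - zero_point) * scale
    return q, w_hat, scale, zero_point

# Example: quantize a random stand-in for a pretrained weight matrix.
w = np.random.randn(256, 256).astype(np.float32)
q, w_hat, scale, zp = uniform_quantize(w, num_bits=8)
print("mean absolute error:", np.abs(w - w_hat).mean())
```

Calibration data and per-channel scales are common refinements; the single global scale here is only the simplest variant.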
The Transformer is widely used in Neural Machine Translation (NMT). De...
Quantization based on the binary codes is gaining attention because each...
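As one common instantiation of binary-code quantization, the sketch below greedily approximates a weight vector as a sum of scaled {-1, +1} codes fitted to successive residuals. The greedy procedure, bit-width, and names are assumptions for illustration rather than the exact algorithm of any cited work.

```python
import numpy as np

def binary_code_quantize(w, num_bits=2):
    """Greedy multi-bit binary-code quantization: w ~= sum_i alpha_i * b_i."""
    residual = w.astype(np.float32).copy()
    alphas, codes = [], []
    for _ in range(num_bits):
        b = np.where(residual >= 0, 1.0, -1.0)  # {-1, +1} code for this bit
        alpha = np.abs(residual).mean()         # least-squares scale for b
        alphas.append(alpha)
        codes.append(b.astype(np.int8))
        residual -= alpha * b                   # fit the next bit to what remains
    w_hat = sum(a * c for a, c in zip(alphas, codes))
    return alphas, codes, w_hat

w = np.random.randn(4096).astype(np.float32)
alphas, codes, w_hat = binary_code_quantize(w, num_bits=3)
print("relative error:", np.linalg.norm(w - w_hat) / np.linalg.norm(w))
```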
The number of parameters in deep neural networks (DNNs) is rapidly incre...
Low-rank approximation is an effective model compression technique to no...
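A minimal sketch of low-rank approximation via truncated SVD, assuming a dense layer weight is factorized offline into two thin matrices; the rank, shapes, and function name are illustrative assumptions.

```python
import numpy as np

def low_rank_factorize(w, rank):
    """Truncated-SVD factorization: replace W (m x n) with A (m x r) @ B (r x n)."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # absorb singular values into the left factor
    b = vt[:rank, :]
    return a, b

w = np.random.randn(1024, 1024).astype(np.float32)
a, b = low_rank_factorize(w, rank=64)
print("compression ratio:", w.size / (a.size + b.size))
print("relative error:", np.linalg.norm(w - a @ b) / np.linalg.norm(w))
```

The compression comes from replacing one m-by-n matrix multiply with two much smaller ones when the rank is far below min(m, n).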
Model compression techniques, such as pruning and quantization, are beco...
Pruning is an efficient model compression technique to remove redundancy...
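For concreteness, here is a minimal sketch of unstructured magnitude pruning with a fixed global sparsity target; the threshold-and-mask scheme is one common variant, and the sparsity level and names are illustrative assumptions.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the `sparsity` fraction of weights with the smallest magnitudes."""
    k = int(w.size * sparsity)                  # number of weights to remove
    if k == 0:
        return w.copy(), np.ones(w.shape, dtype=bool)
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    mask = np.abs(w) > threshold                # keep only weights above the threshold
    return w * mask, mask

w = np.random.randn(512, 512).astype(np.float32)
w_pruned, mask = magnitude_prune(w, sparsity=0.9)
print("kept fraction:", mask.mean())
```

In practice the mask is usually reapplied during fine-tuning so the removed weights stay at zero.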
Model compression has been introduced to reduce the required hardware re...
Model compression has gained a lot of attention due to its ability to re...