Quantization
1039 papers with code • 10 benchmarks • 18 datasets
Quantization is a promising technique for reducing the computation cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).
Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
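To make the float-to-fixed-point mapping above concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in NumPy; the helper names are illustrative and not taken from any particular library.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization of a float32 array to int8."""
    scale = np.abs(x).max() / 127.0                      # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 representation."""
    return q.astype(np.float32) * scale

x = np.random.randn(4, 8).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize_int8(q, scale)
print("max abs error:", np.abs(x - x_hat).max())
```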
Most implemented papers
FastText.zip: Compressing text classification models
We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.
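A key ingredient of this compression approach is product quantization (PQ) of the embedding matrix: each embedding is split into sub-vectors, and each sub-vector is replaced by the index of its nearest k-means centroid. A minimal NumPy/scikit-learn sketch of that idea (function names are illustrative, not from the paper's code):

```python
import numpy as np
from sklearn.cluster import KMeans

def pq_encode(emb, n_subvectors=4, n_centroids=256):
    """Product-quantize an embedding matrix: each row is split into
    n_subvectors chunks, and each chunk is replaced by a centroid index."""
    n, d = emb.shape
    d_sub = d // n_subvectors
    codebooks, codes = [], []
    for s in range(n_subvectors):
        chunk = emb[:, s * d_sub:(s + 1) * d_sub]
        km = KMeans(n_clusters=n_centroids, n_init=4).fit(chunk)
        codebooks.append(km.cluster_centers_)
        codes.append(km.labels_.astype(np.uint8))      # one byte per sub-vector
    return codebooks, np.stack(codes, axis=1)

def pq_decode(codebooks, codes):
    """Reconstruct approximate embeddings from the stored centroid indices."""
    return np.hstack([codebooks[s][codes[:, s]] for s in range(len(codebooks))])

emb = np.random.randn(1000, 64).astype(np.float32)
codebooks, codes = pq_encode(emb)
approx = pq_decode(codebooks, codes)
```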
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes.
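The integer-arithmetic-only scheme this line of work popularized represents a real value r as r ≈ scale · (q − zero_point), with a floating-point scale and an integer zero point chosen so that the representable range contains zero. A minimal sketch of that affine quantizer (variable names are illustrative):

```python
import numpy as np

def affine_quantize(x, num_bits=8):
    """Asymmetric quantization: r ~= scale * (q - zero_point), q in [0, 2^b - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    rmin, rmax = min(float(x.min()), 0.0), max(float(x.max()), 0.0)  # range must contain 0
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def affine_dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.uniform(-1.0, 3.0, size=(2, 5)).astype(np.float32)
q, s, z = affine_quantize(x)
print(affine_dequantize(q, s, z))
```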
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler.
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
To address this limitation, we introduce "deep compression", a three-stage pipeline of pruning, trained quantization, and Huffman coding that together reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
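The "trained quantization" stage of this pipeline is a weight-sharing step: the weights that survive pruning are clustered with k-means, and each weight stores only a small cluster index. A minimal sketch of that step using scikit-learn (the paper additionally fine-tunes the shared centroids by gradient descent, which is omitted here):

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_weight_sharing(weights, n_clusters=16):
    """Cluster the nonzero weights of a pruned layer into n_clusters shared
    values (4-bit indices for 16 clusters) and return the quantized layer."""
    mask = weights != 0                        # preserve the pruning pattern
    nonzero = weights[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=4).fit(nonzero)
    shared = km.cluster_centers_.flatten()     # codebook of shared weight values
    quantized = np.zeros_like(weights)
    quantized[mask] = shared[km.labels_]
    return quantized, shared

w = np.random.randn(64, 64).astype(np.float32)
w[np.abs(w) < 0.5] = 0.0                       # crude magnitude pruning
w_q, codebook = kmeans_weight_sharing(w)
```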
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
We propose DoReFa-Net, a method to train convolutional neural networks that have low bitwidth weights and activations using low bitwidth parameter gradients.
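The core operator in DoReFa-Net is a k-bit quantizer for values in [0, 1], quantize_k(r) = round((2^k − 1)·r) / (2^k − 1), applied to transformed weights and clipped activations. A minimal NumPy sketch of that operator and the weight transformation (the straight-through gradient estimator used during training is omitted):

```python
import numpy as np

def quantize_k(r, k):
    """Quantize r in [0, 1] to k bits: round to the nearest of 2^k levels."""
    n = 2 ** k - 1
    return np.round(r * n) / n

def dorefa_weights(w, k):
    """DoReFa-style weight quantization: squash weights into [0, 1] with tanh,
    quantize to k bits, then map back to [-1, 1]."""
    t = np.tanh(w)
    r = t / (2 * np.abs(t).max()) + 0.5
    return 2 * quantize_k(r, k) - 1

w = np.random.randn(3, 3).astype(np.float32)
print(dorefa_weights(w, k=2))
```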
Billion-scale similarity search with GPUs
Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures.
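The system described in this paper is distributed as the FAISS library; below is a minimal CPU usage sketch with an exact (non-quantized) index. Available index types and GPU support depend on how faiss is installed; quantization-based indexes such as IVF-PQ follow the same add/search pattern.

```python
import numpy as np
import faiss  # library released with this work

d, nb, nq, k = 64, 10000, 5, 4
xb = np.random.random((nb, d)).astype("float32")   # database vectors
xq = np.random.random((nq, d)).astype("float32")   # query vectors

index = faiss.IndexFlatL2(d)          # exact L2 search, no compression
index.add(xb)                         # add database vectors
distances, ids = index.search(xq, k)  # k nearest neighbors per query
print(ids)
```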
QLoRA: Efficient Finetuning of Quantized LLMs
Our best model family, which we name Guanaco, outperforms all previous openly released models on the Vicuna benchmark, reaching 99.3% of the performance level of ChatGPT while only requiring 24 hours of finetuning on a single GPU.
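QLoRA freezes the base model in 4-bit NF4 precision (with double quantization) and trains LoRA adapters on top. A minimal loading sketch using the Hugging Face transformers/bitsandbytes integration; the model id is a placeholder and the exact option names depend on library versions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization, following the QLoRA recipe
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-7b-model",          # placeholder model id
    quantization_config=bnb_config,
)
# LoRA adapters (e.g., via the peft library) are then attached and trained
# on top of the frozen 4-bit base model.
```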
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures.
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
In this paper, we address this challenge and propose GPTQ, a new one-shot weight quantization method based on approximate second-order information that is both highly accurate and highly efficient.
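GPTQ itself solves a layer-wise reconstruction problem using approximate second-order (Hessian) information; the sketch below is only the simple round-to-nearest, per-channel baseline that the paper compares against, not the GPTQ algorithm.

```python
import numpy as np

def rtn_quantize_weights(W, num_bits=4):
    """Round-to-nearest, per-output-channel, asymmetric weight quantization.
    This is the naive baseline; GPTQ improves on it by compensating remaining
    weights using approximate second-order information."""
    qmax = 2 ** num_bits - 1
    wmin = W.min(axis=1, keepdims=True)
    wmax = W.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / qmax
    zero = np.round(-wmin / scale)
    Q = np.clip(np.round(W / scale) + zero, 0, qmax)
    return (Q - zero) * scale            # dequantized weights, for error measurement

W = np.random.randn(8, 16).astype(np.float32)
W_q = rtn_quantize_weights(W)
print("mean abs error:", np.abs(W - W_q).mean())
```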
GLM-130B: An Open Bilingual Pre-trained Model
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.