tinyML Asia 2021 Dongsoo Lee: Extremely low-bit quantization for Transformers
tinyML Asia 2021
Extremely low-bit quantization for Transformers
DongSoo LEE 이동수, Executive Officer, NAVER CLOVA
Deploying the widely used Transformer architecture is challenging because of its heavy computation load and memory overhead during inference, especially when the target device has limited computational resources, as mobile and edge devices do. Quantization is an effective technique for addressing these challenges. Our analysis shows that, for a given number of quantization bits, each block of the Transformer contributes differently to model accuracy and to inference computation. Moreover, even inside an embedding block, individual words contribute to very different degrees. Accordingly, we propose a mixed-precision quantization strategy that represents Transformer weights with an extremely low number of bits (e.g., under 3 bits). For example, for each word in an embedding block, we assign a different number of quantization bits based on its statistical properties. We also introduce a new
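To make the per-word bit assignment concrete, here is a minimal sketch in Python. The abstract names neither the statistic nor the quantizer, so both are assumptions here: word frequency stands in for the "statistical property", and a plain uniform quantizer stands in for whatever scheme the talk actually uses. The function names `assign_bits` and `quantize_row` and the tier cutoffs are hypothetical and chosen only for illustration.

```python
def assign_bits(counts, tiers=((0.1, 3), (0.3, 2)), default_bits=1):
    """Assign a bit width to each word from its frequency rank.

    ASSUMPTION: frequency is the statistic; the abstract does not say which
    one is used. With the default (hypothetical) tiers, the most frequent 10%
    of words get 3 bits, the next 20% get 2 bits, and the rest get 1 bit.
    """
    n = len(counts)
    # Word indices sorted from most to least frequent.
    order = sorted(range(n), key=lambda i: counts[i], reverse=True)
    bits = [default_bits] * n
    for rank, idx in enumerate(order):
        frac = rank / n
        for cutoff, width in tiers:
            if frac < cutoff:
                bits[idx] = width
                break
    return bits


def quantize_row(row, bits):
    """Uniformly quantize one embedding row to 2**bits levels.

    ASSUMPTION: uniform min/max quantization is a stand-in, not the
    quantizer described in the talk.
    """
    lo, hi = min(row), max(row)
    if hi == lo:
        return list(row)  # constant row: nothing to quantize
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((x - lo) / step) * step for x in row]


def quantize_embedding(table, counts):
    """Quantize each row of an embedding table with its own bit width."""
    widths = assign_bits(counts)
    rows = [quantize_row(row, b) for row, b in zip(table, widths)]
    return rows, widths
```

Under these assumptions, frequent words keep more precision while the long tail of rare words is stored at 1 bit, so the average bit width across the vocabulary can fall well below 3 even though a few rows use 3 bits.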