Adaptive Transformers

We discuss several areas of NLP research focused on making deep learning models lighter: reducing computation (FLOPs), inducing sparsity, and other related methods, including pruning, quantization, distillation, and adaptive methods. We discuss the following papers in detail (a sketch of the first follows the list):

- Adaptive Attention Span ()
- Adaptively Sparse Transformers ()
- Reducing Transformer Depth on Demand with Structured Dropout (https://
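Of these, Adaptive Attention Span reduces computation by letting each attention head learn how far back it needs to attend, via a soft mask m_z(x) = clamp((R + z - x) / R, 0, 1) over the distance x, with a learned span z and ramp width R. Below is a minimal PyTorch sketch of that masking function; the module name, the default ramp width, and the renormalization epsilon are our assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn


class AdaptiveSpanMask(nn.Module):
    """Soft span mask in the style of Adaptive Attention Span.

    A head learns a span z; attention weight at distance x from the query
    is scaled by m_z(x) = clamp((R + z - x) / R, 0, 1), so far-away
    positions are smoothly zeroed out and z stays trainable by gradient
    descent (the paper also adds an L1 penalty on z to the loss).
    """

    def __init__(self, max_span: int, ramp_size: int = 32):
        super().__init__()
        self.max_span = max_span
        self.ramp_size = ramp_size  # R: width of the soft ramp
        # Learned span parameter in [0, 1], scaled to [0, max_span] on use.
        self.current_span = nn.Parameter(torch.zeros(1))

    def forward(self, attn_weights: torch.Tensor) -> torch.Tensor:
        # attn_weights: (..., span) post-softmax attention weights, where
        # the last dimension indexes distance from the query (0 = closest).
        span = attn_weights.size(-1)
        z = self.current_span.clamp(0, 1) * self.max_span
        x = torch.arange(span, device=attn_weights.device,
                         dtype=attn_weights.dtype)
        mask = ((self.ramp_size + z - x) / self.ramp_size).clamp(0, 1)
        masked = attn_weights * mask
        # Renormalize so the masked weights still sum to 1.
        return masked / (masked.sum(-1, keepdim=True) + 1e-8)
```

Because the mask is differentiable in z, each head can shrink its span during training, and keys beyond the learned span can be skipped entirely at inference time, which is where the FLOPs savings come from.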