Search Results for author: Acyr Locatelli

Found 3 papers, 2 papers with code

SnapKV: LLM Knows What You are Looking for Before Generation

1 code implementation • 22 Apr 2024 • Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen

Specifically, SnapKV achieves a consistent decoding speed with a 3.6x increase in generation speed and an 8.2x enhancement in memory efficiency compared to baseline when processing inputs of 16K tokens.
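The gains above come from compressing the prompt's KV cache before decoding starts. Below is a minimal, illustrative sketch of attention-guided KV-cache pruning under that general idea; the function name, window size, and budget are assumptions for illustration, not the authors' reference implementation.

```python
import torch

def compress_kv_cache(keys, values, attn_weights, obs_window=32, budget=1024):
    """Illustrative attention-guided KV-cache pruning (not the official SnapKV code).

    keys, values:  [seq_len, num_heads, head_dim] cached prompt tensors
    attn_weights:  [num_heads, obs_window, seq_len] attention paid by the last
                   `obs_window` prompt positions to the whole prompt
    Keeps the `budget` prefix positions that receive the most pooled attention,
    plus the observation window itself.
    """
    seq_len = keys.shape[0]
    if seq_len <= budget + obs_window:
        return keys, values  # nothing worth pruning

    prefix_len = seq_len - obs_window
    # Pool attention received by each prefix position across heads and queries.
    scores = attn_weights[:, :, :prefix_len].sum(dim=(0, 1))        # [prefix_len]
    # Light smoothing over neighbouring positions so selected tokens cluster.
    scores = torch.nn.functional.avg_pool1d(
        scores[None, None, :], kernel_size=7, stride=1, padding=3
    ).squeeze()
    keep = torch.topk(scores, k=budget).indices.sort().values        # selected prefix
    keep = torch.cat([keep, torch.arange(prefix_len, seq_len)])      # + observation window

    return keys[keep], values[keep]
```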


Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

1 code implementation • 11 Sep 2023 • Ted Zadouri, Ahmet Üstün, Arash Ahmadian, Beyza Ermiş, Acyr Locatelli, Sara Hooker

The Mixture of Experts (MoE) is a widely known neural architecture where an ensemble of specialized sub-models optimizes overall performance with a constant computational cost.
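For context, a minimal sketch of the standard MoE layer this line of work builds on: a learned router sends each token to a few specialized feed-forward experts, so capacity grows with the number of experts while per-token compute stays roughly constant. The class and parameter names are illustrative assumptions; the paper's parameter-efficient variant is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Token-level mixture of experts with a learned softmax router and top-k routing."""

    def __init__(self, d_model, d_hidden, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                  # x: [num_tokens, d_model]
        gate = F.softmax(self.router(x), dim=-1)           # routing probabilities
        weights, idx = gate.topk(self.top_k, dim=-1)       # keep k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.top_k):
                mask = idx[:, slot] == e                   # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```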

Exploring Low Rank Training of Deep Neural Networks

no code implementations • 27 Sep 2022 • Siddhartha Rao Kamalakara, Acyr Locatelli, Bharat Venkitesh, Jimmy Ba, Yarin Gal, Aidan N. Gomez

Training deep neural networks in low rank, i.e. with factorised layers, is of particular interest to the community: it offers efficiency over unfactorised training in terms of both memory consumption and training time.
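A minimal sketch of what training "in low rank" means in practice: each dense weight matrix is replaced by a factorisation U·V with a small inner rank, so both the parameter count and the per-step compute shrink. The module and argument names below are illustrative, not taken from the paper's code.

```python
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Linear layer factorised as W ≈ U @ V, trained end-to-end in factorised form."""

    def __init__(self, in_features, out_features, rank):
        super().__init__()
        # Two thin matrices replace one (out_features x in_features) dense weight:
        # parameter count drops from in*out to rank*(in + out).
        self.V = nn.Linear(in_features, rank, bias=False)   # project down to the rank
        self.U = nn.Linear(rank, out_features, bias=True)   # project back up

    def forward(self, x):
        return self.U(self.V(x))

# Example: a 1024x1024 dense layer has ~1.05M weights; at rank 64 the
# factorised layer uses ~131K, a roughly 8x reduction.
layer = LowRankLinear(1024, 1024, rank=64)
```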
