Search Results for author: Artem Chumachenko

Found 3 papers, 1 paper with code

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

no code implementations • NeurIPS 2023 • Alexander Borzunov, Max Ryabinin, Artem Chumachenko, Dmitry Baranchuk, Tim Dettmers, Younes Belkada, Pavel Samygin, Colin Raffel

Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters.

Petals: Collaborative Inference and Fine-tuning of Large Models

1 code implementation • 2 Sep 2022 • Alexander Borzunov, Dmitry Baranchuk, Tim Dettmers, Max Ryabinin, Younes Belkada, Artem Chumachenko, Pavel Samygin, Colin Raffel

However, these techniques have innate limitations: offloading is too slow for interactive inference, while APIs are not flexible enough for research that requires access to weights, attention or logits.

Collaborative Inference
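
For context on what "collaborative inference" looks like from the client side, here is a minimal sketch following the usage pattern of the public petals package: the model's transformer blocks run on remote volunteer servers, while the client keeps the embeddings and language-model head locally, which is why weights and logits stay accessible for research. The checkpoint name and API details below are assumptions drawn from the petals project, not from this listing.

```python
# Minimal client-side sketch, assuming the public `petals` package API.
# The checkpoint name is an illustrative assumption.
import torch
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Transformer blocks are served remotely; embeddings and the LM head
# stay on this machine, so logits remain directly inspectable.
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0]))
```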

Weight Squeezing: Reparameterization for Knowledge Transfer and Model Compression

no code implementations • 14 Oct 2020 • Artem Chumachenko, Daniil Gavrilov, Nikita Balagansky, Pavel Kalaidin

We also propose a variant of Weight Squeezing called Gated Weight Squeezing, which combines fine-tuning of a BERT-Medium model with learning a mapping from BERT-Base weights.

General Classification • Model Compression • +3
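
To make the Gated Weight Squeezing idea above concrete: the sketch below is a hypothetical PyTorch illustration, not the authors' implementation, of one way to parameterize a student layer as a gated combination of a directly fine-tuned weight matrix and a learned linear mapping of frozen teacher weights. All names, shapes, and the sigmoid gate are assumptions based only on the abstract's description.

```python
# Hypothetical sketch of Gated Weight Squeezing for a single linear layer.
# Names and shapes are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

class GatedSqueezedLinear(nn.Module):
    def __init__(self, teacher_weight: torch.Tensor, student_dim: int):
        super().__init__()
        t_out, t_in = teacher_weight.shape
        # Frozen teacher weights (e.g., from a BERT-Base layer).
        self.register_buffer("teacher_weight", teacher_weight)
        # Learned mapping that "squeezes" teacher weights to student size.
        self.map_out = nn.Parameter(torch.randn(student_dim, t_out) * 0.02)
        self.map_in = nn.Parameter(torch.randn(t_in, student_dim) * 0.02)
        # Directly fine-tuned student weights (e.g., BERT-Medium sized).
        self.student_weight = nn.Parameter(torch.randn(student_dim, student_dim) * 0.02)
        # Scalar gate mixing the two weight sources.
        self.gate = nn.Parameter(torch.zeros(1))
        self.bias = nn.Parameter(torch.zeros(student_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Map teacher weights into the student's (smaller) shape.
        squeezed = self.map_out @ self.teacher_weight @ self.map_in
        g = torch.sigmoid(self.gate)
        weight = g * self.student_weight + (1 - g) * squeezed
        return x @ weight.t() + self.bias

# Usage: squeeze a 768x768 teacher matrix into a 512-dim student layer.
teacher_w = torch.randn(768, 768)
layer = GatedSqueezedLinear(teacher_w, student_dim=512)
out = layer(torch.randn(4, 512))  # -> shape (4, 512)
```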
