Search Results for author: DeepSeek-AI

Found 2 papers, 2 papers with code

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

1 code implementation • 7 May 2024 • DeepSeek-AI

Multi-head Latent Attention (MLA) guarantees efficient inference by significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.

Language Modelling, Reinforcement Learning (RL)
