Search Results for author: Wojciech Mańke

Found 2 papers, 1 paper with code

Searching for Efficient Transformers for Language Modeling

no code implementations • NeurIPS 2021 • David So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc Le

For example, at a 500M parameter size, Primer improves the original T5 architecture on C4 auto-regressive language modeling, reducing the training cost by 4X.

Language Modelling

Primer: Searching for Efficient Transformers for Language Modeling

4 code implementations • 17 Sep 2021 • David R. So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc V. Le

For example, at a 500M parameter size, Primer improves the original T5 architecture on C4 auto-regressive language modeling, reducing the training cost by 4X.

Language Modelling
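
The 4X training-cost reduction quoted in the entry above comes mainly from two simple modifications the Primer search surfaced over a vanilla Transformer: squaring the ReLU activation in the feed-forward blocks and adding depthwise convolutions after the attention projections. Below is a minimal NumPy sketch of the squared-ReLU activation only; the function name, shapes, and example values are illustrative and not taken from the paper's released code.

```python
import numpy as np

def squared_relu(x: np.ndarray) -> np.ndarray:
    """Squared ReLU: relu(x) ** 2, the feed-forward activation Primer uses
    in place of the standard ReLU in the Transformer MLP block."""
    return np.square(np.maximum(x, 0.0))

# Example: apply to a batch of hidden states (shapes are illustrative).
hidden = np.random.randn(2, 8, 512)   # (batch, sequence, d_model)
activated = squared_relu(hidden)
```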
