HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

ACL 2020 · Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han

Transformers are ubiquitous in Natural Language Processing (NLP) tasks, but they are difficult to deploy on hardware due to their intensive computation. To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with neural architecture search...
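To make the idea of hardware-aware architecture search concrete, here is a minimal illustrative sketch, not the paper's actual method: it samples Transformer configurations from a small search space and keeps only those whose predicted latency on the target hardware stays under a budget. The search space, the analytic latency proxy, and the accuracy proxy below are all hypothetical stand-ins; a real hardware-aware search would use measured or learned latency and trained-model quality.

```python
import random

# Hypothetical search space of Transformer configurations
# (invented for illustration; not the paper's search space).
SEARCH_SPACE = {
    "num_layers": [2, 4, 6],
    "hidden_dim": [256, 512],
    "num_heads": [4, 8],
}

def predicted_latency_ms(cfg):
    # Stand-in latency predictor: a simple analytic proxy.
    # A real system would predict latency per target hardware platform.
    return 0.5 * cfg["num_layers"] * cfg["hidden_dim"] / cfg["num_heads"]

def proxy_quality(cfg):
    # Stand-in quality score: here, larger models simply score higher.
    return cfg["num_layers"] * cfg["hidden_dim"]

def hardware_aware_random_search(budget_ms, n_samples=200, seed=0):
    """Return the best sampled config that satisfies the latency budget."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_samples):
        cfg = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        if predicted_latency_ms(cfg) > budget_ms:
            continue  # violates the hardware latency constraint
        if best is None or proxy_quality(cfg) > proxy_quality(best):
            best = cfg
    return best

best = hardware_aware_random_search(budget_ms=400.0)
```

The key point the sketch captures is that the hardware constraint is enforced inside the search loop, so different latency budgets (i.e., different hardware platforms) yield different specialized architectures.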


