Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
We propose an alternative generator architecture for generative adversarial networks, borrowing from the style transfer literature.
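One concrete idea from the style transfer literature that such generators commonly borrow is adaptive instance normalization (AdaIN): feature maps are normalized per channel and per sample, then re-scaled and re-shifted using values predicted from a latent style vector. Below is a minimal PyTorch sketch of that mechanism under stated assumptions; the class and parameter names (`AdaIN`, `w_dim`, `channels`) are illustrative and not taken from any particular codebase.

```python
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Adaptive instance normalization: re-style feature maps with a
    per-channel scale and bias predicted from a latent style vector.
    A minimal sketch; names (w_dim, channels) are illustrative."""

    def __init__(self, channels: int, w_dim: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        # A per-layer affine map turns the style vector into scale/bias.
        self.affine = nn.Linear(w_dim, channels * 2)

    def forward(self, x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W), w: (batch, w_dim)
        scale, bias = self.affine(w).chunk(2, dim=1)
        x = self.norm(x)  # zero-mean, unit-variance per channel and sample
        return scale[:, :, None, None] * x + bias[:, :, None, None]

# Usage: modulate a 64-channel feature map with a 512-dim style vector.
ada = AdaIN(channels=64, w_dim=512)
x = torch.randn(2, 64, 16, 16)
w = torch.randn(2, 512)
y = ada(x, w)  # same shape as x, re-styled
```

Because the style vector enters only through these normalization layers, it controls the statistics of the features at each resolution rather than their spatial content, which is what gives the architecture its style-based character.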
Transformer networks have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the setting of language modeling.
This work achieves state-of-the-art performance for language modelling on the Hutter Prize benchmark.
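To make the fixed-length limitation concrete: a vanilla Transformer language model attends only within its current segment, so no information crosses segment boundaries. A common remedy, used by segment-level recurrence schemes such as Transformer-XL, is to cache the previous segment's hidden states and let the current segment attend over them without backpropagating into the cache. The following is a rough single-head PyTorch sketch under toy assumptions (random weight matrices `W_q`, `W_k`, `W_v`); real implementations add relative positional encodings, multiple heads, and per-layer memories.

```python
import torch

def attend_with_memory(h, mem, W_q, W_k, W_v):
    """Self-attention whose keys/values also cover a cached previous
    segment (`mem`), so the effective context extends past the current
    segment. A sketch of the idea, not a faithful implementation."""
    # h: (seg_len, d_model) current segment; mem: (mem_len, d_model) cache
    context = torch.cat([mem.detach(), h], dim=0)  # no gradient into the cache
    q = h @ W_q                        # queries come only from the current segment
    k, v = context @ W_k, context @ W_v
    attn = torch.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)
    return attn @ v                    # (seg_len, d_model)

d = 8
W_q, W_k, W_v = (torch.randn(d, d) for _ in range(3))
mem = torch.zeros(4, d)                # empty cache before the first segment
for segment in torch.randn(3, 4, d):   # three segments of length 4
    out = attend_with_memory(segment, mem, W_q, W_k, W_v)
    mem = segment                      # cache this segment for the next one
```

Chaining the cache across segments is what lets dependencies span well beyond any single segment's length while keeping the per-step compute fixed.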
The AlphaGo, AlphaGo Zero, and AlphaZero series of algorithms is a remarkable demonstration of deep reinforcement learning's capabilities, achieving superhuman performance in the complex game of Go with progressively increasing autonomy.
On the negative side, our study shows that well-disentangled models seemingly cannot be identified without access to ground-truth labels, even when hyperparameters can be transferred across datasets.
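The dependence on ground-truth labels is built into the disentanglement metrics themselves: a score such as the mutual information gap (MIG) is computed against the true generative factors and is simply undefined without them. Here is a rough NumPy/scikit-learn sketch of MIG under stated assumptions (discrete factor labels, latent codes quantized into bins); the function name and binning scheme are illustrative.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def mutual_information_gap(latents, factors, n_bins=20):
    """Mutual information gap (MIG): for each ground-truth factor, the
    normalized gap between the two latent dimensions most informative
    about it. Note the metric cannot be evaluated without the factor
    labels. A sketch; assumes discrete factors."""
    # Quantize continuous latent codes into bins so MI is estimable.
    binned = np.stack([np.digitize(z, np.histogram_bin_edges(z, n_bins))
                       for z in latents.T])
    gaps = []
    for f in factors.T:  # each column is one ground-truth factor
        mi = np.array([mutual_info_score(f, z) for z in binned])
        top2 = np.sort(mi)[-2:]
        h = mutual_info_score(f, f)  # entropy of the factor, as MI with itself
        gaps.append((top2[1] - top2[0]) / h)
    return float(np.mean(gaps))

# Toy usage: 1000 samples, 5-dim latent, 2 discrete factors.
rng = np.random.default_rng(0)
factors = rng.integers(0, 4, size=(1000, 2))
latents = rng.normal(size=(1000, 5))
latents[:, 0] += factors[:, 0]  # make one latent dim track factor 0
print(mutual_information_gap(latents, factors))
```

Without the `factors` argument there is nothing to score against, which is one way to see why unsupervised model selection for disentanglement is problematic.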