Games are abstractions of the real world, where artificial agents learn to compete and cooperate with other agents.
Recently, research efforts have concentrated on revealing how pre-trained models make a difference in neural network performance.
We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.
Ranked #3 on Self-Supervised Image Classification on ImageNet
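The transposed attention described above can be sketched in a few lines of NumPy. The function name, the l2-normalization of queries and keys along the token axis, and the temperature `tau` are assumptions in the spirit of cross-covariance attention, not a reproduction of the paper's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_covariance_attention(X, Wq, Wk, Wv, tau=1.0):
    """Attention over feature channels instead of tokens (a sketch).

    X: (n_tokens, d) token features; Wq/Wk/Wv: (d, d) projections.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Normalize each channel over the token axis so the d x d
    # cross-covariance K^T Q holds bounded cosine similarities.
    Qn = Q / np.linalg.norm(Q, axis=0, keepdims=True)
    Kn = K / np.linalg.norm(K, axis=0, keepdims=True)
    A = softmax((Kn.T @ Qn) / tau, axis=0)  # (d, d): channel-channel
    # Mix channels of V; cost scales with d^2, not n_tokens^2.
    return V @ A
```

Because the attention map is d x d rather than n x n, the cost grows with the number of channels instead of the number of tokens, which is the practical appeal of operating "transposed".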
While several recent works investigate how to disentangle underlying factors of variation in the data, most of them operate in 2D and hence ignore that our world is three-dimensional.
This adversarial loss guarantees that the mapping is diverse: a very wide range of anime images can be produced from a single content code.
Ranked #1 on Image-to-Image Translation on selfie2anime
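The abstract does not spell out the loss itself. As a related illustration only, the snippet below sketches a mode-seeking diversity regularizer (a non-adversarial stand-in, with hypothetical names): it rewards a generator whose outputs from the same content code but different style codes differ from each other.

```python
import numpy as np

def diversity_loss(out_a, out_b, z_a, z_b, eps=1e-8):
    """Mode-seeking regularizer (illustrative, not the paper's loss).

    out_a/out_b: images generated from one content code with style
    codes z_a/z_b. Minimizing this loss maximizes the ratio of output
    distance to style-code distance, discouraging mode collapse.
    """
    img_dist = np.abs(out_a - out_b).mean()
    z_dist = np.abs(z_a - z_b).mean()
    return -img_dist / (z_dist + eps)
```

In a full model this term would be minimized alongside the adversarial and reconstruction objectives.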
Further analysis shows that Lattice-BERT can harness the lattice structures, and the improvement comes from the exploration of redundant information and multi-granularity representations.
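One way to picture the lattice input: enumerate every span of the text that is either a single character or an in-vocabulary word, so the model sees overlapping units at multiple granularities. The helper below is a hypothetical illustration of such lattice construction, not Lattice-BERT's actual preprocessing:

```python
def build_lattice(text, vocab):
    """Return (piece, start, end) for every single character and every
    in-vocabulary multi-character span -- a multi-granularity lattice."""
    lattice = []
    for i in range(len(text)):
        for j in range(i + 1, len(text) + 1):
            piece = text[i:j]
            if j == i + 1 or piece in vocab:
                lattice.append((piece, i, j))
    return lattice
```

Overlapping spans (e.g. a word and the characters inside it) are exactly the "redundant information" a lattice-aware encoder can exploit.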
An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks, covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.
Extensive experiments on the large-scale SQuAD and TriviaQA datasets validate the effectiveness of the proposed method.
Our key insight for applying Transformers to graphs is the need to effectively encode the structural information of a graph into the model.
Ranked #1 on Graph Regression on PCQM4M-LSC
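A minimal sketch of one way to inject graph structure into a Transformer layer, assuming a degree-based centrality encoding added to node features and a shortest-path-distance bias added to the attention logits; the function names and lookup tables are illustrative choices, not the paper's exact design:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bfs_distances(adj):
    """All-pairs hop counts; unreachable pairs get the sentinel n."""
    n = adj.shape[0]
    dist = np.full((n, n), n, dtype=int)
    for s in range(n):
        dist[s, s] = 0
        frontier, d = [s], 0
        while frontier:
            d += 1
            nxt = []
            for u in frontier:
                for v in np.nonzero(adj[u])[0]:
                    if dist[s, v] > d:
                        dist[s, v] = d
                        nxt.append(v)
            frontier = nxt
    return dist

def graph_attention_with_structural_bias(X, adj, Wq, Wk, Wv,
                                         degree_emb, dist_bias):
    """Self-attention whose inputs and logits carry graph structure.

    X: (n, d) node features; adj: (n, n) 0/1 adjacency.
    degree_emb: (max_degree + 1, d) table added to node features
    (centrality encoding). dist_bias: scalar bias per shortest-path
    distance (spatial encoding). Both would be learned in a real
    model; here they are plain arrays.
    """
    deg = adj.sum(axis=1).astype(int)
    X = X + degree_emb[deg]              # centrality encoding
    dist = bfs_distances(adj)            # (n, n) hop counts
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    logits = Q @ K.T / np.sqrt(X.shape[1]) + dist_bias[dist]
    return softmax(logits, axis=-1) @ V
```

The point of the sketch is that structure enters twice: once in the node features (how central a node is) and once in the attention logits (how far apart two nodes are), so attention can respect the graph even though the layer itself is permutation-equivariant.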