Search Results for author: Po-Sen Huang

Found 35 papers, 13 papers with code

The DeepMind Chinese–English Document Translation System at WMT2020

no code implementations • WMT (EMNLP) 2020 • Lei Yu, Laurent Sartran, Po-Sen Huang, Wojciech Stokowiec, Domenic Donato, Srivatsan Srinivasan, Alek Andreev, Wang Ling, Sona Mokra, Agustin Dal Lago, Yotam Doron, Susannah Young, Phil Blunsom, Chris Dyer

This paper describes the DeepMind submission to the Chinese→English constrained data track of the WMT2020 Shared Task on News Translation.

Document Translation Sentence +2

Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation

no code implementations • 30 Nov 2023 • Ryutaro Tanno, David G. T. Barrett, Andrew Sellergren, Sumedh Ghaisas, Sumanth Dathathri, Abigail See, Johannes Welbl, Karan Singhal, Shekoofeh Azizi, Tao Tu, Mike Schaekermann, Rhys May, Roy Lee, SiWai Man, Zahra Ahmed, Sara Mahdavi, Yossi Matias, Joelle Barral, Ali Eslami, Danielle Belgrave, Vivek Natarajan, Shravya Shetty, Pushmeet Kohli, Po-Sen Huang, Alan Karthikesalingam, Ira Ktena

Radiology reports are an instrumental part of modern medicine, informing key clinical decisions such as diagnosis and treatment.

Optimizing Memory Mapping Using Deep Reinforcement Learning

no code implementations • 11 May 2023 • Pengming Wang, Mikita Sazanovich, Berkin Ilbeyi, Phitchaya Mangpo Phothilimthana, Manish Purohit, Han Yang Tay, Ngân Vũ, Miaosen Wang, Cosmin Paduraru, Edouard Leurent, Anton Zhernov, Po-Sen Huang, Julian Schrittwieser, Thomas Hubert, Robert Tung, Paula Kurylowicz, Kieran Milan, Oriol Vinyals, Daniel J. Mankowitz

We also introduce a Reinforcement Learning agent, mallocMuZero, and show that it is capable of playing this game to discover new and improved memory mapping solutions that lead to faster execution times on real ML workloads on ML accelerators.

Cloud Computing Decision Making +3

Improving alignment of dialogue agents via targeted human judgements

no code implementations • 28 Sep 2022 • Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh, Laura Weidinger, Martin Chadwick, Phoebe Thacker, Lucy Campbell-Gillingham, Jonathan Uesato, Po-Sen Huang, Ramona Comanescu, Fan Yang, Abigail See, Sumanth Dathathri, Rory Greig, Charlie Chen, Doug Fritz, Jaume Sanchez Elias, Richard Green, Soňa Mokrá, Nicholas Fernando, Boxi Wu, Rachel Foley, Susannah Young, Iason Gabriel, William Isaac, John Mellor, Demis Hassabis, Koray Kavukcuoglu, Lisa Anne Hendricks, Geoffrey Irving

We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines.

Language Modelling

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

no code implementations • 16 Jun 2022 • Maribeth Rauh, John Mellor, Jonathan Uesato, Po-Sen Huang, Johannes Welbl, Laura Weidinger, Sumanth Dathathri, Amelia Glaese, Geoffrey Irving, Iason Gabriel, William Isaac, Lisa Anne Hendricks

Large language models produce human-like text that drives a growing number of applications.

Benchmarking Language Modelling +1

Competition-Level Code Generation with AlphaCode

1 code implementation • DeepMind 2022 • Yujia Li, David Choi, Junyoung Chung, Nate Kushman, Julian Schrittwieser, Rémi Leblond, Tom Eccles, James Keeling, Felix Gimeno, Agustin Dal Lago, Thomas Hubert, Peter Choy, Cyprien de Masson d'Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey Cherepanov, James Molloy, Daniel J. Mankowitz, Esme Sutherland Robson, Pushmeet Kohli, Nando de Freitas, Koray Kavukcuoglu, Oriol Vinyals

Programming is a powerful and ubiquitous problem-solving tool.

Ranked #1 on Code Generation on CodeContests

Code Generation

2,015

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

2 code implementations • NA 2021 • Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Ranked #1 on Language Modelling on StackExchange

Abstract Algebra Anachronisms +133

770

Ethical and social risks of harm from Language Models

no code implementations • 8 Dec 2021 • Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

We discuss the points of origin of different risks and point to potential mitigation approaches.

Misinformation

Challenges in Detoxifying Language Models

no code implementations • Findings (EMNLP) 2021 • Johannes Welbl, Amelia Glaese, Jonathan Uesato, Sumanth Dathathri, John Mellor, Lisa Anne Hendricks, Kirsty Anderson, Pushmeet Kohli, Ben Coppin, Po-Sen Huang

Large language models (LMs) generate remarkably fluent text and can be efficiently adapted across NLP tasks.

Self-supervised Adversarial Robustness for the Low-label, High-data Regime

no code implementations • ICLR 2021 • Sven Gowal, Po-Sen Huang, Aaron van den Oord, Timothy Mann, Pushmeet Kohli

Experiments on CIFAR-10 against $\ell_2$ and $\ell_\infty$ norm-bounded perturbations demonstrate that BYORL achieves near state-of-the-art robustness with as little as 500 labeled examples.

Adversarial Robustness Self-Supervised Learning +1

Towards Verified Robustness under Text Deletion Interventions

no code implementations • ICLR 2020 • Johannes Welbl, Po-Sen Huang, Robert Stanforth, Sven Gowal, Krishnamurthy (Dj) Dvijotham, Martin Szummer, Pushmeet Kohli

Neural networks are widely used in Natural Language Processing, yet despite their empirical successes, their behaviour is brittle: they are both over-sensitive to small input changes, and under-sensitive to deletions of large fractions of input text.

Natural Language Inference

Achieving Robustness in the Wild via Adversarial Mixing with Disentangled Representations

no code implementations • CVPR 2020 • Sven Gowal, Chongli Qin, Po-Sen Huang, Taylan Cemgil, Krishnamurthy Dvijotham, Timothy Mann, Pushmeet Kohli

Specifically, we leverage the disentangled latent representations computed by a StyleGAN model to generate perturbations of an image that are similar to real-world variations (like adding make-up, or changing the skin-tone of a person) and train models to be invariant to these perturbations.

Towards Robust Image Classification Using Sequential Attention Models

no code implementations • CVPR 2020 • Daniel Zoran, Mike Chrzanowski, Po-Sen Huang, Sven Gowal, Alex Mott, Pushmeet Kohli

In this paper we propose to augment a modern neural-network architecture with an attention model inspired by human perception.

Adversarial Robustness Classification +2

Reducing Sentiment Bias in Language Models via Counterfactual Evaluation

no code implementations • Findings of the Association for Computational Linguistics 2020 • Po-Sen Huang, Huan Zhang, Ray Jiang, Robert Stanforth, Johannes Welbl, Jack Rae, Vishal Maini, Dani Yogatama, Pushmeet Kohli

This paper aims to quantify and reduce a particular type of bias exhibited by language models: bias in the sentiment of generated text.

counterfactual Fairness +4

Learning Transferable Graph Exploration

no code implementations • NeurIPS 2019 • Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli

We propose a 'learning to explore' framework where we learn a policy from a distribution of environments.

Efficient Exploration

An Alternative Surrogate Loss for PGD-based Adversarial Testing

4 code implementations • 21 Oct 2019 • Sven Gowal, Jonathan Uesato, Chongli Qin, Po-Sen Huang, Timothy Mann, Pushmeet Kohli

Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching norm-bounded perturbations that cause the inputs of neural networks to be misclassified.

699

Scalable Neural Learning for Verifiable Consistency with Temporal Specifications

no code implementations • 25 Sep 2019 • Sumanth Dathathri, Johannes Welbl, Krishnamurthy (Dj) Dvijotham, Ramana Kumar, Aditya Kanade, Jonathan Uesato, Sven Gowal, Po-Sen Huang, Pushmeet Kohli

Formal verification of machine learning models has attracted attention recently, and significant progress has been made on proving simple properties like robustness to small perturbations of the input features.

Adversarial Robustness Language Modelling

Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation

1 code implementation • IJCNLP 2019 • Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli

Neural networks are part of many contemporary NLP systems, yet their empirical successes come at the price of vulnerability to adversarial attacks.

Data Augmentation text-classification +1

148

Are Labels Required for Improving Adversarial Robustness?

1 code implementation • NeurIPS 2019 • Jonathan Uesato, Jean-Baptiste Alayrac, Po-Sen Huang, Robert Stanforth, Alhussein Fawzi, Pushmeet Kohli

Recent work has uncovered the interesting (and somewhat surprising) finding that training models to be invariant to adversarial perturbations requires substantially larger datasets than those required for standard classification.

4k Adversarial Robustness

12,799

Knowing When to Stop: Evaluation and Verification of Conformity to Output-size Specifications

no code implementations • CVPR 2019 • Chenglong Wang, Rudy Bunel, Krishnamurthy Dvijotham, Po-Sen Huang, Edward Grefenstette, Pushmeet Kohli

This behavior can have severe consequences, such as increased computation and faults in downstream modules that expect outputs of a certain length.

Image Captioning Machine Translation +1

Neural Phrase-to-Phrase Machine Translation

no code implementations • 6 Nov 2018 • Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou

We also design an efficient dynamic programming algorithm to decode segments that allows the model to be trained faster than the existing neural phrase-based machine translation method by Huang et al. (2018).

Machine Translation Translation

Robust Text-to-SQL Generation with Execution-Guided Decoding

1 code implementation • 9 Jul 2018 • Chenglong Wang, Kedar Tatwawadi, Marc Brockschmidt, Po-Sen Huang, Yi Mao, Oleksandr Polozov, Rishabh Singh

We consider the problem of neural semantic parsing, which translates natural language questions into executable SQL queries.

Semantic Parsing Text-To-SQL

129

Discourse-Aware Neural Rewards for Coherent Text Generation

no code implementations • NAACL 2018 • Antoine Bosselut, Asli Celikyilmaz, Xiaodong He, Jianfeng Gao, Po-Sen Huang, Yejin Choi

In this paper, we investigate the use of discourse-aware rewards with reinforcement learning to guide a model to generate long, coherent text.

reinforcement-learning Reinforcement Learning (RL) +3

Natural Language to Structured Query Generation via Meta-Learning

1 code implementation • NAACL 2018 • Po-Sen Huang, Chenglong Wang, Rishabh Singh, Wen-tau Yih, Xiaodong He

In conventional supervised training, a model is trained to fit all the training examples.

Ranked #7 on Code Generation on WikiSQL

Meta-Learning

129

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search

no code implementations • NeurIPS 2018 • Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao

In order to effectively train the agent from sparse rewards, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards.

Ranked #44 on Link Prediction on WN18RR (Hits@3 metric)

Knowledge Base Completion Link Prediction +2

Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion

no code implementations • WS 2017 • Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao

However, due to the size of knowledge bases, learning multi-step relations directly on top of observed triplets could be costly.

Knowledge Base Completion Question Answering +1

Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension

2 code implementations • EMNLP 2017 • David Golub, Po-Sen Huang, Xiaodong He, Li Deng

We develop a technique for transfer learning in machine comprehension (MC) using a novel two-stage synthesis network (SynNet).

Reading Comprehension Transfer Learning +1

110

Towards Neural Phrase-based Machine Translation

4 code implementations • ICLR 2018 • Po-Sen Huang, Chong Wang, Sitao Huang, Dengyong Zhou, Li Deng

In this paper, we present Neural Phrase-based Machine Translation (NPMT).

Ranked #7 on Machine Translation on IWSLT2015 English-German

Machine Translation NMT +1

178

Sequence Modeling via Segmentations

2 code implementations • ICML 2017 • Chong Wang, Yining Wang, Po-Sen Huang, Abdel-rahman Mohamed, Dengyong Zhou, Li Deng

The probability of a segmented sequence is calculated as the product of the probabilities of all its segments, where each segment is modeled using existing tools such as recurrent neural networks.

Segmentation speech-recognition +3

178

Link Prediction using Embedded Knowledge Graphs

no code implementations • 14 Nov 2016 • Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao

Since large knowledge bases are typically incomplete, missing facts need to be inferred from observed facts in a task called knowledge base completion.

Knowledge Base Completion Knowledge Graphs +1

ReasoNet: Learning to Stop Reading in Machine Comprehension

no code implementations • 17 Sep 2016 • Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen

Teaching a computer to read and answer general questions pertaining to a document is a challenging yet unsolved problem.

Ranked #7 on Question Answering on CNN / Daily Mail

Question Answering Reading Comprehension

Unsupervised Learning of Predictors from Unpaired Input-Output Samples

no code implementations • 15 Jun 2016 • Jianshu Chen, Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng

In particular, we show that with regularization via a generative model, learning with the proposed unsupervised objective function converges to an optimal solution.

Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

2 code implementations • 13 Feb 2015 • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis

In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including monaural speech separation, monaural singing voice separation, and speech denoising.

Denoising Speech Denoising +1

151

Deep learning for monaural speech separation

1 code implementation • ICASSP 2014 • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis

In this paper, we study deep learning for monaural speech separation.

Multi-Speaker Source Separation Speech Separation

365

Learning deep structured semantic models for web search using clickthrough data

5 code implementations • CIKM 2013 • Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, Larry Heck

The proposed deep structured semantic models are discriminatively trained by maximizing the conditional likelihood of the clicked documents given a query using the clickthrough data.

Document Ranking

4,100
