Search Results for author: Yun-Hsuan Sung

Found 16 papers, 5 papers with code

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems

no code implementations • 2 Apr 2024 • Frank Palma Gomez, Ramon Sanabria, Yun-Hsuan Sung, Daniel Cer, Siddharth Dalmia, Gustavo Hernandez Abrego

Our multi-modal LLM-based retrieval system is capable of matching speech and text in 102 languages despite only training on 21 languages.

Machine Translation Retrieval +1

Paper
Add Code

Characterizing Tradeoffs in Language Model Decoding with Informational Interpretations

no code implementations • 16 Nov 2023 • Chung-Ching Chang, William W. Cohen, Yun-Hsuan Sung

We propose a theoretical framework for formulating language model decoder algorithms with dynamic programming and information theory.

Language Modelling

Paper
Add Code

FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation

1 code implementation • 5 Oct 2023 • Tu Vu, Mohit Iyyer, Xuezhi Wang, Noah Constant, Jerry Wei, Jason Wei, Chris Tar, Yun-Hsuan Sung, Denny Zhou, Quoc Le, Thang Luong

Specifically, we introduce FreshQA, a novel dynamic QA benchmark encompassing a diverse range of question and answer types, including questions that require fast-changing world knowledge as well as questions with false premises that need to be debunked.

Hallucination World Knowledge

268

Paper
Code

KL-Divergence Guided Temperature Sampling

2 code implementations • 2 Jun 2023 • Chung-Ching Chang, David Reitter, Renat Aksitov, Yun-Hsuan Sung

One common approach to mitigate hallucinations is to provide source/grounding documents and the model is trained to produce predictions that bind to and are attributable to the provided source.

Conversational Question Answering Language Modelling +1

32,835

Paper
Code

CoLT5: Faster Long-Range Transformers with Conditional Computation

no code implementations • 17 Mar 2023 • Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token.

Ranked #1 on Long-range modeling on SCROLLS

Long-range modeling

Paper
Add Code

Knowledge Prompts: Injecting World Knowledge into Language Models through Soft Prompts

no code implementations • 10 Oct 2022 • Cicero Nogueira dos santos, Zhe Dong, Daniel Cer, John Nham, Siamak Shakeri, Jianmo Ni, Yun-Hsuan Sung

The resulting soft knowledge prompts (KPs) are task independent and work as an external memory of the LMs.

Self-Supervised Learning World Knowledge

Paper
Add Code

LongT5: Efficient Text-To-Text Transformer for Long Sequences

1 code implementation • Findings (NAACL) 2022 • Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang

Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models.

Ranked #1 on Text Summarization on BigPatent

Abstractive Text Summarization Long-range modeling +2

169

Paper
Code

Multilingual Universal Sentence Encoder for Semantic Retrieval

no code implementations • ACL 2020 • Yinfei Yang, Daniel Cer, Amin Ahmad, Mandy Guo, Jax Law, Noah Constant, Gustavo Hernandez Abrego, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

We introduce two pre-trained retrieval focused multilingual sentence encoding models, respectively based on the Transformer and CNN model architectures.

Question Answering Retrieval +6

Paper
Add Code

Hierarchical Document Encoder for Parallel Corpus Mining

no code implementations • WS 2019 • Mandy Guo, Yinfei Yang, Keith Stevens, Daniel Cer, Heming Ge, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

We explore using multilingual document embeddings for nearest neighbor mining of parallel data.

Parallel Corpus Mining Sentence +2

Paper
Add Code

Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax

no code implementations • 22 Feb 2019 • Yinfei Yang, Gustavo Hernandez Abrego, Steve Yuan, Mandy Guo, Qinlan Shen, Daniel Cer, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

On the UN document-level retrieval task, document embeddings achieve around 97% on P@1 for all experimented language pairs.

NMT Retrieval +3

Paper
Add Code

Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model

no code implementations • WS 2019 • Muthuraman Chidambaram, Yinfei Yang, Daniel Cer, Steve Yuan, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

A significant roadblock in multilingual neural language modeling is the lack of labeled non-English data.

Few-Shot Learning Language Modelling +1

Paper
Add Code

Effective Parallel Corpus Mining using Bilingual Sentence Embeddings

no code implementations • WS 2018 • Mandy Guo, Qinlan Shen, Yinfei Yang, Heming Ge, Daniel Cer, Gustavo Hernandez Abrego, Keith Stevens, Noah Constant, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

This paper presents an effective approach for parallel corpus mining using bilingual sentence embeddings.

Machine Translation NMT +6

Paper
Add Code

Learning Semantic Textual Similarity from Conversations

1 code implementation • WS 2018 • Yinfei Yang, Steve Yuan, Daniel Cer, Sheng-yi Kong, Noah Constant, Petr Pilar, Heming Ge, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

We present a novel approach to learn representations for sentence-level semantic similarity using conversational data.

Community Question Answering Natural Language Inference +6

Paper
Code

Universal Sentence Encoder

23 code implementations • 29 Mar 2018 • Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance.

Ranked #2 on Text Classification on TREC-6

Conversational Response Selection Semantic Textual Similarity +7

2,279

Paper
Code

Efficient Natural Language Response Suggestion for Smart Reply

no code implementations • 1 May 2017 • Matthew Henderson, Rami Al-Rfou, Brian Strope, Yun-Hsuan Sung, Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar, Balint Miklos, Ray Kurzweil

This paper presents a computationally efficient machine-learned method for natural language response suggestion.

Paper
Add Code

Conversational Contextual Cues: The Case of Personalization and History for Response Ranking

no code implementations • 1 Jun 2016 • Rami Al-Rfou, Marc Pickett, Javier Snaider, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

Unlike previous efforts, which focused on modeling messages and responses, we extend the modeling to long context and participant's history.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.