1 code implementation • 25 Jul 2023 • Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.
3 code implementations • 23 May 2023 • Abishek Sridhar, Robert Lo, Frank F. Xu, Hao Zhu, Shuyan Zhou
Large language models (LLMs) struggle to process complicated observations in interactive decision-making tasks.
1 code implementation • 11 May 2023 • Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, Graham Neubig
In this work, we provide a generalized view of active retrieval augmented generation, methods that actively decide when and what to retrieve across the course of the generation.
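The idea above — deciding *when* and *what* to retrieve during generation — can be sketched as a simple loop: draft each step, and only when the draft looks uncertain, retrieve evidence using the draft itself as the query and redraft. This is a minimal illustration, not the paper's implementation; the names `active_rag`, `confidence`, `retrieve`, and `regenerate` are hypothetical placeholders for model-specific components.

```python
def active_rag(steps, confidence, retrieve, regenerate, threshold=0.6):
    """Sketch of active retrieval-augmented generation: draft each step,
    and retrieve + redraft only when the draft's confidence is low."""
    output = []
    for draft in steps:
        if confidence(draft) < threshold:
            docs = retrieve(draft)           # the draft doubles as the query
            draft = regenerate(draft, docs)  # redraft conditioned on evidence
        output.append(draft)
    return output

# Toy stand-ins: a confident sentence passes through, an uncertain one
# triggers retrieval and is replaced by the retrieved evidence.
out = active_rag(
    ["Paris is in France.", "Its population is ???"],
    confidence=lambda s: 0.2 if "???" in s else 0.9,
    retrieve=lambda q: ["Paris has about 2.1 million residents."],
    regenerate=lambda draft, docs: docs[0],
)
print(out)
```

The key design choice mirrored here is that retrieval is conditional and query construction is dynamic, rather than retrieving once up front with the fixed user input.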
1 code implementation • 7 Jan 2023 • Frank F. Xu, Uri Alon, Graham Neubig
Language models (LMs) compute the probability of a text by sequentially computing a representation of an already-seen context and using this representation to predict the next word.
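That left-to-right factorization can be made concrete with a toy model: the probability of a text is the product of each next-word probability given the already-seen context. The bigram table and its probabilities below are made up purely for illustration; a real LM conditions on the full context with a neural representation.

```python
import math

# Toy bigram "language model": P(next | previous) over a tiny vocabulary.
# All probabilities are invented for this sketch.
BIGRAM = {
    ("the", "cat"): 0.2,
    ("cat", "sat"): 0.5,
    ("sat", "down"): 0.4,
}

def sequence_log_prob(tokens):
    """Score a text sequentially: at each step, condition on the
    already-seen context (here just the previous token) and multiply
    in the probability of the next word (summing in log space)."""
    log_p = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        log_p += math.log(BIGRAM.get((prev, nxt), 1e-6))
    return log_p

print(sequence_log_prob(["the", "cat", "sat", "down"]))
```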
2 code implementations • 13 Jul 2022 • Shuyan Zhou, Uri Alon, Frank F. Xu, Zhiruo Wang, Zhengbao Jiang, Graham Neubig
Publicly available source-code libraries are continuously growing and changing.
1 code implementation • 16 Mar 2022 • Zhiruo Wang, Grace Cuenca, Shuyan Zhou, Frank F. Xu, Graham Neubig
While there has been a recent burgeoning of applications at the intersection of natural and programming languages, such as code generation and code summarization, these applications are usually English-centric.
3 code implementations • 26 Feb 2022 • Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
We aim to fill in some of these blanks through a systematic evaluation of the largest existing models: Codex, GPT-J, GPT-Neo, GPT-NeoX-20B, and CodeParrot, across various programming languages.
2 code implementations • 28 Jan 2022 • Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig
Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time.
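The combination described above is often realized as a per-token interpolation between the base LM's distribution and a distribution induced by retrieved nearest-neighbor examples. A minimal sketch, assuming both distributions are given as dictionaries (the function name `interpolate` and the weight `lam` are illustrative, not from the paper):

```python
def interpolate(p_lm, p_knn, lam=0.25):
    """Mix a standard LM distribution with a retrieval-induced
    distribution: p(w) = (1 - lam) * p_lm(w) + lam * p_knn(w)."""
    vocab = set(p_lm) | set(p_knn)
    return {w: (1 - lam) * p_lm.get(w, 0.0) + lam * p_knn.get(w, 0.0)
            for w in vocab}

# Toy example: retrieved neighbors strongly favor "dog",
# shifting probability mass toward it at test time.
p_lm = {"cat": 0.6, "dog": 0.4}
p_knn = {"dog": 1.0}
mixed = interpolate(p_lm, p_knn)
print(mixed)
```

Because the datastore is consulted only at test time, the mixture can adapt to new domains without retraining the base LM.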
no code implementations • ICLR 2022 • Frank F. Xu, Junxian He, Graham Neubig, Vincent J. Hellendoorn
Structural locality is a ubiquitous feature of real-world datasets, wherein data points are organized into local hierarchies.
1 code implementation • ICLR 2021 • Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig
To show the unique benefits of modeling tree edits directly, we further propose a novel edit encoder for learning to represent edits, as well as an imitation learning method that allows the editor to be more robust.
no code implementations • 27 Jan 2021 • Frank F. Xu, Bogdan Vasilescu, Graham Neubig
A great part of software development involves conceptualizing or communicating the underlying procedures and logic that need to be expressed in programs.
Code Generation • Data Visualization • Software Engineering
1 code implementation • EMNLP (nlpbt) 2020 • Frank F. Xu, Lei Ji, Botian Shi, Junyi Du, Graham Neubig, Yonatan Bisk, Nan Duan
Instructional videos are often used to learn about procedures.
1 code implementation • 1 May 2020 • Yu Zhang, Yu Meng, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han
Then, based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity.
2 code implementations • ACL 2020 • Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
Open-domain code generation aims to generate code in a general-purpose programming language (such as Python) from natural language (NL) intents.
Ranked #3 on Code Generation on CoNaLa-Ext
1 code implementation • TACL 2020 • Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
Recent work has presented intriguing results examining the knowledge contained in language models (LM) by having the LM fill in the blanks of prompts such as "Obama is a _ by profession".
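The blank-filling probe can be sketched as ranking candidate fillers by a model's score for each completed prompt. Here the scorer is a hand-made lookup table standing in for a pretrained LM's probabilities; the function `fill_blank` and the toy scores are hypothetical, purely to show the shape of the probe.

```python
def fill_blank(prompt, candidates, score):
    """Probe factual knowledge: fill the blank in `prompt` with each
    candidate and return the one the scorer rates highest."""
    return max(candidates, key=lambda c: score(prompt.replace("_", c)))

# Stand-in scorer with invented scores; a real probe would query an LM
# for the probability of each completed sentence.
toy_scores = {
    "Obama is a politician by profession": 0.7,
    "Obama is a carpenter by profession": 0.1,
}
best = fill_blank("Obama is a _ by profession",
                  ["politician", "carpenter"],
                  lambda s: toy_scores.get(s, 0.0))
print(best)  # politician
```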
2 code implementations • 16 Oct 2019 • Yu Zhang, Frank F. Xu, Sha Li, Yu Meng, Xuan Wang, Qi Li, Jiawei Han
With the massive number of repositories available, there is a pressing need for topic-based search.
1 code implementation • 20 Aug 2019 • Anhong Guo, Junhan Kong, Michael Rivera, Frank F. Xu, Jeffrey P. Bigham
Second, using the state diagrams, StateLens automatically generates conversational agents that guide blind users through specifying the tasks the interface can perform, allowing the StateLens iOS application to provide interactive guidance and feedback so that blind users can access the interface.
no code implementations • ACL 2019 • Bill Yuchen Lin, Dong-Ho Lee, Frank F. Xu, Ouyu Lan, Xiang Ren
We introduce an open-source web-based data annotation framework (AlpacaTag) for sequence tagging tasks such as named-entity recognition (NER).
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shuai Lin, Wentao Wang, Zichao Yang, Xiaodan Liang, Frank F. Xu, Eric Xing, Zhiting Hu
That is, the model learns to imitate the writing style of any given exemplar sentence, with automatic adaptions to faithfully describe the content record.
1 code implementation • EMNLP 2018 • Zhiyi Luo, Shanshan Huang, Frank F. Xu, Bill Yuchen Lin, Hanyuan Shi, Kenny Zhu
Many existing systems for analyzing and summarizing customer reviews about products or services are based on a number of prominent review aspects.
no code implementations • ACL 2018 • Bill Yuchen Lin, Frank F. Xu, Kenny Zhu, Seung-won Hwang
Cross-cultural differences and similarities are common in cross-lingual natural language understanding, especially for research in social media.
1 code implementation • ACL 2018 • Frank F. Xu, Bill Yuchen Lin, Kenny Q. Zhu
The LocatedNear relation is a kind of commonsense knowledge describing two physical objects that are typically found near each other in real life.
2 code implementations • 30 Oct 2017 • Zeqiu Wu, Xiang Ren, Frank F. Xu, Ji Li, Jiawei Han
However, due to the incompleteness of knowledge bases and the context-agnostic labeling, the training data collected via distant supervision (DS) can be very noisy.
3 code implementations • 13 Sep 2017 • Liyuan Liu, Jingbo Shang, Frank F. Xu, Xiang Ren, Huan Gui, Jian Peng, Jiawei Han
In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task.
Ranked #13 on Part-Of-Speech Tagging on Penn Treebank