1 code implementation • 11 Dec 2023 • Zhengzhong Liu, Aurick Qiao, Willie Neiswanger, Hongyi Wang, Bowen Tan, Tianhua Tao, Junbo Li, Yuqi Wang, Suqi Sun, Omkar Pangarkar, Richard Fan, Yi Gu, Victor Miller, Yonghao Zhuang, Guowei He, Haonan Li, Fajri Koto, Liping Tang, Nikhil Ranjan, Zhiqiang Shen, Xuguang Ren, Roberto Iriondo, Cun Mu, Zhiting Hu, Mark Schulze, Preslav Nakov, Tim Baldwin, Eric P. Xing
The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers.
1 code implementation • 25 Oct 2023 • Bowen Tan, Yun Zhu, Lijuan Liu, Hongyi Wang, Yonghao Zhuang, Jindong Chen, Eric Xing, Zhiting Hu
In this work, we present RedCoast (Redco), a lightweight and user-friendly tool crafted to automate distributed training and inference for LLMs, as well as to simplify ML pipeline development.
no code implementations • 19 Sep 2023 • Zhiqiang Shen, Tianhua Tao, Liqun Ma, Willie Neiswanger, Zhengzhong Liu, Hongyi Wang, Bowen Tan, Joel Hestness, Natalia Vassilieva, Daria Soboleva, Eric Xing
This paper aims to understand the impacts of various data combinations (e.g., web text, Wikipedia, GitHub, books) on the training of large language models using SlimPajama.
1 code implementation • 28 Jun 2022 • Shibo Hao, Bowen Tan, Kaiwen Tang, Bin Ni, Xiyan Shao, Hengzhe Zhang, Eric P. Xing, Zhiting Hu
The resulting KGs as a symbolic interpretation of the source LMs also reveal new insights into the LMs' knowledge capacities.
no code implementations • 29 Sep 2021 • Han Guo, Bowen Tan, Zhengzhong Liu, Eric Xing, Zhiting Hu
We apply the approach to a wide range of text generation tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
1 code implementation • EMNLP 2021 • Mingkai Deng, Bowen Tan, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
Based on the nature of information change from input to output, we classify NLG tasks into compression (e.g., summarization), transduction (e.g., text rewriting), and creation (e.g., dialog).
1 code implementation • ACL 2021 • Meng Zhou, Zechen Li, Bowen Tan, Guangtao Zeng, Wenmian Yang, Xuehai He, Zeqian Ju, Subrato Chakravorty, Shu Chen, Xingyi Yang, Yichen Zhang, Qingyang Wu, Zhou Yu, Kun Xu, Eric Xing, Pengtao Xie
Training complex dialog generation models on small datasets carries a high risk of overfitting.
1 code implementation • 14 Jun 2021 • Han Guo, Bowen Tan, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
We apply the approach to a wide range of novel text generation tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
1 code implementation • EMNLP 2020 • Bowen Tan, Lianhui Qin, Eric P. Xing, Zhiting Hu
Given a document and a target aspect (e.g., a topic of interest), aspect-based abstractive summarization attempts to generate a summary with respect to the aspect.
1 code implementation • NAACL 2021 • Bowen Tan, Zichao Yang, Maruan Al-Shedivat, Eric P. Xing, Zhiting Hu
However, as our systematic examination reveals, it is still challenging for such models to generate coherent long passages of text (e.g., 1000 tokens), especially when the models are fine-tuned to the target domain on a small corpus.
1 code implementation • 3 Jun 2020 • Virapat Kieuvongngam, Bowen Tan, Yiming Niu
With the COVID-19 pandemic, there is a growing urgency for the medical community to keep up with the accelerating growth in the new coronavirus-related literature.
1 code implementation • 11 May 2020 • Wenmian Yang, Guangtao Zeng, Bowen Tan, Zeqian Ju, Subrato Chakravorty, Xuehai He, Shu Chen, Xingyi Yang, Qingyang Wu, Zhou Yu, Eric Xing, Pengtao Xie
On these two datasets, we train several dialogue generation models based on Transformer, GPT, and BERT-GPT.
no code implementations • 3 Apr 2020 • Lu Chen, Boer Lv, Chi Wang, Su Zhu, Bowen Tan, Kai Yu
For multi-domain DST, the data sparsity problem is also a major obstacle due to the increased number of state candidates.
Ranked #12 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1
2 code implementations • NeurIPS 2019 • Zhiting Hu, Bowen Tan, Ruslan Salakhutdinov, Tom Mitchell, Eric P. Xing
In this work, we propose a new method that supports learning different manipulation schemes with the same gradient-based algorithm.
no code implementations • 27 May 2019 • Lu Chen, Zhi Chen, Bowen Tan, Sishan Long, Milica Gasic, Kai Yu
Experiments show that AgentGraph models significantly outperform traditional reinforcement learning approaches on most of the 18 tasks of the PyDial benchmark.
no code implementations • 24 Nov 2018 • Bowen Tan, Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric Xing
Reinforcement learning, such as policy gradient, addresses the issue but can have prohibitively poor exploration efficiency.
4 code implementations • ACL 2019 • Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wangrong Zhu, Devendra Singh Sachan, Eric P. Xing
The versatile toolkit also fosters technique sharing across different text generation tasks.
no code implementations • COLING 2018 • Lu Chen, Bowen Tan, Sishan Long, Kai Yu
The proposed structured deep reinforcement learning is based on a graph neural network (GNN), which consists of several sub-networks, one for each node of a directed graph.
no code implementations • WS 2018 • Zhiting Hu, Zichao Yang, Tiancheng Zhao, Haoran Shi, Junxian He, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Lianhui Qin, Devendra Singh Chaplot, Bowen Tan, Xingjiang Yu, Eric Xing
The features make Texar particularly suitable for technique sharing and generalization across different text generation applications.
no code implementations • 13 Jan 2018 • Nayun Xu, Bowen Tan, Bingyu Kong
Supervised learning is widely used to train autonomous driving vehicles.