no code implementations • 22 Apr 2024 • Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahmoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, ZiYi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.
no code implementations • 30 Mar 2024 • Xiaoyang Lyu, Yang-tian Sun, Yi-Hua Huang, Xiuzhe Wu, ZiYi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi
In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS.
no code implementations • 12 Mar 2024 • Juan Manuel Zambrano Chaves, Shih-Cheng Huang, Yanbo Xu, Hanwen Xu, Naoto Usuyama, Sheng Zhang, Fei Wang, Yujia Xie, Mahmoud Khademi, ZiYi Yang, Hany Awadalla, Julia Gong, Houdong Hu, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Yu Gu, Cliff Wong, Mu Wei, Tristan Naumann, Muhao Chen, Matthew P. Lungren, Serena Yeung-Levy, Curtis P. Langlotz, Sheng Wang, Hoifung Poon
Frontier general-domain models such as GPT-4V still have significant performance gaps in multimodal biomedical applications.
1 code implementation • 25 Feb 2024 • Fanqi Wan, ZiYi Yang, Longguang Zhong, Xiaojun Quan, Xinting Huang, Wei Bi
Recently, FuseLLM introduced the concept of knowledge fusion to transfer the collective knowledge of multiple structurally varied LLMs into a target LLM through lightweight continual training.
no code implementations • 24 Feb 2024 • ZiYi Yang, Xinyu Gao, Yangtian Sun, Yihua Huang, Xiaoyang Lyu, Wen Zhou, Shaohui Jiao, Xiaojuan Qi, Xiaogang Jin
The recent advancements in 3D Gaussian splatting (3D-GS) have not only facilitated real-time rendering through modern GPU rasterization pipelines but have also attained state-of-the-art rendering quality.
1 code implementation • 4 Dec 2023 • Yi-Hua Huang, Yang-tian Sun, ZiYi Yang, Xiaoyang Lyu, Yan-Pei Cao, Xiaojuan Qi
During learning, the location and number of control points are adaptively adjusted to accommodate varying motion complexities in different regions, and an as-rigid-as-possible (ARAP) loss is developed to enforce spatial continuity and local rigidity of the learned motions.
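A minimal sketch of how such an as-rigid-as-possible regularizer can be written in PyTorch, assuming control-point positions and a precomputed canonical nearest-neighbor index (the function name and neighbor scheme are illustrative assumptions, not the paper's implementation):

    import torch

    def arap_loss(canonical_pts, deformed_pts, neighbor_idx):
        # canonical_pts: (N, 3) control-point positions in the canonical space
        # deformed_pts:  (N, 3) positions after the learned deformation
        # neighbor_idx:  (N, K) indices of each point's K nearest canonical neighbors
        canon_nbrs = canonical_pts[neighbor_idx]                        # (N, K, 3)
        deform_nbrs = deformed_pts[neighbor_idx]                        # (N, K, 3)
        d_canon = (canonical_pts[:, None] - canon_nbrs).norm(dim=-1)   # (N, K)
        d_deform = (deformed_pts[:, None] - deform_nbrs).norm(dim=-1)  # (N, K)
        # Penalizing changes in local pairwise distances encourages locally rigid,
        # spatially continuous motion.
        return ((d_canon - d_deform) ** 2).mean()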
no code implementations • 30 Nov 2023 • Zineng Tang, ZiYi Yang, Mahmoud Khademi, Yang Liu, Chenguang Zhu, Mohit Bansal
We present CoDi-2, a versatile and interactive Multimodal Large Language Model (MLLM) that can follow complex multimodal interleaved instructions, conduct in-context learning (ICL), reason, chat, edit, etc., in an any-to-any input-output modality paradigm.
1 code implementation • 19 Oct 2023 • ZiYi Yang, Yanzhen Chen, Xinyu Gao, Yazhen Yuan, Yu Wu, Xiaowei Zhou, Xiaogang Jin
Implicit neural representation has opened up new possibilities for inverse rendering.
no code implementations • 4 Oct 2023 • Tanmay Gautam, Reid Pryzant, ZiYi Yang, Chenguang Zhu, Somayeh Sojoudi
SCQ works like a differentiable convex optimization (DCO) layer: in the forward pass, we solve for the optimal convex combination of codebook vectors that quantize the inputs.
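In other words, the forward pass amounts to a simplex-constrained least-squares problem. A small illustrative sketch in Python/NumPy solves that convex program with projected gradient descent (the function names, step size, and iteration count are assumptions, not the authors' solver):

    import numpy as np

    def project_to_simplex(v):
        # Euclidean projection onto {w : w >= 0, sum(w) = 1} (Duchi et al., 2008).
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def soft_convex_quantize(x, codebook, n_steps=200, lr=0.05):
        # Solve  min_w ||codebook.T @ w - x||^2  s.t.  w >= 0, sum(w) = 1,
        # i.e. find the convex combination of codebook rows closest to x.
        K = codebook.shape[0]
        w = np.full(K, 1.0 / K)
        for _ in range(n_steps):
            grad = codebook @ (codebook.T @ w - x)
            w = project_to_simplex(w - lr * grad)
        return codebook.T @ w, w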
1 code implementation • 22 Sep 2023 • ZiYi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, Xiaogang Jin
Implicit neural representation has paved the way for new approaches to dynamic scene reconstruction and rendering.
no code implementations • 18 Sep 2023 • ZiYi Yang, Shreyas S. Raman, Ankit Shah, Stefanie Tellex
Recent advancements in large language models (LLMs) have enabled a new research domain, LLM agents, for solving robotics and planning tasks by leveraging the world knowledge and general reasoning abilities of LLMs obtained during pretraining.
no code implementations • 9 Aug 2023 • Xinyu Gao, ZiYi Yang, Yunlu Zhao, Yuxiang Sun, Xiaogang Jin, Changqing Zou
Our work mainly introduces a new surface representation, Neural Depth Fields (NeDF), that quickly determines the spatial relationship between objects by allowing direct intersection computation between rays and implicit surfaces.
1 code implementation • 8 Jun 2023 • Jiaxian Yan, Zhaofeng Ye, ZiYi Yang, Chengqiang Lu, Shengyu Zhang, Qi Liu, Jiezhong Qiu
By introducing multi-task pre-training that treats the prediction of different affinity labels as different tasks and classifies relative rankings between samples from the same bioassay, MBP learns robust and transferable structural knowledge from our new ChEMBL-Dock dataset with varied and noisy labels.
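The within-bioassay ranking objective mentioned above can be illustrated with a short PyTorch sketch (the margin value and function name are assumptions for illustration, not MBP's exact loss):

    import torch
    import torch.nn.functional as F

    def pairwise_ranking_loss(pred_i, pred_j, label_i, label_j, margin=0.1):
        # For two ligands measured in the same bioassay, the predicted affinities
        # should preserve the ordering of the (possibly noisy) measured labels.
        target = torch.sign(label_i - label_j)  # +1 if i should rank above j, -1 otherwise
        return F.margin_ranking_loss(pred_i, pred_j, target, margin=margin)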
no code implementations • 23 May 2023 • Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, ZiYi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang
Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities.
no code implementations • 21 May 2023 • ZiYi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence; however, the current Vision-Language-Speech landscape is dominated by encoder-only models that lack generative abilities.
1 code implementation • NeurIPS 2023 • Zineng Tang, ZiYi Yang, Chenguang Zhu, Michael Zeng, Mohit Bansal
We present Composable Diffusion (CoDi), a novel generative model capable of generating any combination of output modalities, such as language, image, video, or audio, from any combination of input modalities.
Ranked #7 on Audio Generation on AudioCaps
no code implementations • 13 Mar 2023 • Zirun Zhu, Hemin Yang, Min Tang, ZiYi Yang, Sefik Emre Eskimez, Huaming Wang
In this paper, we propose a low-latency real-time audio-visual end-to-end enhancement (AV-E3Net) model based on the recently proposed end-to-end enhancement network (E3Net).
no code implementations • 22 Feb 2023 • Jason Xinyu Liu, ZiYi Yang, Ifrah Idrees, Sam Liang, Benjamin Schornstein, Stefanie Tellex, Ankit Shah
We propose Lang2LTL, a modular system and a software package that leverages large language models (LLMs) to ground temporal navigational commands to LTL specifications in environments without prior language data.
no code implementations • 19 Dec 2022 • Soumya Sanyal, Yichong Xu, Shuohang Wang, ZiYi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren
Logical reasoning over text is an important ability that requires understanding the information present in the text and its interconnections, and then reasoning over them to infer new conclusions.
2 code implementations • CVPR 2023 • Zineng Tang, ZiYi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal
UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation.
Ranked #5 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)
1 code implementation • 17 Nov 2022 • Yulong Chen, Yang Liu, Ruochen Xu, ZiYi Yang, Chenguang Zhu, Michael Zeng, Yue Zhang
The high annotation costs and diverse demands of various summarization tasks motivate the development of few-shot summarization.
no code implementations • 15 Nov 2022 • Ziniu Hu, Yichong Xu, Wenhao Yu, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Kai-Wei Chang, Yizhou Sun
Answering open-domain questions requires world knowledge about in-context entities.
1 code implementation • 9 Nov 2022 • Yusen Zhang, Yang Liu, ZiYi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang
We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning.
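As a rough illustration of the soft prefix tuning side, a module that prepends a small number of trainable embeddings to a frozen summarizer's input embeddings might look like the following sketch (the class name and initialization scale are assumptions, not the paper's code):

    import torch
    import torch.nn as nn

    class SoftPrefix(nn.Module):
        # Learnable prefix embeddings prepended to the input; the backbone
        # summarizer stays frozen, so only these parameters are updated.
        def __init__(self, prefix_len, hidden_dim):
            super().__init__()
            self.prefix = nn.Parameter(torch.randn(prefix_len, hidden_dim) * 0.02)

        def forward(self, input_embeds):
            # input_embeds: (batch, seq_len, hidden_dim)
            batch = input_embeds.size(0)
            prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
            return torch.cat([prefix, input_embeds], dim=1)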
1 code implementation • 23 Oct 2022 • Vin Sachidananda, ZiYi Yang, Chenguang Zhu
Contrastive Learning has recently achieved state-of-the-art performance in a wide range of tasks.
1 code implementation • 22 May 2022 • Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji
The goal of this work is to build flexible video-language models that can generalize to various video-to-text tasks from few examples, such as domain-specific captioning, question answering, and future event prediction.
2 code implementations • 19 May 2022 • Lixue Cheng, ZiYi Yang, ChangYu Hsieh, Benben Liao, Shengyu Zhang
Directed evolution is a versatile technique in protein engineering that mimics the process of natural selection by iteratively alternating between mutagenesis and screening in order to search for sequences that optimize a given property of interest, such as catalytic activity and binding affinity to a specified target.
1 code implementation • 18 May 2022 • Reid Pryzant, ZiYi Yang, Yichong Xu, Chenguang Zhu, Michael Zeng
Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.
no code implementations • 3 May 2022 • ZiYi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview.
no code implementations • 20 Jan 2022 • Haojie Huang, ZiYi Yang, Robert Platt
Shape completion, the problem of inferring the complete geometry of an object given a partial point cloud, is an important problem in robotics and computer vision.
no code implementations • 15 Nov 2021 • ZiYi Yang, Zhaofeng Ye, Yijia Xiao, ChangYu Hsieh, Shengyu Zhang
Drug resistance is a major threat to global health and a significant concern throughout the clinical treatment of diseases and drug development.
no code implementations • 9 Oct 2021 • Seth Pate, Wei Xu, ZiYi Yang, Maxwell Love, Siddarth Ganguri, Lawson L. S. Wong
To enable robots to instruct humans in collaborations, we identify several aspects of language processing that are not commonly studied in this context.
1 code implementation • EMNLP 2021 • ZiYi Yang, Yinfei Yang, Daniel Cer, Eric Darve
A simple but highly effective method, "Language Information Removal (LIR)", factors out language identity information from semantics-related components in multilingual representations pre-trained on multi-monolingual data.
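A generic sketch of this factoring-out idea, assuming per-language mean removal followed by projecting out a few principal directions spanned by the language means (the exact steps and the number of removed components are assumptions, not necessarily LIR as published):

    import numpy as np

    def remove_language_identity(embs, lang_ids, n_components=1):
        # embs: (N, D) multilingual sentence embeddings; lang_ids: length-N labels.
        embs = np.asarray(embs, dtype=np.float64)
        out = embs.copy()
        means = {}
        for lang in set(lang_ids):
            mask = np.array([l == lang for l in lang_ids])
            means[lang] = embs[mask].mean(axis=0)
            out[mask] -= means[lang]          # remove the language-specific mean
        if len(means) > 1:
            # Directions spanned by the language means carry language-identity
            # signal; project them out of every embedding.
            M = np.stack(list(means.values()))
            M -= M.mean(axis=0)
            _, _, vt = np.linalg.svd(M, full_matrices=False)
            directions = vt[:n_components]    # (n_components, D)
            out -= (out @ directions.T) @ directions
        return out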
no code implementations • EMNLP 2021 • ZiYi Yang, Yinfei Yang, Daniel Cer, Jax Law, Eric Darve
This paper presents a novel training method, Conditional Masked Language Modeling (CMLM), to effectively learn sentence representations on large scale unlabeled corpora.
no code implementations • 11 Oct 2020 • Qi Zhang, Yilin Chen, ZiYi Yang, Eric Darve
We propose a novel method "multi-constitutive neural network" (MCNN) such that one model can solve several different constitutive laws.
no code implementations • 2 Sep 2020 • Ziyi Yang, Jun Shu, Yong Liang, Deyu Meng, Zongben Xu
Machine learning has made great progress in computer vision and many other fields, owing to the large amounts of high-quality training samples available, but it does not work well on genomic data analysis, since genomic datasets are notoriously small.
no code implementations • 5 Jun 2020 • Ziyi Yang, Iman Soltani Bozchalooi, Eric Darve
We study the problem of semi-supervised anomaly detection with domain adaptation.
no code implementations • ICLR 2021 • Vin Sachidananda, ZiYi Yang, Chenguang Zhu
Due to widespread interest in machine translation and transfer learning, there are numerous algorithms for mapping multiple embeddings to a shared representation space.
no code implementations • 7 Feb 2020 • Ziyi Yang, Teng Zhang, Iman Soltani Bozchalooi, Eric Darve
Decoded memory units in MEMGAN are more interpretable and disentangled than those of previous methods, which further demonstrates the effectiveness of the memory mechanism.
no code implementations • 18 Jan 2020 • Ziyi Yang, Iman Soltani Bozchalooi, Eric Darve
In this paper, we investigate algorithms for anomaly detection.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Ziyi Yang, Chenguang Zhu, Robert Gmyr, Michael Zeng, Xuedong Huang, Eric Darve
Text summarization aims to extract essential information from a piece of text and transform the text into a concise version.
no code implementations • 25 Dec 2019 • Chenguang Zhu, Ziyi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang
A typical journalistic convention in news articles is to deliver the most salient information in the beginning, also known as the lead bias.
no code implementations • 25 Sep 2019 • Chenguang Zhu, ZiYi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang
For example, the pretrained model without finetuning outperforms the pointer-generator network on the CNN/DailyMail dataset.
1 code implementation • ACL 2019 • Ziyi Yang, Chenguang Zhu, Vin Sachidananda, Eric Darve
In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph.
1 code implementation • ICLR 2019 • Ziyi Yang, Chenguang Zhu, Weizhu Chen
We model the semantic meaning of a word in a sentence based on two aspects.
1 code implementation • IJCNLP 2019 • Ziyi Yang, Chenguang Zhu, Weizhu Chen
Inspired by the Gram-Schmidt Process in geometric theory, we build an orthogonal basis of the subspace spanned by a word and its surrounding context in a sentence.
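One plausible reading of that construction, sketched in NumPy: build an orthonormal basis for the context subspace (QR factorization plays the role of Gram-Schmidt) and measure how much of the target word's vector falls outside it (the scoring function here is illustrative, not the paper's full algorithm):

    import numpy as np

    def novelty_score(word_vec, context_vecs):
        # word_vec: (D,) embedding of the target word.
        # context_vecs: (D, m) embeddings of the m surrounding context words, as columns.
        q, _ = np.linalg.qr(context_vecs)     # orthonormal basis of the context subspace
        projection = q @ (q.T @ word_vec)     # component lying inside the context span
        residual = word_vec - projection      # the "novel" component of the word
        return np.linalg.norm(residual) / (np.linalg.norm(word_vec) + 1e-12)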
no code implementations • 26 Feb 2018 • XingYu Fu, ZiYi Yang, XiuWen Duan
To model the randomness of language spreading, we propose the Batch Markov Monte Carlo Simulation with Migration (BMMCSM) algorithm, in which each agent is treated as a language stack.