1 code implementation • 20 Mar 2024 • Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, Siqi Sun, Jianxin Lin, Longyue Wang, Xiangxiang Zeng
While various models and computational tools have been proposed for structure and property analysis of molecules, generating molecules that satisfy all desired structures and properties remains a challenge.
1 code implementation • 19 Dec 2023 • Linglin Jing, Sheng Xu, Yifan Wang, Yuzhe Zhou, Tao Shen, Zhigang Ji, Hui Fang, Zhen Li, Siqi Sun
Accurate identification of protein nucleic-acid-binding residues poses a significant challenge with important implications for various biological processes and drug design.
1 code implementation • 18 Dec 2023 • Zhi Jin, Sheng Xu, Xiang Zhang, Tianze Ling, Nanqing Dong, Wanli Ouyang, Zhiqiang Gao, Cheng Chang, Siqi Sun
De novo peptide sequencing from mass spectrometry (MS) data is a critical task in proteomics research.
1 code implementation • 31 Aug 2023 • Hongtai Jing, Zhengtao Gao, Sheng Xu, Tao Shen, Zhangzhi Peng, Shwai He, Tao You, Shuang Ye, Wei Lin, Siqi Sun
Remarkably, BALMFold outperforms well-established methods such as AlphaFold2, IgFold, ESMFold, and OmegaFold on the antibody benchmark, demonstrating significant potential to advance antibody engineering and streamline therapeutic antibody development by reducing the need for unnecessary trials.
2 code implementations • 2 Jun 2023 • Le Zhang, Jiayang Chen, Tao Shen, Yu Li, Siqi Sun
The field of protein folding research has been greatly advanced by deep learning methods, with AlphaFold2 (AF2) demonstrating exceptional performance and atomic-level precision.
no code implementations • 15 May 2023 • Zhongju Yuan, Tao Shen, Sheng Xu, Leiye Yu, Ruobing Ren, Siqi Sun
Deep learning-based approaches, such as AlphaFold2 (AF2), have significantly advanced protein tertiary structure prediction, achieving results comparable to experimental methods.
no code implementations • 4 Jul 2022 • Tao Shen, Zhihang Hu, Zhangzhi Peng, Jiayang Chen, Peng Xiong, Liang Hong, Liangzhen Zheng, YiXuan Wang, Irwin King, Sheng Wang, Siqi Sun, Yu Li
When E2Efold-3D is coupled with experimental techniques, the RNA structure prediction field can be greatly advanced.
1 code implementation • 1 Apr 2022 • Jiayang Chen, Zhihang Hu, Siqi Sun, Qingxiong Tan, YiXuan Wang, Qinze Yu, Licheng Zong, Liang Hong, Jin Xiao, Tao Shen, Irwin King, Yu Li
Non-coding RNA structure and function are essential to understanding various biological processes, such as cell signaling, gene expression, and post-transcriptional regulation.
1 code implementation • ACL 2022 • Shuohang Wang, Yichong Xu, Yuwei Fang, Yang Liu, Siqi Sun, Ruochen Xu, Chenguang Zhu, Michael Zeng
Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks.
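As described, REINA augments each input simply by retrieving related examples from the training data itself and appending them as extra context. A minimal sketch of that idea follows; the BM25 retriever, the toy training pairs, and the concatenation format are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch of retrieval from the training data (REINA-style augmentation).
# The BM25 index, toy data, and "[SEP]" concatenation are illustrative assumptions.
from rank_bm25 import BM25Okapi

train_inputs = [
    "the company reported strong quarterly earnings growth",
    "the new smartphone features a larger battery and screen",
    "the city council approved the updated transit budget",
]
train_targets = ["earnings summary", "phone review", "council report"]

bm25 = BM25Okapi([x.split() for x in train_inputs])

def augment(query: str, k: int = 2) -> str:
    """Append the top-k retrieved (input => target) training pairs to the query."""
    idx = bm25.get_top_n(query.split(), list(range(len(train_inputs))), n=k)
    retrieved = " ".join(f"{train_inputs[i]} => {train_targets[i]}" for i in idx)
    return f"{query} [SEP] {retrieved}"

print(augment("a report on annual earnings"))  # augmented input fed to the NLG/NLU model
```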
2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang
In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.
Ranked #1 on Common Sense Reasoning on CommonsenseQA (using extra training data)
no code implementations • Findings (ACL) 2022 • Yuwei Fang, Shuohang Wang, Yichong Xu, Ruochen Xu, Siqi Sun, Chenguang Zhu, Michael Zeng
Then we utilize a diverse set of four English knowledge sources to provide more comprehensive coverage of knowledge in different formats.
1 code implementation • 14 May 2021 • Yizhe Zhang, Siqi Sun, Xiang Gao, Yuwei Fang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan
We propose a framework that alleviates this data constraint by jointly training a grounded generator and document retriever on the language model signal.
2 code implementations • NAACL 2021 • Siqi Sun, Yen-Chun Chen, Linjie Li, Shuohang Wang, Yuwei Fang, Jingjing Liu
Multimodal pre-training has propelled great advancement in vision-and-language research.
no code implementations • Findings (ACL) 2021 • Shuohang Wang, Luowei Zhou, Zhe Gan, Yen-Chun Chen, Yuwei Fang, Siqi Sun, Yu Cheng, Jingjing Liu
Transformer has become ubiquitous in the deep learning field.
1 code implementation • EMNLP 2020 • Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jing Jiang, Jingjing Liu
In this paper, we propose Cross-Thought, a novel approach to pre-training sequence encoders, which is instrumental in building reusable sequence embeddings for large-scale NLP tasks such as question answering.
1 code implementation • EMNLP 2020 • Siqi Sun, Zhe Gan, Yu Cheng, Yuwei Fang, Shuohang Wang, Jingjing Liu
Existing language model compression methods mostly use a simple L2 loss to distill knowledge from the intermediate representations of a large BERT model into a smaller one.
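For reference, the L2 (mean-squared-error) distillation on intermediate representations that this sentence describes can be sketched as follows; the layer-mapping scheme, hidden sizes, and projection layer are illustrative assumptions.

```python
# Sketch of L2 distillation on intermediate representations (the baseline the
# abstract refers to). Layer mapping, widths, and the projection are assumptions.
import torch
import torch.nn as nn

def l2_distill_loss(student_hiddens, teacher_hiddens, proj):
    """MSE between projected student hidden states and matched teacher hidden states."""
    step = len(teacher_hiddens) // len(student_hiddens)  # uniform layer mapping
    loss = 0.0
    for i, h_s in enumerate(student_hiddens):
        h_t = teacher_hiddens[(i + 1) * step - 1]
        loss = loss + nn.functional.mse_loss(proj(h_s), h_t)
    return loss / len(student_hiddens)

# Toy usage: a 4-layer student distilled from a 12-layer teacher.
proj = nn.Linear(312, 768)                             # match student width to teacher width
student = [torch.randn(2, 16, 312) for _ in range(4)]  # student hidden states
teacher = [torch.randn(2, 16, 768) for _ in range(12)] # teacher hidden states
print(l2_distill_loss(student, teacher, proj))
```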
no code implementations • 13 Sep 2020 • Shuohang Wang, Luowei Zhou, Zhe Gan, Yen-Chun Chen, Yuwei Fang, Siqi Sun, Yu Cheng, Jingjing Liu
Transformer has become ubiquitous in the deep learning field.
Ranked #1 on Open-Domain Question Answering on SearchQA
1 code implementation • 10 Sep 2020 • Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
Ranked #18 on Zero-Shot Cross-Lingual Transfer on XTREME
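The inference setup described in the entry above, scoring the target-language input jointly with its source-language translation, can be sketched roughly as follows; the model choice, the simple pair encoding, and the label count are illustrative assumptions rather than the paper's exact architecture.

```python
# Rough sketch: classify a target-language input jointly with its English translation.
# Model, pair encoding, and num_labels are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=3)
model.eval()

target_text = "Das Wetter ist heute schön."        # target-language input
source_translation = "The weather is nice today."  # its translation in the source language

inputs = tok(target_text, source_translation, return_tensors="pt")  # encode as a pair
with torch.no_grad():
    prediction = model(**inputs).logits.argmax(dim=-1)
print(prediction)
```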
no code implementations • 10 Sep 2020 • Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu, Chenguang Zhu
Although deep neural networks have achieved tremendous success for question answering (QA), they still suffer from heavy computational and energy costs in real product deployment.
no code implementations • 31 Aug 2020 • Siqi Sun
Protein contacts provide key information for the understanding of protein structure and function, and therefore contact prediction from sequences is an important problem.
1 code implementation • ACL 2020 • Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan
We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer).
1 code implementation • EMNLP 2020 • Yuwei Fang, Siqi Sun, Zhe Gan, Rohit Pillai, Shuohang Wang, Jingjing Liu
In this paper, we present Hierarchical Graph Network (HGN) for multi-hop question answering.
Ranked #32 on Question Answering on HotpotQA
6 code implementations • 1 Nov 2019 • Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan
We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer).
2 code implementations • ICLR 2020 • Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu
Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models.
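The "maximal risk for label-preserving input perturbations" referred to above is the standard min-max adversarial training objective, which can be written as follows (notation is ours; for language models the perturbation is typically applied to word embeddings):

```latex
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\Big[ \max_{\lVert \delta \rVert \le \epsilon}
      \mathcal{L}\big( f_{\theta}(x + \delta), \, y \big) \Big]
```

Here the perturbation \(\delta\) is bounded by \(\epsilon\) so that the label \(y\) is preserved, and the outer minimization trains the parameters \(\theta\) against that worst case.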
3 code implementations • IJCNLP 2019 • Siqi Sun, Yu Cheng, Zhe Gan, Jingjing Liu
Pre-trained language models such as BERT have proven to be highly effective for natural language processing (NLP) tasks.
2 code implementations • 29 Jul 2018 • Xiujun Li, Yu Wang, Siqi Sun, Sarah Panda, Jingjing Liu, Jianfeng Gao
This proposal introduces a Dialogue Challenge for building end-to-end task-completion dialogue systems, with the goal of encouraging the dialogue research community to collaborate and benchmark on standard datasets in a unified experimental environment.
1 code implementation • 2 Sep 2016 • Sheng Wang, Siqi Sun, Zhen Li, Renyu Zhang, Jinbo Xu
Using our predicted contacts as restraints, we can (ab initio) fold 208 of the 398 membrane proteins with TMscore > 0.5.
no code implementations • 9 Feb 2016 • Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, Siqi Sun
Graphical models are a popular approach to modeling structured data but they are unsuitable for high-cardinality variables.
no code implementations • NeurIPS 2015 • Siqi Sun, Mladen Kolar, Jinbo Xu
Learning the structure of a probabilistic graphical model is a well-studied problem in the machine learning community due to its importance in many applications.
no code implementations • 17 Nov 2015 • Sheng Wang, Siqi Sun, Jinbo Xu
Our experimental results confirm that maximum-AUC training greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction, whose label distributions are highly imbalanced, while performing comparably to them on solvent accessibility prediction, which has three equally distributed labels.
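Since AUC depends only on how positives are ranked relative to negatives, it is largely insensitive to class imbalance, which is why maximizing it helps on the imbalanced tasks above. A common pairwise surrogate for AUC maximization is sketched below (a binary illustration under our own assumptions, not necessarily the paper's exact objective).

```python
# Pairwise logistic surrogate for AUC maximization (illustrative, binary case).
import torch

def pairwise_auc_loss(scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Logistic loss over all (positive, negative) score differences."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos.unsqueeze(1) - neg.unsqueeze(0)          # every positive vs. every negative
    return torch.nn.functional.softplus(-diff).mean()   # log(1 + exp(-(s_pos - s_neg)))

scores = torch.tensor([2.0, 0.5, -1.0, 0.0])
labels = torch.tensor([1, 0, 0, 1])
print(pairwise_auc_loss(scores, labels))
```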
no code implementations • 12 Nov 2015 • Yuancheng Zhu, Zhe Liu, Siqi Sun
We present a framework for incorporating prior information into nonparametric estimation of graphical models.
no code implementations • 7 Mar 2015 • Qingming Tang, Siqi Sun, Jinbo Xu
Learning the network structure underlying data is an important problem in machine learning.