1 code implementation • COLING 2022 • Yiming Wang, Qianren Mao, Junnan Liu, Weifeng Jiang, Hongdong Zhu, JianXin Li
Labeling large amounts of extractive summarization data is often prohibitively expensive due to time, financial, and expertise constraints, which poses great challenges to incorporating summarization systems into practical applications.
no code implementations • 18 Apr 2024 • Qian Li, Cheng Ji, Shu Guo, Yong Zhao, Qianren Mao, Shangguang Wang, Yuntao Wei, JianXin Li
Existing methods are limited by their neglect of the multiple entity pairs in one sentence that share very similar contextual information (i.e., the same text and image), resulting in increased difficulty in the MMRE task.
no code implementations • 22 Feb 2024 • Qi Hu, Weifeng Jiang, Haoran Li, ZiHao Wang, Jiaxin Bai, Qianren Mao, Yangqiu Song, Lixin Fan, JianXin Li
An entity can be involved in multiple knowledge graphs (KGs), and reasoning over multiple KGs to answer complex queries is important for discovering knowledge across graphs.
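As a toy illustration of why multi-source reasoning matters (a sketch, not the paper's method), a conjunctive query below is only answerable after merging edge sets from two hypothetical KGs; all entity and relation names are made up:

```python
# Toy sketch: answering a conjunctive query over the union of two source KGs.
from collections import defaultdict

def build_index(triples):
    """Map (head, relation) -> set of tails."""
    index = defaultdict(set)
    for h, r, t in triples:
        index[(h, r)].add(t)
    return index

# Two hypothetical source KGs that share the entity "alan_turing".
kg1 = [("alan_turing", "born_in", "london"), ("london", "located_in", "uk")]
kg2 = [("alan_turing", "field", "computer_science")]

index = build_index(kg1 + kg2)  # multi-source reasoning via a merged view

# Query: ?x such that (alan_turing, born_in, ?y) and (?y, located_in, ?x)
answers = set()
for y in index[("alan_turing", "born_in")]:
    answers |= index[(y, "located_in")]
print(answers)  # {'uk'} -- unanswerable from either KG alone
```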
1 code implementation • 29 Oct 2023 • Qianren Mao, Shaobo Zhao, Jiarui Li, Xiaolei Gu, Shizhu He, Bo Li, JianXin Li
Pre-trained sentence representations are crucial for identifying significant sentences in unsupervised document extractive summarization.
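For a sense of how pre-trained sentence representations drive unsupervised extraction, here is a common centrality baseline (a sketch, not this paper's method); the `sentence-transformers` package and the encoder name are assumptions:

```python
# Centrality baseline sketch: rank sentences by cosine similarity
# to the document's mean embedding and extract the top-k.
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed available

def extract_summary(sentences, k=2):
    model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works
    emb = model.encode(sentences)                    # (n_sents, dim)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    centroid = emb.mean(axis=0)
    scores = emb @ centroid                          # cosine centrality
    top = sorted(np.argsort(scores)[-k:])            # keep original sentence order
    return [sentences[i] for i in top]
```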
1 code implementation • 10 Sep 2023 • Zhijun Chen, Hailong Sun, Wanhao Zhang, Chunyi Xu, Qianren Mao, Pengpeng Chen
In Neural-Hidden-CRF, we can capitalize on powerful deep models such as BERT to provide rich contextual semantic knowledge to the latent ground-truth sequence, and use the hidden CRF layer to capture internal label dependencies.
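A minimal sketch of the BERT-plus-CRF backbone this abstract describes, assuming the `transformers` and `pytorch-crf` packages; the full Neural-Hidden-CRF additionally treats the ground-truth sequence as hidden behind weak labels, which this sketch does not model:

```python
# Standard BERT+CRF sequence tagger (backbone sketch only).
import torch.nn as nn
from transformers import AutoModel
from torchcrf import CRF

class BertCrfTagger(nn.Module):
    def __init__(self, num_tags, encoder_name="bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.emission = nn.Linear(self.encoder.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)  # captures label dependencies

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.emission(hidden)
        mask = attention_mask.bool()
        if tags is not None:  # training: negative log-likelihood under the CRF
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)  # inference: best tag paths
```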
2 code implementations • 20 May 2023 • Weifeng Jiang, Qianren Mao, Chenghua Lin, JianXin Li, Ting Deng, Weiyi Yang, Zheng Wang
Many text mining models are constructed by fine-tuning a large pre-trained language model (PLM) on downstream tasks.
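As a generic illustration of this fine-tuning recipe (not tied to any one paper above), here is a minimal sketch with Hugging Face `transformers`, using IMDB as a stand-in classification dataset:

```python
# Minimal PLM fine-tuning sketch for sequence classification.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Tokenize a stand-in downstream dataset.
data = load_dataset("imdb").map(
    lambda x: tok(x["text"], truncation=True, padding="max_length", max_length=256),
    batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=data["train"].shuffle(seed=0).select(range(2000)),  # small demo subset
)
trainer.train()
```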
1 code implementation • 6 Jun 2021 • Qianren Mao, Xi Li, Bang Liu, Shu Guo, Peng Hao, JianXin Li, Lihong Wang
These tokens or phrases may originate from fragmentary textual pieces (e.g., segments) in the original text and end up separated across different segments.
no code implementations • 29 May 2021 • Qianren Mao, Jiazheng Wang, Zheng Wang, Xi Li, Bo Li, JianXin Li
We meticulously analyze the corpus using well-known metrics, focusing on the style of the summaries and the complexity of the summarization task.
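Two widely used corpus statistics of this kind are the compression ratio and the novel n-gram rate; the sketch below is illustrative only and does not reproduce the paper's exact metric set:

```python
# Common summarization corpus statistics (illustrative).
def compression_ratio(document, summary):
    """How much shorter the summary is than its source (in words)."""
    return len(document.split()) / max(1, len(summary.split()))

def novel_ngram_rate(document, summary, n=2):
    """Fraction of summary n-grams absent from the source;
    higher values indicate a more abstractive style."""
    def ngrams(tokens, n):
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
    doc_ng, sum_ng = ngrams(document.split(), n), ngrams(summary.split(), n)
    return len(sum_ng - doc_ng) / max(1, len(sum_ng))
```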
no code implementations • 28 May 2021 • Junnan Liu, Qianren Mao, Bang Liu, Hao Peng, Hongdong Zhu, JianXin Li
In this paper, we argue that this limitation can be overcome by a semi-supervised approach: consistency training, which leverages large amounts of unlabeled data to improve the performance of supervised learning over a small corpus.
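A minimal sketch of a consistency-training objective in PyTorch, in the spirit the abstract describes (a generic recipe, not the paper's exact loss; `model` and `augment` are placeholders):

```python
# Consistency training sketch: supervised loss on labeled data plus a
# KL consistency loss between predictions on unlabeled data and an
# augmented (e.g., noised or paraphrased) view of the same data.
import torch
import torch.nn.functional as F

def training_step(model, labeled_x, labels, unlabeled_x, augment, lam=1.0):
    # Supervised loss on the small labeled corpus.
    sup_loss = F.cross_entropy(model(labeled_x), labels)

    # Fixed "teacher" predictions on the clean unlabeled batch.
    with torch.no_grad():
        target = F.softmax(model(unlabeled_x), dim=-1)

    # "Student" predictions on the augmented view should match the teacher.
    student = F.log_softmax(model(augment(unlabeled_x)), dim=-1)
    cons_loss = F.kl_div(student, target, reduction="batchmean")

    return sup_loss + lam * cons_loss
```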