1 code implementation • Findings (NAACL) 2022 • Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung
In this work, we propose Patient and Confident Early Exiting BERT (PCEE-BERT), an off-the-shelf sample-dependent early exiting method that can work with different PLMs and can also work along with popular model compression methods.
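The abstract does not spell out the exit rule, but the "patient and confident" idea can be illustrated with a toy sketch: exit at the first layer whose prediction confidence has stayed above a threshold for several consecutive exits. All names and the threshold/patience values here are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def pcee_exit_layer(exit_probs, confidence_threshold=0.9, patience=2):
    """Toy patient-and-confident early-exit rule: stop at the first layer
    whose prediction confidence has exceeded the threshold for `patience`
    consecutive exits; otherwise fall through to the last classifier.

    exit_probs: list of per-exit class-probability vectors, one per layer.
    """
    streak = 0
    for layer, probs in enumerate(exit_probs):
        if np.max(probs) >= confidence_threshold:
            streak += 1          # another confident exit in a row
        else:
            streak = 0           # confidence dipped: reset the patience counter
        if streak >= patience:
            return layer         # exit early at this layer
    return len(exit_probs) - 1   # no early exit: use the final layer
```

Because the decision depends on the sample's own layer-wise confidence trajectory, different inputs exit at different depths, which is what makes the method sample-dependent.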
no code implementations • EMNLP 2021 • Wei Zhu, Xiaoling Wang, Yuan Ni, Guotong Xie
From this observation, we use mutual learning to improve BERT's early exiting performance; that is, we ask the exits of a multi-exit BERT to distill knowledge from one another.
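As a rough sketch of mutual learning between exits, one can penalize disagreement between the predictive distributions of the different classifiers, e.g. with an average pairwise KL divergence. This is an illustrative loss under assumed names, not the paper's exact objective:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

def mutual_distillation_loss(exit_logits, temperature=2.0):
    """Average pairwise KL divergence between the predictive distributions
    of a multi-exit model's classifiers, so that every exit also learns
    from every other exit (a sketch, not the paper's exact loss)."""
    dists = [softmax(l, temperature) for l in exit_logits]
    total, pairs = 0.0, 0
    for i, p in enumerate(dists):
        for j, q in enumerate(dists):
            if i != j:
                total += np.sum(p * np.log(p / q))  # KL(p || q)
                pairs += 1
    return total / pairs
```

Minimizing this term alongside each exit's task loss pulls the shallow and deep classifiers toward agreement.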
no code implementations • NAACL (BioNLP) 2021 • Wei Zhu, Yilong He, Ling Chai, Yunxiao Fan, Yuan Ni, Guotong Xie, Xiaoling Wang
First, a RoBERTa model is applied to give a local ranking of the candidate sentences.
no code implementations • COLING 2022 • Yuhui Zuo, Wei Zhu, Guoyong Cai
Since open social platforms allow for a large and continuous flow of unverified information, rumors can emerge unexpectedly and spread quickly.
no code implementations • 24 Mar 2024 • Zequan Liu, Jiawen Lyn, Wei Zhu, Xing Tian, Yvette Graham
Parameter-efficient fine-tuning (PEFT) is widely studied for its effectiveness and efficiency in the era of large language models.
1 code implementation • 4 Jan 2024 • Wei Zhu, Wenfeng Li, Xing Tian, Pengfei Wang, Xiaoling Wang, Jin Chen, Yuanbin Wu, Yuan Ni, Guotong Xie
In this work, we propose a novel task, Text2MDT, to explore the automatic extraction of MDTs from medical texts such as medical guidelines and textbooks.
1 code implementation • 1 Jan 2024 • Jiayou Chao, Wei Zhu
Recent advancements in deep neural networks have markedly enhanced the performance of computer vision tasks, yet the specialized nature of these networks often necessitates extensive data and high computational power.
1 code implementation • 29 Dec 2023 • Wei Zhu, Xiaoling Wang, Mosha Chen, Buzhou Tang
Many teams from both industry and academia participated in the shared tasks, and the top teams achieved strong test results.
no code implementations • 19 Dec 2023 • Hui Wu, Yi Gan, Feng Yuan, Jing Ma, Wei Zhu, Yutao Xu, Hong Zhu, Yuhua Zhu, Xiaoli Liu, Jinghui Gu
A customized Scaled-Dot-Product-Attention kernel is designed to match our fusion policy based on the segment KV cache solution.
no code implementations • 31 Oct 2023 • Wei Zhu, Ming Tan
Prompt tuning prepends a soft prompt to the input embeddings or hidden states and only optimizes the prompt to adapt pretrained models (PTMs) to downstream tasks.
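The prepending step described here is simple enough to show in miniature: a small trainable prompt matrix is concatenated in front of the frozen token embeddings, and only that matrix would receive gradient updates. Shapes and names below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def prepend_soft_prompt(input_embeds, prompt):
    """Prompt tuning in miniature: a trainable prompt matrix is prepended
    to the (frozen) token embeddings; in training, only `prompt` would be
    optimized while the PTM's parameters stay fixed."""
    return np.concatenate([prompt, input_embeds], axis=0)

hidden_dim = 8
prompt_len = 4
soft_prompt = rng.normal(size=(prompt_len, hidden_dim))   # the only trainable tensor
token_embeds = rng.normal(size=(10, hidden_dim))          # frozen PTM embeddings
extended = prepend_soft_prompt(token_embeds, soft_prompt)  # shape (14, 8)
```

The extended sequence is then fed through the pretrained model unchanged, which is why prompt tuning is so parameter-efficient: the trainable state is just `prompt_len × hidden_dim` numbers.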
1 code implementation • 22 Oct 2023 • Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
Biomedical language understanding benchmarks are the driving forces for artificial intelligence applications with large language model (LLM) back-ends.
2 code implementations • 2 Oct 2023 • Ganqu Cui, Lifan Yuan, Ning Ding, Guanming Yao, Wei Zhu, Yuan Ni, Guotong Xie, Zhiyuan Liu, Maosong Sun
However, the scarcity of diverse, naturalistic datasets of human preferences on LLM outputs at scale poses a great challenge to RLHF as well as feedback learning research within the open-source community.
no code implementations • 29 May 2023 • Yi Huang, Wei Zhu, Duan Li, Shushang Zhu, Shikun Wang
Following the idea of Bayesian learning via Gaussian mixture model, we organically combine the backward-looking information contained in the historical data and the forward-looking information implied by the market portfolio, which is affected by heterogeneous expectations and noisy trading behavior.
no code implementations • 22 May 2023 • Ziyu Chen, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu
Group-invariant generative adversarial networks (GANs) are a type of GANs in which the generators and discriminators are hardwired with group symmetries.
no code implementations • 21 May 2023 • Xiangxiang Gao, Wei Zhu, Jiasheng Gao, Congrui Yin
Computational complexity and overthinking problems have become bottlenecks for pre-trained language models (PLMs) with millions or even trillions of parameters.
1 code implementation • 7 May 2023 • Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu
To train UDR, we cast various tasks' training signals into a unified list-wise ranking formulation using feedback from the language model.
no code implementations • 15 Mar 2023 • Wei Zhu, Runtao Zhou, Yao Yuan, Campbell Timothy, Rajat Jain, Jiebo Luo
However, the shortage of annotated training data poses a severe problem in improving the performance and generalization ability of the trained model.
no code implementations • 3 Feb 2023 • Ziyu Chen, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu
We rigorously quantify the improvement in the sample complexity of variational divergence estimations for group-invariant distributions.
1 code implementation • 27 Jan 2023 • Huanran Zheng, Wei Zhu, Pengfei Wang, Xiaoling Wang
In this paper, we propose a simple but effective method called "Candidate Soups," which can obtain high-quality translations while maintaining the inference speed of NAT models.
no code implementations • 9 Aug 2022 • Yisong Yu, Beihong Jin, Jiageng Song, Beibei Li, Yiyuan Zheng, Wei Zhu
Although micro-video recommendation can naturally be treated as sequential recommendation, previous sequential recommendation models do not fully consider the characteristics of micro-video apps, and in their inductive biases the role of positions does not accord with reality in the micro-video scenario.
no code implementations • 26 Jul 2022 • Weijian Li, Wei Zhu, E. Ray Dorsey, Jiebo Luo
Medication for neurological diseases such as Parkinson's disease usually takes place remotely, away from hospitals.
1 code implementation • 24 Jul 2022 • Yong Huang, Aderon Huang, Wei Zhu, Yanming Fang, Jinghua Feng
Then, in order to take full advantage of unlabeled datasets, we jointly train with self-supervised and supervised learning to provide a pre-trained model.
1 code implementation • 16 Jun 2022 • Haimeng Zhao, Wei Zhu
The key feature of MAGIC is the introduction of a neural controlled differential equation, which provides the capability to handle light curves with irregular sampling and large data gaps.
1 code implementation • CVPR 2022 • Wei Zhu, Le Lu, Jing Xiao, Mei Han, Jiebo Luo, Adam P. Harrison
Adversarial domain generalization is a popular approach to DG, but conventional approaches (1) struggle to sufficiently align features so that local neighborhoods are mixed across domains; and (2) can suffer from feature-space over-collapse, which can threaten generalization performance.
no code implementations • 9 May 2022 • Wei Zhu, Dongjin Song, Yuncong Chen, Wei Cheng, Bo Zong, Takehiko Mizoguchi, Cristian Lumezanu, Haifeng Chen, Jiebo Luo
Specifically, we first design an Exemplar-based Deep Neural network (ExDNN) to learn local time series representations based on their compatibility with an exemplar module, which consists of hidden parameters learned to capture a variety of normal patterns on each edge device.
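A toy version of the "compatibility with an exemplar module" idea: score a learned representation by its distance to the closest stored pattern of normal behavior, so that incompatible (anomalous) inputs score high. This is only an illustrative distance-based reading of the abstract, not ExDNN's actual architecture:

```python
import numpy as np

def exemplar_anomaly_score(representation, exemplars):
    """Distance from a local time-series representation to the nearest
    learned "normal" exemplar; larger scores mean the input is less
    compatible with the normal patterns captured on this device."""
    dists = np.linalg.norm(exemplars - representation, axis=1)
    return float(dists.min())
```

In the paper's setting the exemplars would themselves be hidden parameters learned end-to-end on each edge device rather than a fixed codebook.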
1 code implementation • 16 Mar 2022 • Yuhui Zuo, Wei Zhu, Guoyong Cai
Since open social platforms allow for a large and continuous flow of unverified information, rumors can emerge unexpectedly and spread quickly.
1 code implementation • Findings (ACL) 2022 • Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu
Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffer from poor generalization and require threshold tuning.
no code implementations • 2 Feb 2022 • Jeremiah Birrell, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu
Generative adversarial networks (GANs), a class of distribution-learning methods based on a two-player game between a generator and a discriminator, can generally be formulated as a minmax problem based on the variational representation of a divergence between the unknown and the generated distributions.
no code implementations • 21 Jan 2022 • Feng Ren, Xiao Ding, Min Zheng, Mikhail Korzinkin, Xin Cai, Wei Zhu, Alexey Mantsyzov, Alex Aliper, Vladimir Aladinskiy, Zhongying Cao, Shanshan Kong, Xi Long, Bonnie Hei Man Liu, Yingtao Liu, Vladimir Naumov, Anastasia Shneyderman, Ivan V. Ozerov, Ju Wang, Frank W. Pun, Alan Aspuru-Guzik, Michael Levitt, Alex Zhavoronkov
The AlphaFold computer program predicted protein structures for the whole human genome, which has been considered a remarkable breakthrough in both artificial intelligence (AI) applications and structural biology.
no code implementations • 22 Nov 2021 • Liyao Gao, Guang Lin, Wei Zhu
Incorporating group symmetry directly into the learning process has proved to be an effective guideline for model design.
no code implementations • 16 Nov 2021 • Pengzhan Guo, Keli Xiao, Zeyang Ye, Wei Zhu
Vehicle mobility optimization in urban areas is a long-standing problem in smart city and spatial data analysis.
no code implementations • 15 Sep 2021 • Wei Zhu, Zihe Zheng, Haitian Zheng, Hanjia Lyu, Jiebo Luo
The learned prototypes and their labels can be regarded as denoising features and labels for the local regions and can guide the training process to prevent the model from overfitting the noisy cases.
no code implementations • 15 Sep 2021 • Wei Zhu, Jiebo Luo, Andrew White
FLIT(+) can align the local training across heterogeneous clients by improving the performance for uncertain samples.
no code implementations • ICCV 2021 • Wei Zhu, Haitian Zheng, Haofu Liao, Weijian Li, Jiebo Luo
We propose to remove the bias information misused by the target task with a cross-sample adversarial debiasing (CSAD) method.
no code implementations • ACL 2021 • Wei Zhu
In this work, to improve efficiency without performance drop, we propose a novel training scheme called Learned Early Exit for BERT (LeeBERT).
no code implementations • ACL 2021 • Wei Zhu
Although the development of pre-trained language models (PLMs) has significantly raised the performance of various Chinese natural language processing (NLP) tasks, the vocabulary (vocab) for these Chinese PLMs remains the one provided by Google's Chinese BERT (CITATION), which is based on Chinese characters (chars).
no code implementations • NAACL 2021 • Wei Zhu, Yuan Ni, Xiaoling Wang, Guotong Xie
In developing an online question-answering system for the medical domains, natural language inference (NLI) models play a central role in question matching and intention detection.
no code implementations • 25 Feb 2021 • Shengran Lin, Changfeng Weng, Yuanjie Yang, Jiaxin Zhao, Yuhang Guo, Jian Zhang, Liren Lou, Wei Zhu, Guanzhong Wang
The nitrogen-vacancy (NV) center in diamond is an ideal candidate for quantum sensors because of its excellent optical and coherence properties.
no code implementations • 10 Jan 2021 • Min Shu, Ruiqiang Song, Wei Zhu
We employed the log-periodic power law singularity (LPPLS) methodology to systematically investigate the 2020 stock market crash in the U.S. equities sectors with different levels of total market capitalization through four major U.S. stock market indexes, including the Wilshire 5000 Total Market index, the S&P 500 index, the S&P MidCap 400 index, and the Russell 2000 index, representing the stocks overall, the large capitalization stocks, the middle capitalization stocks, and the small capitalization stocks, respectively.
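For reference, the LPPLS methodology in its standard form fits the expected log-price of an asset during a bubble as a power-law growth toward a critical time $t_c$, decorated by log-periodic oscillations:

```latex
\ln \mathbb{E}[p(t)] \;=\; A \;+\; B\,(t_c - t)^{m} \;+\; C\,(t_c - t)^{m}\cos\!\bigl(\omega \ln(t_c - t) - \phi\bigr),
```

where $t_c$ is the critical time (the most probable end of the bubble), $0 < m < 1$ controls the power-law acceleration, $B < 0$ for a positive bubble, $\omega$ is the log-periodic angular frequency, and $A$, $C$, $\phi$ are the remaining linear and phase parameters. The paper's exact calibration procedure and parameter constraints are not reproduced here.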
no code implementations • 2 Jan 2021 • Wei Zhu, Daniel Cheung
In this work, we present Lex-BERT, which incorporates lexicon information into Chinese BERT for named entity recognition (NER) tasks in a natural manner.
no code implementations • 1 Jan 2021 • Ruiqiang Song, Min Shu, Wei Zhu
Starting on February 20, 2020, global stock markets began to suffer their worst decline since the Great Recession of 2008, and the stock market crashes have been widely blamed on COVID-19.
no code implementations • 29 Dec 2020 • Wei Zhu, Daniel Cheung
In this work, we present CMV-BERT, which improves the pretraining of a language model via two ingredients: (a) contrastive learning, which is well studied in the area of computer vision; (b) multiple vocabularies, one of which is fine-grained and the other coarse-grained.
no code implementations • 17 Nov 2020 • Wei Zhu
Although the development of pre-trained language models (PLMs) has significantly raised the performance of various Chinese natural language processing (NLP) tasks, the vocabulary for these Chinese PLMs remains the one provided by Google's Chinese BERT \cite{devlin2018bert}, which is based on Chinese characters.
no code implementations • 15 Nov 2020 • Jiaju Miao, Wei Zhu
Our algorithm, named the "Precision-Recall Curve classification tree" or simply the "PRC classification tree," modifies two crucial stages in tree building.
no code implementations • 25 Sep 2020 • Weijian Li, Wei Zhu, E. Ray Dorsey, Jiebo Luo
Parkinson's disease is a neurological disorder that is prevalent among elderly people.
no code implementations • ACL 2021 • Wei Zhu, Xipeng Qiu, Yuan Ni, Guotong Xie
Ablation study demonstrates the necessity of our search space design and the effectiveness of our search method.
3 code implementations • 4 Sep 2020 • Wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie
Though transformer architectures have shown dominance in many natural language understanding tasks, there are still unsolved issues in the training of transformer models, especially the need for a principled warm-up scheme, which has proved important for stable training, and the question of whether the task at hand prefers to scale the attention product or not.
no code implementations • 25 May 2020 • Haitian Zheng, Kefei Wu, Jong-Hwi Park, Wei Zhu, Jiebo Luo
In this work, we study the problem of personalized fashion recommendation from social media data, i.e., recommending new outfits to social media users that fit their fashion preferences.
no code implementations • 21 Apr 2020 • Wei Zhu, Haofu Liao, Wenbin Li, Weijian Li, Jiebo Luo
Inspired by the recent success of Few-Shot Learning (FSL) in natural image classification, we propose to apply FSL to skin disease identification to address the extreme scarcity of training samples.
no code implementations • 7 Apr 2020 • Pengzhan Guo, Zeyang Ye, Keli Xiao, Wei Zhu
Following a theoretical analysis on the characteristics of the new objective function, WASGD introduces a decentralized weighted aggregating scheme based on the performance of local workers.
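The abstract's "weighted aggregating scheme based on the performance of local workers" can be sketched as performance-weighted parameter averaging, where lower-loss workers get larger weight. The softmax-style weighting below is an illustrative assumption; WASGD's exact scheme is defined in the paper:

```python
import numpy as np

def weighted_aggregate(worker_params, worker_losses):
    """Sketch of performance-based weighted aggregation: each worker's
    parameter vector is averaged with a weight that decays in its local
    loss, so better-performing workers dominate the aggregate."""
    losses = np.asarray(worker_losses, dtype=float)
    weights = np.exp(-losses)          # lower loss -> larger weight
    weights /= weights.sum()           # normalize to a distribution
    return np.average(np.asarray(worker_params, dtype=float),
                      axis=0, weights=weights)
```

With equal losses this reduces to the plain parameter average used in standard synchronous/decentralized SGD.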
no code implementations • 8 Mar 2020 • Yang Feng, Futang Peng, Xu Zhang, Wei Zhu, Shanfeng Zhang, Howard Zhou, Zhen Li, Tom Duerig, Shih-Fu Chang, Jiebo Luo
Therefore, we propose to distill the knowledge in multiple specialists into a universal embedding to solve this problem.
no code implementations • WS 2019 • Xiepeng Li, Zhexi Zhang, Wei Zhu, Zheng Li, Yuan Ni, Peng Gao, Junchi Yan, Guotong Xie
We have experimented with both (a) improving the fine-tuning of pre-trained language models on a task with a small dataset by leveraging datasets of similar tasks; and (b) incorporating the distributional representations of a KG into the representations of pre-trained language models, via simple concatenation or multi-head attention.
Ranked #17 on Common Sense Reasoning on ReCoRD
no code implementations • 25 Sep 2019 • Wei Zhu, Qiang Qiu, Robert Calderbank, Guillermo Sapiro, Xiuyuan Cheng
Encoding the input scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many vision tasks especially when dealing with multiscale input signals.
no code implementations • 24 Sep 2019 • Wei Zhu, Qiang Qiu, Robert Calderbank, Guillermo Sapiro, Xiuyuan Cheng
Encoding the scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many computer vision tasks especially when dealing with multiscale inputs.
no code implementations • WS 2019 • Wei Zhu, Xiaofeng Zhou, Keqiang Wang, Xun Luo, Xiepeng Li, Yuan Ni, Guotong Xie
Transfer learning from the NLI task to the RQE task is also experimented with, and it proves useful in improving the results of fine-tuning MT-DNN large.
no code implementations • 7 Jun 2019 • Min Shu, Wei Zhu
The new model retains the merits of existing rough set extension models while avoiding their limitations of discarding transitivity or symmetry.
1 code implementation • 23 Sep 2018 • Bao Wang, Alex T. Lin, Wei Zhu, Penghang Yin, Andrea L. Bertozzi, Stanley J. Osher
We improve the robustness of Deep Neural Net (DNN) to adversarial attacks by using an interpolating function as the output activation.
1 code implementation • 4 Jul 2018 • Alexis Chacón, Dasol Kim, Wei Zhu, Shane P. Kelly, Alexandre Dauphin, Emilio Pisanty, Andrew S. Maxwell, Antonio Picón, Marcelo F. Ciappina, Dong Eon Kim, Christopher Ticknor, Avadh Saxena, Maciej Lewenstein
Topological materials are of interest to both fundamental science and advanced technologies, because topological states are robust with respect to perturbations and dissipation.
no code implementations • ICLR 2019 • Wei Zhu, Qiang Qiu, Bao Wang, Jianfeng Lu, Guillermo Sapiro, Ingrid Daubechies
Deep neural networks (DNNs) typically have enough capacity to fit random data by brute force even when conventional data-dependent regularizations focusing on the geometry of the features are imposed.
5 code implementations • arXiv 2018 • Patrick J. Coles, Stephan Eidenbenz, Scott Pakin, Adetokunbo Adedoyin, John Ambrosiano, Petr Anisimov, William Casper, Gopinath Chennupati, Carleton Coffrin, Hristo Djidjev, David Gunter, Satish Karra, Nathan Lemons, Shizeng Lin, Andrey Lokhov, Alexander Malyzhenkov, David Mascarenas, Susan Mniszewski, Balu Nadiga, Dan O'Malley, Diane Oyen, Lakshman Prasad, Randy Roberts, Phil Romero, Nandakishore Santhi, Nikolai Sinitsyn, Pieter Swart, Marc Vuffray, Jim Wendelberger, Boram Yoon, Richard Zamora, Wei Zhu
As quantum computers become available to the general public, the need has arisen to train a cohort of quantum programmers, many of whom have been developing classical computer programs for most of their careers.
1 code implementation • NeurIPS 2018 • Bao Wang, Xiyang Luo, Zhen Li, Wei Zhu, Zuoqiang Shi, Stanley J. Osher
We replace the output layer of deep neural nets, typically the softmax function, by a novel interpolating function.
no code implementations • 14 Dec 2017 • Longtao Chen, Jing Lou, Wei Zhu, Qingyuan Xia, Mingwu Ren
Aiming to address fast multi-object tracking of dense small objects against cluttered backgrounds, we revisit track-oriented multi-hypothesis tracking (TOMHT) with consideration of batch optimization.
no code implementations • CVPR 2018 • Wei Zhu, Qiang Qiu, Jiaji Huang, Robert Calderbank, Guillermo Sapiro, Ingrid Daubechies
To resolve this, we propose a new framework, the Low-Dimensional-Manifold-regularized neural Network (LDMNet), which incorporates a feature regularization method that focuses on the geometry of both the input data and the output features.
no code implementations • 12 Sep 2017 • Tao Sun, Hao Jiang, Li-Zhi Cheng, Wei Zhu
In fact, many classical inexact nonconvex and nonsmooth algorithms satisfy these three conditions.
no code implementations • 1 Sep 2017 • Tao Sun, Hao Jiang, Lizhi Cheng, Wei Zhu
The traditional alternating direction method of multipliers encounters difficulties, both mathematical and computational, in solving the nonconvex and nonsmooth subproblem.
no code implementations • 27 Mar 2017 • Jing Lou, Huan Wang, Longtao Chen, Fenglei Xu, Qingyuan Xia, Wei Zhu, Mingwu Ren
In this paper, we investigate the contribution of color names to the task of salient object detection.
no code implementations • 18 May 2016 • Wei Zhu, Zuoqiang Shi, Stanley Osher
We present a scalable low dimensional manifold model for the reconstruction of noisy and incomplete hyperspectral images.
no code implementations • 27 Apr 2016 • Wei Zhu, Victoria Chayes, Alexandre Tiard, Stephanie Sanchez, Devin Dahlberg, Andrea L. Bertozzi, Stanley Osher, Dominique Zosso, Da Kuang
In this paper, a graph-based nonlocal total variation method (NLTV) is proposed for unsupervised classification of hyperspectral images (HSI).
no code implementations • 12 Aug 2015 • Hao Han, Wei Zhu
The errors-in-variables (EIV) regression model, being more realistic by accounting for measurement errors in both the dependent and the independent variables, is widely adopted in applied sciences.