Search Results for author: Yang Yu

Found 184 papers, 61 papers with code

Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation

1 code implementation 12 Mar 2024 Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chen-Xiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu

Specifically, the objective of adversarial data augmentation is not merely to generate data analogous to offline data distribution; instead, it aims to create adversarial examples designed to confound learned task representations and lead to incorrect task identification.

Contrastive Learning Data Augmentation +3

Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models

no code implementations 23 Feb 2024 Yiran Liu, Ke Yang, Zehan Qi, Xiao Liu, Yang Yu, ChengXiang Zhai

The growing integration of large language models (LLMs) into social operations amplifies their impact on decisions in crucial areas such as economics, law, education, and healthcare, raising public concerns about these models' discrimination-related safety and reliability.

Attribute Sentence

Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics

no code implementations 17 Feb 2024 Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu

DORA incorporates an information bottleneck principle that maximizes mutual information between the dynamics encoding and the environmental data, while minimizing mutual information between the dynamics encoding and the actions of the behavior policy.

Representation Learning

Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy

no code implementations 7 Feb 2024 Ruichu Cai, Siyang Huang, Jie Qiao, Wei Chen, Yan Zeng, Keli Zhang, Fuchun Sun, Yang Yu, Zhifeng Hao

As a key component to intuitive cognition and reasoning solutions in human intelligence, causal knowledge provides great potential for reinforcement learning (RL) agents' interpretability towards decision-making by helping reduce the searching space.

Decision Making Reinforcement Learning (RL)

Empowering Language Models with Active Inquiry for Deeper Understanding

no code implementations 6 Feb 2024 Jing-Cheng Pang, Heng-Bo Fan, Pengyuan Wang, Jia-Hao Xiao, Nan Tang, Si-Hang Yang, Chengxing Jia, Sheng-Jun Huang, Yang Yu

The rise of large language models (LLMs) has revolutionized the way that we interact with artificial intelligence systems through natural language.

Active Learning Language Modelling +1

FedGT: Federated Node Classification with Scalable Graph Transformer

no code implementations 26 Jan 2024 Zaixi Zhang, Qingyong Hu, Yang Yu, Weibo Gao, Qi Liu

However, existing methods have the following limitations: (1) The links between local subgraphs are missing in subgraph federated learning.

Classification Federated Learning +2

Beimingwu: A Learnware Dock System

1 code implementation 24 Jan 2024 Zhi-Hao Tan, Jian-Dong Liu, Xiao-Dong Bi, Peng Tan, Qin-Cheng Zheng, Hai-Tian Liu, Yi Xie, Xiao-Chuan Zou, Yang Yu, Zhi-Hua Zhou

The learnware paradigm proposed by Zhou [2016] aims to enable users to reuse numerous existing well-trained models instead of building machine learning models from scratch, with the hope of solving new user tasks even beyond models' original purposes.

Policy Optimization in RLHF: The Impact of Out-of-preference Data

1 code implementation 17 Dec 2023 Ziniu Li, Tian Xu, Yang Yu

These methods, either explicitly or implicitly, learn a reward model from preference data and differ in the data used for policy optimization to unlock the generalization ability of the reward model.

Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward

1 code implementation 17 Dec 2023 Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu

Real-world decision-making problems are usually accompanied by delayed rewards, which affects the sample efficiency of Reinforcement Learning, especially in the extremely delayed case where the only feedback is the episodic reward obtained at the end of an episode.

Decision Making

Efficient Human-AI Coordination via Preparatory Language-based Convention

no code implementations 1 Nov 2023 Cong Guan, Lichao Zhang, Chunpeng Fan, Yichen Li, Feng Chen, Lihe Li, Yunjia Tian, Lei Yuan, Yang Yu

Developing intelligent agents capable of seamless coordination with humans is a critical step towards achieving artificial general intelligence.

Language Modelling Large Language Model

AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking

1 code implementation NeurIPS 2023 Yang Yu, Qi Liu, Kai Zhang, Yuren Zhang, Chao Song, Min Hou, Yuqing Yuan, Zhihao Ye, Zaixi Zhang, Sanshi Lei Yu

Specifically, we adopt a multiple pairwise ranking loss which trains the user model to capture the similarity orders between the implicitly augmented view, the explicitly augmented view, and views from other users.

Contrastive Learning Data Augmentation
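The similarity ordering described in the AdaptSSR snippet above can be illustrated with a small sketch. This is only one plausible reading of a "multiple pairwise ranking loss", not the authors' implementation: the function name `multi_pairwise_ranking_loss`, the BPR-style `-log sigmoid` form, and the equal weighting of the two pairwise terms are all assumptions made for illustration.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); note -log(sigmoid(a - b)) == softplus(b - a).
    return np.logaddexp(0.0, x)

def multi_pairwise_ranking_loss(sim_implicit, sim_explicit, sim_others):
    """Hypothetical loss encouraging sim_implicit >= sim_explicit >= sim_others.

    sim_implicit, sim_explicit: shape (batch,), similarity of each user to its
        implicitly / explicitly augmented view.
    sim_others: shape (batch, n_neg), similarity to views from other users.
    """
    # Pair 1: the implicitly augmented view should rank above the explicit one.
    loss_implicit_vs_explicit = softplus(sim_explicit - sim_implicit)
    # Pair 2: the explicitly augmented view should rank above other users' views.
    loss_explicit_vs_others = softplus(sim_others - sim_explicit[:, None]).mean(axis=1)
    return float((loss_implicit_vs_explicit + loss_explicit_vs_others).mean())
```

Under these assumptions, similarities that already respect the order yield a lower loss than similarities that violate it.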

Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning

no code implementations 9 Oct 2023 Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu

MOREC learns a generalizable dynamics reward function from offline data, which is subsequently employed as a transition filter in any offline MBRL method: when generating transitions, the dynamics model generates a batch of transitions and selects the one with the highest dynamics reward value.

D4RL Model-based Reinforcement Learning +1
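The transition-filter step described in the MOREC snippet above (generate a batch of candidate transitions, keep the one with the highest dynamics reward) can be sketched as follows. The names `filter_transition` and `toy_reward` are hypothetical, and the toy scoring function merely stands in for a learned dynamics reward; none of this is taken from the paper's code.

```python
import numpy as np

def filter_transition(candidates, dynamics_reward):
    """Return the candidate next state with the highest dynamics reward.

    candidates: array of shape (n_candidates, state_dim) sampled from a dynamics model.
    dynamics_reward: callable mapping one candidate to a scalar score (assumed learned).
    """
    scores = np.array([dynamics_reward(c) for c in candidates])
    return candidates[int(np.argmax(scores))]

# Toy usage: candidates are noisy guesses around a known next state, and the
# stand-in "dynamics reward" prefers candidates closer to that state.
rng = np.random.default_rng(0)
true_next = np.array([1.0, 2.0])
candidates = true_next + rng.normal(scale=0.5, size=(8, 2))
toy_reward = lambda s_next: -np.linalg.norm(s_next - true_next)
best = filter_transition(candidates, toy_reward)
```

The filter is independent of how the candidates are produced, which is consistent with the snippet's claim that it can be plugged into any offline MBRL method.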

Improve the efficiency of deep reinforcement learning through semantic exploration guided by natural language

no code implementations 21 Sep 2023 Zhourui Guo, Meng Yao, Yang Yu, Qiyue Yin

We assume that the interaction can be modeled as a sequence of templated questions and answers, and that there is a large corpus of previous interactions available.

ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning

1 code implementation 12 Sep 2023 Chen-Xiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang, Yang Yu

Third, we train an Advantage-Conditioned Transformer (ACT) to generate actions conditioned on the estimated advantages.

Action Generation

Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection

1 code implementation 6 Sep 2023 Yu Chen, Tingxin Li, Huiming Liu, Yang Yu

Numerous companies have started offering services based on large language models (LLM), such as ChatGPT, which inevitably raises privacy concerns as users' prompts are exposed to the model provider.

Sensiverse: A dataset for ISAC study

no code implementations 26 Aug 2023 Jiajin Luo, Baojian Zhou, Yang Yu, Ping Zhang, Xiaohui Peng, Jianglei Ma, Peiying Zhu, Jianmin Lu, Wen Tong

In order to address the lack of applicable channel models for ISAC research and evaluation, we release Sensiverse, a dataset that can be used for ISAC research.

Synergistic Signal Denoising for Multimodal Time Series of Structure Vibration

no code implementations 17 Aug 2023 Yang Yu, Han Chen

Structural Health Monitoring (SHM) plays an indispensable role in ensuring the longevity and safety of infrastructure.

Denoising Time Series

Transformer-Based Denoising of Mechanical Vibration Signals

no code implementations 4 Aug 2023 Han Chen, Yang Yu, Pengtao Li

Mechanical vibration signal denoising is a pivotal task in various industrial applications, including system health monitoring and failure prediction.

Denoising

Disentangling Multi-view Representations Beyond Inductive Bias

1 code implementation 3 Aug 2023 Guanzhou Ke, Yang Yu, Guoqing Chao, Xiaoli Wang, Chenyang Xu, Shengfeng He

In this paper, we propose a novel multi-view representation disentangling method that aims to go beyond inductive biases, ensuring both interpretability and generalizability of the resulting representations.

Clustering Inductive Bias +2

Car-Studio: Learning Car Radiance Fields from Single-View and Endless In-the-wild Images

1 code implementation 26 Jul 2023 Tianyu Liu, Hao Zhao, Yang Yu, Guyue Zhou, Ming Liu

However, previous studies learned within a sequence of autonomous driving datasets, resulting in unsatisfactory blurring when rotating the car in the simulator.

Autonomous Driving

Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning

2 code implementations PMLR 2023 Yihao Sun, Jiaji Zhang, Chengxing Jia, Haoxin Lin, Junyin Ye, Yang Yu

MOBILE conducts uncertainty quantification through the inconsistency of Bellman estimations under an ensemble of learned dynamics models, which can be a better approximator to the true Bellman error, and penalizes the Bellman estimation based on this uncertainty.

D4RL Offline RL +3
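The uncertainty penalty described in the MOBILE snippet above can be illustrated with a short sketch: each model in the ensemble yields its own Bellman estimate, their disagreement is treated as uncertainty, and the target is penalized by it. Using the standard deviation as the inconsistency measure, the coefficient `beta`, and the function name `penalized_bellman_target` are assumptions for illustration rather than details confirmed by the listing.

```python
import numpy as np

def penalized_bellman_target(rewards, next_q_values, gamma=0.99, beta=1.0):
    """Hypothetical uncertainty-penalized Bellman target.

    rewards, next_q_values: arrays of shape (n_models, batch), holding the reward
        and next-state value predicted under each learned dynamics model.
    """
    targets = rewards + gamma * next_q_values  # one Bellman estimate per model
    mean_target = targets.mean(axis=0)
    uncertainty = targets.std(axis=0)          # inconsistency across the ensemble
    return mean_target - beta * uncertainty

# Toy usage: the two models agree on the first sample and disagree on the second.
rewards = np.array([[1.0, 0.0], [1.0, 2.0]])
next_q = np.array([[10.0, 5.0], [10.0, 5.0]])
target = penalized_bellman_target(rewards, next_q, gamma=0.5, beta=1.0)
```

In this toy case only the second sample is penalized, since that is the only one on which the ensemble members disagree.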

A Unified View of Deep Learning for Reaction and Retrosynthesis Prediction: Current Status and Future Challenges

no code implementations 28 Jun 2023 Ziqiao Meng, Peilin Zhao, Yang Yu, Irwin King

Reaction and retrosynthesis prediction are fundamental tasks in computational chemistry that have recently garnered attention from both the machine learning and drug discovery communities.

Drug Discovery Retrosynthesis

SplatFlow: Learning Multi-frame Optical Flow via Splatting

1 code implementation 15 Jun 2023 Bo Wang, Yifan Zhang, Jian Li, Yang Yu, Zhenping Sun, Li Liu, Dewen Hu

The occlusion problem remains a crucial challenge in optical flow estimation (OFE).

Optical Flow Estimation

NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection

no code implementations 12 Jun 2023 Yu Chen, Yang Yu, Rongrong Ni, Yao Zhao, Haoliang Li

Next, we design a phoneme-viseme awareness module for cross-modal feature fusion and representation alignment, so that the modality gap can be reduced and the intrinsic complementarity of the two modalities can be better explored.

DeepFake Detection Face Swapping

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning

2 code implementations 11 Jun 2023 Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu

A common taxonomy of existing offline RL works is policy regularization, which typically constrains the learned policy by distribution or support of the behavior policy.

Offline RL reinforcement-learning +1

Provably Efficient Adversarial Imitation Learning with Unknown Transitions

1 code implementation 11 Jun 2023 Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo

Adversarial imitation learning (AIL), a subset of IL methods, is particularly promising, but its theoretical foundation in the presence of unknown transitions has yet to be fully developed.

Imitation Learning

Doubly Stochastic Graph-based Non-autoregressive Reaction Prediction

no code implementations 5 Jun 2023 Ziqiao Meng, Peilin Zhao, Yang Yu, Irwin King

However, the current non-autoregressive decoder does not satisfy two essential rules of electron redistribution modeling simultaneously: the electron-counting rule and the symmetry rule.

Drug Discovery

Language Model Self-improvement by Reinforcement Learning Contemplation

no code implementations 23 May 2023 Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu

We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.

Language Modelling Machine Translation +3

Robust multi-agent coordination via evolutionary generation of auxiliary adversarial attackers

1 code implementation 10 May 2023 Lei Yuan, Zi-Qian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Li-He Li, Chao Qian, Yang Yu

Concretely, to avoid the ego-system overfitting to a specific attacker, we maintain a set of attackers, which is optimized to guarantee the attackers high attacking quality and behavior diversity.

SMAC+

Communication-Robust Multi-Agent Learning by Adaptable Auxiliary Multi-Agent Adversary Generation

no code implementations 9 May 2023 Lei Yuan, Feng Chen, Zhongzhang Zhang, Yang Yu

In specific, we introduce a novel message-attacking approach that models the learning of the auxiliary attacker as a cooperative problem under a shared goal to minimize the coordination ability of the ego system, with which every information channel may suffer from distinct message attacks.

Multi-agent Reinforcement Learning

Robust Multi-agent Communication via Multi-view Message Certification

no code implementations 7 May 2023 Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu

Many multi-agent scenarios require message sharing among agents to promote coordination, hastening the robustness of multi-agent communication when policies are deployed in a message perturbation environment.

Multi-agent Continual Coordination via Progressive Task Contextualization

no code implementations 7 May 2023 Lei Yuan, Lihe Li, Ziqian Zhang, Fuxiang Zhang, Cong Guan, Yang Yu

Towards tackling the mentioned issue, this paper proposes an approach Multi-Agent Continual Coordination via Progressive Task Contextualization, dubbed MACPro.

Continual Learning Multi-agent Reinforcement Learning

Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems

1 code implementation 3 May 2023 Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, Jieping Ye, Chen Ma

However, building a user simulator with no reality-gap, i.e., can predict user's feedback exactly, is unrealistic because the users' reaction patterns are complex and historical logs for each user are limited, which might mislead the simulator-based recommendation policy.

Decision Making Recommendation Systems +1

Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning

no code implementations 21 Mar 2023 Yang Yu, Danruo Deng, Furui Liu, Yueming Jin, Qi Dou, Guangyong Chen, Pheng-Ann Heng

Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).

Outlier Detection

Beware of Instantaneous Dependence in Reinforcement Learning

no code implementations 9 Mar 2023 Zhengmao Zhu, YuRen Liu, Honglong Tian, Yang Yu, Kun Zhang

Playing an important role in Model-Based Reinforcement Learning (MBRL), environment models aim to predict future states based on the past.

Model-based Reinforcement Learning reinforcement-learning +1

How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement

1 code implementation 3 Mar 2023 Xu-Hui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang, Yang Yu

In this paper, we propose a novel active imitation learning framework based on a teacher-student interaction model, in which the teacher's goal is to identify the best teaching behavior and actively affect the student's learning process.

Atari Games Imitation Learning

Uncertainty Estimation by Fisher Information-based Evidential Deep Learning

1 code implementation 3 Mar 2023 Danruo Deng, Guangyong Chen, Yang Yu, Furui Liu, Pheng-Ann Heng

To address this problem, we propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).

Informativeness Representation Learning

Natural Language-conditioned Reinforcement Learning with Inside-out Task Language Development and Translation

no code implementations 18 Feb 2023 Jing-Cheng Pang, Xin-Yu Yang, Si-Hang Yang, Yang Yu

To ease the learning burden of the policy, we investigate an inside-out scheme for natural language-conditioned RL by developing a task language (TL) that is task-related and unique.

Instruction Following Reinforcement Learning (RL)

Theoretical Analysis of Offline Imitation With Supplementary Dataset

1 code implementation 27 Jan 2023 Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo

This paper considers a situation where, besides the small amount of expert data, a supplementary dataset is available, which can be collected cheaply from sub-optimal policies.

Imitation Learning

Self-Motivated Multi-Agent Exploration

1 code implementation 5 Jan 2023 Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu, De-Chuan Zhan

In cooperative multi-agent reinforcement learning (CMARL), it is critical for agents to achieve a balance between self-exploration and team collaboration.

SMAC+ Starcraft +1

A Clustering-guided Contrastive Fusion for Multi-view Representation Learning

1 code implementation 28 Dec 2022 Guanzhou Ke, Guoqing Chao, Xiaoli Wang, Chenyang Xu, Yongqi Zhu, Yang Yu

To this end, we utilize a deep fusion network to fuse view-specific representations into the view-common representation, extracting high-level semantics for obtaining robust representation.

Clustering MULTI-VIEW LEARNING +1

Untargeted Attack against Federated Recommendation Systems via Poisonous Item Embeddings and the Defense

1 code implementation 11 Dec 2022 Yang Yu, Qi Liu, Likang Wu, Runlong Yu, Sanshi Lei Yu, Zaixi Zhang

Experiments on two public datasets show that ClusterAttack can effectively degrade the performance of FedRec systems while circumventing many defense methods, and UNION can improve the resistance of the system against various untargeted attacks, including our ClusterAttack.

Contrastive Learning Recommendation Systems

Momentum Calibration for Text Generation

no code implementations 8 Dec 2022 Xingxing Zhang, Yiran Liu, Xun Wang, Pengcheng He, Yang Yu, Si-Qing Chen, Wayne Xiong, Furu Wei

The input and output of most text generation tasks can be transformed to two sequences of tokens and they can be modeled using sequence-to-sequence learning modeling tools such as Transformers.

Abstractive Text Summarization Text Generation

Learning Physically Realizable Skills for Online Packing of General 3D Shapes

1 code implementation 5 Dec 2022 Hang Zhao, Zherong Pan, Yang Yu, Kai Xu

We study the problem of learning online packing skills for irregular 3D shapes, which is arguably the most challenging setting of bin packing problems.

Action Generation Reinforcement Learning (RL)

Real-time Blind Deblurring Based on Lightweight Deep-Wiener-Network

no code implementations 29 Nov 2022 Runjia Li, Yang Yu, Charlie Haywood

In this paper, we address the problem of blind deblurring with high efficiency.

Deblurring

Does Debiasing Inevitably Degrade the Model Performance

no code implementations 14 Nov 2022 Yiran Liu, Xiao Liu, Haotian Chen, Yang Yu

We use our theoretical framework to explain why the current debiasing methods cause performance degradation.

Knowledge is Power: Understanding Causality Makes Legal judgment Prediction Models More Generalizable and Robust

no code implementations 6 Nov 2022 Haotian Chen, Lingwei Zhang, Yiran Liu, Fanchao Chen, Yang Yu

To validate our theoretical analysis, we further propose another method using our proposed Causality-Aware Self-Attention Mechanism (CASAM) to guide the model to learn the underlying causality knowledge in legal texts.

Open Information Extraction

Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning

2 code implementations ACM Multimedia 2022 Meiyu Liang, Junping Du, Xiaowen Cao, Yang Yu, Kangkang Lu, Zhe Xue, Min Zhang

Secondly, for further improving learning ability of implicit cross-media semantic associations, a semantic label association graph is constructed, and the graph convolutional network is utilized to mine the implicit semantic structures, thus guiding learning of discriminative features of different modalities.

Representation Learning

Domain generalization Person Re-identification on Attention-aware multi-operation strategery

no code implementations 19 Oct 2022 Yingchun Guo, Huan He, Ye Zhu, Yang Yu

Domain generalization person re-identification (DG Re-ID) aims to directly deploy a model trained on the source domain to the unseen target domain with good generalization, which is a challenging problem and has practical value in a real-world deployment.

Domain Generalization Person Re-Identification

Multi-agent Dynamic Algorithm Configuration

1 code implementation 13 Oct 2022 Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu

MA-DAC formulates the dynamic configuration of a complex algorithm with multiple types of hyperparameters as a contextual multi-agent Markov decision process and solves it by a cooperative multi-agent RL (MARL) algorithm.

Multi-Armed Bandits Reinforcement Learning (RL)

Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

no code implementations 11 Oct 2022 Zhengbang Zhu, Rongjun Qin, JunJie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang

The increase in the measured performance, however, can have two possible attributions: a better understanding of user preferences, and a more proactive ability to utilize human bounded rationality to seduce user over-consumption.

Benchmarking Sequential Recommendation

MARS: A Motif-based Autoregressive Model for Retrosynthesis Prediction

no code implementations 27 Sep 2022 Jiahan Liu, Chaochao Yan, Yang Yu, Chan Lu, Junzhou Huang, Le Ou-Yang, Peilin Zhao

In this paper, we propose a novel end-to-end graph generation model for retrosynthesis prediction, which sequentially identifies the reaction center, generates the synthons, and adds motifs to the synthons to generate reactants.

Drug Discovery Graph Generation +1

WeLM: A Well-Read Pre-trained Language Model for Chinese

no code implementations 21 Sep 2022 Hui Su, Xiao Zhou, Houjin Yu, Xiaoyu Shen, YuWen Chen, Zilin Zhu, Yang Yu, Jie Zhou

Large Language Models pre-trained with self-supervised learning have demonstrated impressive zero-shot generalization capabilities on a wide spectrum of tasks.

Language Modelling Self-Supervised Learning +2

Model-based Reinforcement Learning with Multi-step Plan Value Estimation

1 code implementation 12 Sep 2022 Haoxin Lin, Yihao Sun, Jiaji Zhang, Yang Yu

The new model-based reinforcement learning algorithm MPPVE (Model-based Planning Policy Learning with Multi-step Plan Value Estimation) shows a better utilization of the learned model and achieves a better sample efficiency than state-of-the-art model-based RL approaches.

Model-based Reinforcement Learning reinforcement-learning +1

Deep Anomaly Detection and Search via Reinforcement Learning

no code implementations 31 Aug 2022 Chao Chen, Dawei Wang, Feng Mao, Zongzhang Zhang, Yang Yu

Semi-supervised Anomaly Detection (AD) is a kind of data mining task which aims at learning features from partially-labeled datasets to help detect outliers.

Ensemble Learning Partially Labeled Datasets +4

MORI-RAN: Multi-view Robust Representation Learning via Hybrid Contrastive Fusion

1 code implementation 26 Aug 2022 Guanzhou Ke, Yongqi Zhu, Yang Yu

To this end, in this paper, we proposed a hybrid contrastive fusion algorithm to extract robust view-common representation from unlabeled data.

Clustering Representation Learning +1

Convolutional Neural Networks with A Topographic Representation Module for EEG-Based Brain-Computer Interfaces

no code implementations 23 Aug 2022 Xinbin Liang, Yaru Liu, Yang Yu, Kaixuan Liu, Yadong Liu, Zongtan Zhou

Significance: We improve the classification performance of 3 CNNs on 2 datasets by the use of TRM, indicating that it has the capability to mine the EEG spatial topological information.

Classification EEG +1

Heterogeneous Multi-agent Zero-Shot Coordination by Coevolution

no code implementations 9 Aug 2022 Ke Xue, Yutong Wang, Cong Guan, Lei Yuan, Haobo Fu, Qiang Fu, Chao Qian, Yang Yu

Generating agents that can achieve zero-shot coordination (ZSC) with unseen partners is a new challenge in cooperative multi-agent reinforcement learning (MARL).

Multi-agent Reinforcement Learning

Pseudo-label Guided Cross-video Pixel Contrast for Robotic Surgical Scene Segmentation with Limited Annotations

no code implementations 20 Jul 2022 Yang Yu, Zixu Zhao, Yueming Jin, Guangyong Chen, Qi Dou, Pheng-Ann Heng

Concretely, for trusty representation learning, we propose to incorporate pseudo labels to instruct the pair selection, obtaining more reliable representation pairs for pixel contrast.

Pseudo Label Representation Learning +2

Hybrid Value Estimation for Off-policy Evaluation and Offline Reinforcement Learning

no code implementations 4 Jun 2022 Xue-Kun Jin, Xu-Hui Liu, Shengyi Jiang, Yang Yu

Value function estimation is an indispensable subroutine in reinforcement learning, which becomes more challenging in the offline setting.

Off-policy evaluation reinforcement-learning

Offline Reinforcement Learning with Causal Structured World Models

no code implementations 3 Jun 2022 Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, Yang Yu

Model-based methods have recently shown promising for offline reinforcement learning (RL), aiming to learn good policies from historical data without interacting with the environment.

Model-based Reinforcement Learning Offline RL +2

Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble

no code implementations 1 Jun 2022 Fan-Ming Luo, Xingchen Cao, Yang Yu

Empirical results compared with the state-of-the-art AIL methods show that DARL can learn a reward that is more consistent with the true reward, thus obtaining higher environment returns.

Imitation Learning

Model Generation with Provable Coverability for Offline Reinforcement Learning

no code implementations 1 Jun 2022 Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu

Model-based offline optimization with dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy could adapt to different dynamics enumerated at the training stage.

Offline RL Out-of-Distribution Generalization +2

Exploring Intra- and Inter-Video Relation for Surgical Semantic Scene Segmentation

1 code implementation 29 Mar 2022 Yueming Jin, Yang Yu, Cheng Chen, Zixu Zhao, Pheng-Ann Heng, Danail Stoyanov

Automatic surgical scene segmentation is fundamental for facilitating cognitive intelligence in the modern operating theatre.

Contrastive Learning Relation +1

Enhancing Neural Mathematical Reasoning by Abductive Combination with Symbolic Library

no code implementations 28 Mar 2022 Yangyang Hu, Yang Yu

On a mathematical reasoning dataset, we adopt the recently proposed abductive learning framework, and propose the ABL-Sym algorithm that combines the Transformer neural models with a symbolic mathematics library.

Logical Reasoning Mathematical Reasoning +1

A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle

no code implementations 22 Mar 2022 Ziniu Li, Tian Xu, Yang Yu

In particular, we demonstrate that the sample complexity of the target Q-learning algorithm in [Lee and He, 2020] is $\widetilde{\mathcal O}(|\mathcal S|^2|\mathcal A|^2 (1-\gamma)^{-5}\varepsilon^{-2})$.

Q-Learning

Multi-Agent Policy Transfer via Task Relationship Modeling

no code implementations 9 Mar 2022 Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Zongzhang Zhang, Chongjie Zhang, Yang Yu

We demonstrate that the task representation can capture the relationship among tasks, and can generalize to unseen tasks.

Transfer Learning

Rethinking ValueDice: Does It Really Improve Performance?

no code implementations 5 Feb 2022 Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo

First, we show that ValueDice could reduce to BC under the offline setting.

Imitation Learning

Online Allocation Problem with Two-sided Resource Constraints

no code implementations 28 Dec 2021 Qixin Zhang, Wenbing Ye, Zaiyi Chen, Haoyuan Hu, Enhong Chen, Yang Yu

As a result, only limited violations of constraints or pessimistic competitive bounds could be guaranteed.

Decision Making Fairness +1

RetroComposer: Composing Templates for Template-Based Retrosynthesis Prediction

1 code implementation 20 Dec 2021 Chaochao Yan, Peilin Zhao, Chan Lu, Yang Yu, Junzhou Huang

To overcome this limitation, we propose an innovative retrosynthesis prediction framework that can compose novel templates beyond training templates.

Retrosynthesis Single-step retrosynthesis

Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Recognition

no code implementations 8 Dec 2021 Zhenxin Wu, Qingliang Chen, Yifeng Liu, Yinqi Zhang, Chengkai Zhu, Yang Yu

Finally, using the progressive training (P), the features extracted by the model in different stages can be fully utilized and fused with each other.

Fine-Grained Image Classification

Tiny-NewsRec: Effective and Efficient PLM-based News Recommendation

1 code implementation 2 Dec 2021 Yang Yu, Fangzhao Wu, Chuhan Wu, Jingwei Yi, Qi Liu

We further propose a two-stage knowledge distillation method to improve the efficiency of the large PLM-based news recommendation model while maintaining its performance.

Knowledge Distillation Natural Language Understanding +1

Offline Model-based Adaptable Policy Learning

1 code implementation NeurIPS 2021 Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Qin, Wenjie Shang, Jieping Ye

Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of the outcome policies.

Decision Making reinforcement-learning +1

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

1 code implementation NeurIPS 2021 Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu

Experiments on MuJoCo and Hand Manipulation Suite tasks show that the agents deployed with our method achieve similar performance as it has in the source domain, while those deployed with previous methods designed for same-modal domain adaptation suffer a larger performance gap.

Domain Adaptation reinforcement-learning +1

Stochastic optimal scheduling of demand response-enabled microgrids with renewable generations: An analytical-heuristic approach

no code implementations 24 Nov 2021 Yang Li, Kang Li, Zhen Yang, Yang Yu, Runnan Xu, Miaosen Yang

In order to solve this model, this research combines Jaya algorithm and interior point method (IPM) to develop a hybrid analysis-heuristic solution method called Jaya-IPM, where the lower- and upper- levels are respectively addressed by the IPM and the Jaya, and the scheduling scheme is obtained via iterations between the two levels.

Scheduling

Calculus of Consent via MARL: Legitimating the Collaborative Governance Supplying Public Goods

no code implementations 20 Nov 2021 Yang Hu, Zhui Zhu, Sirui Song, Xue Liu, Yang Yu

Experimental results in an exemplary environment show that our MARL approach is able to demonstrate the effectiveness and necessity of restrictions on individual liberty for collaborative supply of public goods.

Multi-agent Reinforcement Learning

Learning Efficient Online 3D Bin Packing on Packing Configuration Trees

1 code implementation ICLR 2022 Hang Zhao, Yang Yu, Kai Xu

PCT is a full-fledged description of the state and action space of bin packing which can support packing policy learning based on deep reinforcement learning (DRL).

3D Bin Packing

UserBERT: Contrastive User Model Pre-training

no code implementations 3 Sep 2021 Chuhan Wu, Fangzhao Wu, Yang Yu, Tao Qi, Yongfeng Huang, Xing Xie

Two self-supervision tasks are incorporated in UserBERT for user model pre-training on unlabeled user behavior data to empower user modeling.

Neural-to-Tree Policy Distillation with Policy Improvement Criterion

no code implementations 16 Aug 2021 Zhao-Hua Li, Yang Yu, Yingfeng Chen, Ke Chen, Zhipeng Hu, Changjie Fan

The empirical results show that the proposed method can preserve a higher cumulative reward than behavior cloning and learn a more consistent policy to the original one.

Decision Making reinforcement-learning +1

On Generalization of Adversarial Imitation Learning and Beyond

no code implementations 19 Jun 2021 Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo

For some MDPs, we show that vanilla AIL has a worse sample complexity than BC.

Imitation Learning

HieRec: Hierarchical User Interest Modeling for Personalized News Recommendation

no code implementations ACL 2021 Tao Qi, Fangzhao Wu, Chuhan Wu, Peiru Yang, Yang Yu, Xing Xie, Yongfeng Huang

Instead of a single user embedding, in our method each user is represented in a hierarchical interest tree to better capture their diverse and multi-grained interest in news.

News Recommendation

Context-Aware Sparse Deep Coordination Graphs

1 code implementation ICLR 2022 Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang

Learning sparse coordination graphs adaptive to the coordination dynamics among agents is a long-standing problem in cooperative multi-agent learning.

graph construction Graph Learning +2

Active Hierarchical Exploration with Stable Subgoal Representation Learning

1 code implementation ICLR 2022 Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang

Although GCHRL possesses superior exploration ability by decomposing tasks via subgoals, existing GCHRL methods struggle in temporally extended tasks with sparse external rewards, since the high-level policy learning relies on external rewards.

Continuous Control Hierarchical Reinforcement Learning +1

Reinforcement Learning With Sparse-Executing Actions via Sparsity Regularization

no code implementations18 May 2021 Jing-Cheng Pang, Tian Xu, Shengyi Jiang, Yu-Ren Liu, Yang Yu

Reinforcement learning (RL) has made remarkable progress in many decision-making tasks, such as Go, game playing, and robotics control.

Atari Games Decision Making +3

An Introduction of mini-AlphaStar

1 code implementation14 Apr 2021 Ruo-Ze Liu, Wenhai Wang, Yanjie Shen, Zhiqi Li, Yang Yu, Tong Lu

StarCraft II (SC2) is a real-time strategy game in which players produce and control multiple units to fight against the opponent's units.

Starcraft Starcraft II

Distributed Bootstrap for Simultaneous Inference Under High Dimensionality

1 code implementation19 Feb 2021 Yang Yu, Shih-Kang Chao, Guang Cheng

We propose a distributed bootstrap method for simultaneous inference on high-dimensional massive data that are stored and processed with many machines.

Vocal Bursts Intensity Prediction

Derivative-Free Reinforcement Learning: A Review

no code implementations10 Feb 2021 Hong Qian, Yang Yu

In this article, we summarize methods of derivative-free reinforcement learning to date, and organize the methods in aspects including parameter updating, model selection, exploration, and parallel/distributed methods.

Model Selection reinforcement-learning +1

NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application

no code implementations Findings (EMNLP) 2021 Chuhan Wu, Fangzhao Wu, Yang Yu, Tao Qi, Yongfeng Huang, Qi Liu

However, existing language models are pre-trained and distilled on general corpora such as Wikipedia, which differ from the news domain and may be suboptimal for news intelligence.

Knowledge Distillation Language Modelling +2

The Flare and Warp of the Young Stellar Disk traced with LAMOST DR5 OB-type stars

no code implementations1 Feb 2021 Yang Yu, Hai-Feng Wang, Wen-Yuan Cui, Lin-Lin Li, Chao Liu, Bo Zhang, Hao Tian, Zhen-Yan Huo, Jie Ju, Zhi-Cun Liu, Fang Wen, Shuai Feng

We present an analysis of the spatial density structure of the outer disk from 8 to 14 kpc using 13534 OB-type stars from LAMOST DR5 and observe similar flaring on the north and south sides of the disk, implying that the flaring structure is symmetrical about the Galactic plane; the scale height at different Galactocentric distances ranges from 0.14 to 0.5 kpc.

Astrophysics of Galaxies

NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning

3 code implementations1 Feb 2021 Rongjun Qin, Songyi Gao, Xingyuan Zhang, Zhen Xu, Shengkai Huang, Zewen Li, Weinan Zhang, Yang Yu

We evaluate existing offline RL algorithms on NeoRL and argue that the performance of a policy should also be compared with the deterministic version of the behavior policy, instead of the dataset reward.

Offline RL reinforcement-learning +1

ASBSO: An Improved Brain Storm Optimization With Flexible Search Length and Memory-Based Selection

no code implementations27 Jan 2021 Yang Yu, Shangce Gao, Yirui Wang, Jiujun Cheng, Yuki Todo

This proposed method, adaptive step length based on memory selection BSO, namely ASBSO, applies multiple step lengths to modify the generation process of new solutions, thus supplying a flexible search according to corresponding problems and convergent periods.

Cross-Modal Domain Adaptation for Reinforcement Learning

1 code implementation1 Jan 2021 Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Yang Yu

Domain adaptation is a promising direction for deploying RL agents in real-world applications, where vision-based robotics tasks constitute an important part.

Domain Adaptation reinforcement-learning +1

Offline Adaptive Policy Leaning in Real-World Sequential Recommendation Systems

no code implementations1 Jan 2021 Xiong-Hui Chen, Yang Yu, Qingyang Li, Zhiwei Tony Qin, Wenjie Shang, Yiping Meng, Jieping Ye

Instead of increasing the fidelity of models for policy learning, we handle the distortion issue via learning to adapt to diverse simulators generated by the offline dataset.

Sequential Recommendation

Interactive Search Based on Deep Reinforcement Learning

no code implementations9 Dec 2020 Yang Yu, Zhenhao Gu, Rong Tao, Jingtian Ge, Kenglun Chang

With the continuous development of machine learning technology, major e-commerce platforms have launched recommendation systems based on it to serve a large number of customers with different needs more efficiently.

Clustering Decision Making +3

Offline Imitation Learning with a Misspecified Simulator

no code implementations NeurIPS 2020 Shengyi Jiang, JingCheng Pang, Yang Yu

In this work, we investigate policy learning in the condition of a few expert demonstrations and a simulator with misspecified dynamics.

Friction Imitation Learning

OrgMining 2.0: A Novel Framework for Organizational Model Mining from Event Logs

no code implementations24 Nov 2020 Jing Yang, Chun Ouyang, Wil M. P. van der Aalst, Arthur H. M. ter Hofstede, Yang Yu

We demonstrate the feasibility of this framework by proposing an approach underpinned by the framework for organizational model discovery, and also conduct experiments on real-life event logs to discover and evaluate organizational models.

Model Discovery

Angular Embedding: A New Angular Robust Principal Component Analysis

no code implementations22 Nov 2020 Shenglan Liu, Yang Yu

As a widely used method in machine learning, principal component analysis (PCA) shows excellent properties for dimensionality reduction.

Dimensionality Reduction

Mining Generalized Features for Detecting AI-Manipulated Fake Faces

no code implementations27 Oct 2020 Yang Yu, Rongrong Ni, Yao Zhao

Recently, AI-manipulated face techniques have developed rapidly and constantly, which has raised new security issues in society.

Error Bounds of Imitating Policies and Environments

no code implementations NeurIPS 2020 Tian Xu, Ziniu Li, Yang Yu

In this paper, we firstly analyze the value gap between the expert policy and imitated policies by two imitation methods, behavioral cloning and generative adversarial imitation.

Imitation Learning Model-based Reinforcement Learning +2

Multiple-element joint detection for Aspect-Based Sentiment Analysis

no code implementations Knowledge Based Systems 2020 Chao Wu, Qingyu Xiong, Hualing Yi, Yang Yu, Qiwu Zhu, Min Gao, Jie Chen

In this paper, we propose a novel end-to-end multiple-element joint detection model (MEJD), which effectively extracts all (target, aspect, sentiment) triples from a sentence.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

Difference-in-Differences: Bridging Normalization and Disentanglement in PG-GAN

no code implementations16 Oct 2020 Xiao Liu, Jiajie Zhang, Siting Li, Zuotong Wu, Yang Yu

We discover that pixel normalization causes object entanglement by in-painting the area occupied by ablated objects.

counterfactual Disentanglement +1

TurboTransformers: An Efficient GPU Serving System For Transformer Models

no code implementations9 Oct 2020 Jiarui Fang, Yang Yu, Chengduo Zhao, Jie Zhou

This paper designed a transformer serving system called TurboTransformers, which consists of a computing runtime and a serving framework to solve the above challenges.

Management

Reinforced Epidemic Control: Saving Both Lives and Economy

1 code implementation4 Aug 2020 Sirui Song, Zefang Zong, Yong Li, Xue Liu, Yang Yu

Saving lives or economy is a dilemma for epidemic control in most cities while smart-tracing technology raises people's privacy concerns.

reinforcement-learning Reinforcement Learning (RL)

QPLEX: Duplex Dueling Multi-Agent Q-Learning

5 code implementations ICLR 2021 Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang

This paper presents a novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function.

Decision Making Multi-agent Reinforcement Learning +3

Local Neighbor Propagation Embedding

no code implementations29 Jun 2020 Shenglan Liu, Yang Yu

Manifold Learning occupies a vital role in the field of nonlinear dimensionality reduction and its ideas also serve for other relevant methods.

Dimensionality Reduction

Affect in Tweets: A Transfer Learning Approach

no code implementations LREC 2020 Linrui Zhang, Hsin-Lun Huang, Yang Yu, Dan Moldovan

As opposed to the traditional machine learning models which require considerable effort in designing task specific features, our model can be well adapted to the proposed tasks with a very limited amount of fine-tuning, which significantly reduces the manual effort in feature engineering.

Feature Engineering Transfer Learning

AliExpress Learning-To-Rank: Maximizing Online Model Performance without Going Online

no code implementations25 Mar 2020 Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Yawen Liu, Weijie Shen, Wen-Ji Zhou, Qing Da, An-Xiang Zeng, Han Yu, Yang Yu, Zhi-Hua Zhou

The framework consists of an evaluator that generalizes to evaluate recommendations involving the context, a generator that maximizes the evaluator score by reinforcement learning, and a discriminator that ensures the generalization of the evaluator.

Learning-To-Rank

Novelty-Prepared Few-Shot Classification

1 code implementation1 Mar 2020 Chao Wang, Ruo-Ze Liu, Han-Jia Ye, Yang Yu

We disclose that a classically fully trained feature extractor can leave little embedding space for unseen classes, which keeps the model from well-fitting the new classes.

Classification General Classification

Simultaneous Inference for Massive Data: Distributed Bootstrap

no code implementations ICML 2020 Yang Yu, Shih-Kang Chao, Guang Cheng

In this paper, we propose a bootstrap method applied to massive data processed distributedly in a large number of machines.

Residual Bootstrap Exploration for Bandit Algorithms

no code implementations19 Feb 2020 Chi-Hua Wang, Yang Yu, Botao Hao, Guang Cheng

In this paper, we propose a novel perturbation-based exploration method in bandit algorithms with bounded or unbounded rewards, called residual bootstrap exploration (ReBoot).

Computational Efficiency Multi-Armed Bandits +1

Temporal-adaptive Hierarchical Reinforcement Learning

no code implementations6 Feb 2020 Wen-Ji Zhou, Yang Yu

Hierarchical reinforcement learning (HRL) helps address large-scale and sparse reward issues in reinforcement learning.

Atari Games Hierarchical Reinforcement Learning +2

Robust Data-driven Profile-based Pricing Schemes

no code implementations12 Dec 2019 Jingshi Cui, Haoxiang Wang, Chenye Wu, Yang Yu

To enable an efficient electricity market, a good pricing scheme is of vital importance.

Clustering

A Data-driven Storage Control Framework for Dynamic Pricing

no code implementations1 Dec 2019 Jiaman Wu, Zhiqi Wang, Chenye Wu, Kui Wang, Yang Yu

Dynamic pricing is both an opportunity and a challenge to the demand side.

Bridging Machine Learning and Logical Reasoning by Abductive Learning

1 code implementation NeurIPS 2019 Wang-Zhou Dai, Qiu-Ling Xu, Yang Yu, Zhi-Hua Zhou

In the area of artificial intelligence (AI), the two abilities are usually realised by machine learning and logic programming, respectively.

BIG-bench Machine Learning Logical Reasoning

Improving Fictitious Play Reinforcement Learning with Expanding Models

no code implementations27 Nov 2019 Rong-Jun Qin, Jing-Cheng Pang, Yang Yu

However, learning to beat a pool in stochastic games, i.e., a wide distribution over policy models, is either sample-consuming or insufficient to exploit all models with a limited amount of samples.

reinforcement-learning Reinforcement Learning (RL)

Vulnerability Analysis for Data Driven Pricing Schemes

no code implementations18 Nov 2019 Jingshi Cui, Haoxiang Wang, Chenye Wu, Yang Yu

In this paper, from an adversarial machine learning point of view, we examine the vulnerability of data-driven electricity market design.

BIG-bench Machine Learning Clustering

On Value Discrepancy of Imitation Learning

no code implementations16 Nov 2019 Tian Xu, Ziniu Li, Yang Yu

We also show that the framework leads to the value discrepancy of GAIL in an order of $O((1-\gamma)^{-1})$.

Imitation Learning

Optimal Storage Control for Dynamic Pricing

no code implementations16 Nov 2019 Jiaman Wu, Zhiqi Wang, Yang Yu, Chenye Wu

Renewable energy brings huge uncertainties to the power system, which challenges the traditional power system operation with limited flexible resources.

Systems and Control Optimization and Control

Conductor Galloping Prediction on Imbalanced Datasets: SVM with Smart Sampling

no code implementations9 Nov 2019 Kui Wang, Jian Sun, Chenye Wu, Yang Yu

Conductor galloping is the high-amplitude, low-frequency oscillation of overhead power lines due to wind.

Signal Combination for Language Identification

no code implementations21 Oct 2019 Shengye Wang, Li Wan, Yang Yu, Ignacio Lopez Moreno

We compare the performance of a lattice-based ensemble model and a deep neural network model to combine signals from recognizers with that of a baseline that only uses low-level acoustic signals.

Language Identification speech-recognition +1

Deep exploration by novelty-pursuit with maximum state entropy

no code implementations25 Sep 2019 Zi-Niu Li, Xiong-Hui Chen, Yang Yu

Efficient exploration is essential to reinforcement learning in huge state space.

Efficient Exploration

Hierarchic Neighbors Embedding

no code implementations16 Sep 2019 Shenglan Liu, Yang Yu, Yang Liu, Hong Qiao, Lin Feng, Jiashi Feng

Manifold learning now plays a very important role in machine learning and many relevant applications.

On the Robustness of Median Sampling in Noisy Evolutionary Optimization

no code implementations28 Jul 2019 Chao Bian, Chao Qian, Yang Yu, Ke Tang

Sampling is a popular strategy, which evaluates the objective a couple of times, and employs the mean of these evaluation results as an estimate of the objective value.

Evolutionary Algorithms

Running Time Analysis of the (1+1)-EA for Robust Linear Optimization

no code implementations17 Jun 2019 Chao Bian, Chao Qian, Ke Tang, Yang Yu

Evolutionary algorithms (EAs) have found many successful real-world applications, where the optimization problems are often subject to a wide range of uncertainties.

Evolutionary Algorithms

Key Ingredients of Self-Driving Cars

no code implementations7 Jun 2019 Rui Fan, Jianhao Jiao, Haoyang Ye, Yang Yu, Ioannis Pitas, Ming Liu

Over the past decade, many research articles have been published in the area of autonomous driving.

Autonomous Driving Self-Driving Cars

Knowledge-augmented Column Networks: Guiding Deep Learning with Advice

no code implementations31 May 2019 Mayukh Das, Devendra Singh Dhami, Yang Yu, Gautam Kunapuli, Sriraam Natarajan

Recently, deep models have had considerable success in several tasks, especially with low-level representations.

BIG-bench Machine Learning

Cascaded Algorithm-Selection and Hyper-Parameter Optimization with Extreme-Region Upper Confidence Bound Bandit

no code implementations31 May 2019 Yi-Qi Hu, Yang Yu, Jun-Da Liao

We show theoretically that the ER-UCB has a regret upper bound $O\left(K \ln n\right)$ with independent feedbacks, which is as efficient as the classical UCB bandit.

AutoML

Computer-aided Detection of Squamous Carcinoma of the Cervix in Whole Slide Images

no code implementations27 May 2019 Ye Tian, Li Yang, Wei Wang, Jing Zhang, Qing Tang, Mili Ji, Yang Yu, Yu Li, Hong Yang, Airong Qian

Traditionally, the most indispensable diagnosis of cervix squamous carcinoma is histopathological assessment, which is performed under a microscope by a pathologist.

whole slide images

Automatic Calibration of Multiple 3D LiDARs in Urban Environments

no code implementations13 May 2019 Jianhao Jiao, Yang Yu, Qinghai Liao, Haoyang Ye, Ming Liu

Multiple LiDARs have progressively emerged on autonomous vehicles for rendering a wide field of view and dense measurements.

Autonomous Vehicles Translation

Human-Guided Column Networks: Augmenting Deep Learning with Advice

no code implementations ICLR 2019 Mayukh Das, Yang Yu, Devendra Singh Dhami, Gautam Kunapuli, Sriraam Natarajan

While deep models are extremely successful in several applications, especially with low-level representations, sparse and noisy samples and structured domains (with multiple objects and interactions) remain open challenges for most of them.

A Novel Dual-Lidar Calibration Algorithm Using Planar Surfaces

no code implementations27 Apr 2019 Jianhao Jiao, Qinghai Liao, Yilong Zhu, Tianyu Liu, Yang Yu, Rui Fan, Lujia Wang, Ming Liu

Multiple lidars are prevalently used on mobile vehicles for rendering a broad view to enhance the performance of localization and perception systems.

Translation

Human-Guided Learning of Column Networks: Augmenting Deep Learning with Advice

no code implementations15 Apr 2019 Mayukh Das, Yang Yu, Devendra Singh Dhami, Gautam Kunapuli, Sriraam Natarajan

Recently, deep models have been successfully applied in several applications, especially with low-level representations.

PointIT: A Fast Tracking Framework Based on 3D Instance Segmentation

no code implementations18 Feb 2019 Yu-An Wang, Yang Yu, Ming Liu

Finally, we extend the Sort algorithm with this instance framework to realize tracking in the 3D LiDAR point cloud data.

3D Instance Segmentation Semantic Segmentation

Tuplemax Loss for Language Identification

1 code implementation29 Nov 2018 Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno

In many scenarios of a language identification task, the user will specify a small set of languages which he/she can speak instead of a large set of all possible languages.

Language Identification

Day-to-Day Dynamic Traffic Assignment with Imperfect Information, Bounded Rationality and Information Sharing

1 code implementation26 Nov 2018 Yang Yu, Ke Han, Washington Ochieng

These two variants, serving as base models, are further extended with two features: bounded rationality (BR) and information sharing.

Physics and Society Optimization and Control

Analysis of Noisy Evolutionary Optimization When Sampling Fails

no code implementations11 Oct 2018 Chao Qian, Chao Bian, Yang Yu, Ke Tang, Xin Yao

In noisy evolutionary optimization, sampling is a common strategy to deal with noise.

Exploration by Uncertainty in Reward Space

no code implementations27 Sep 2018 Wei-Yang Qu, Yang Yu, Tang-Jie Lv, Ying-Feng Chen, Chang-Jie Fan

There are two policies in this approach: the exploration policy is used for exploratory sampling in the environment, and the benchmark policy is then updated using the data provided by the exploration policy.

Atari Games Efficient Exploration +1

Multi-Layered Gradient Boosting Decision Trees

1 code implementation NeurIPS 2018 Ji Feng, Yang Yu, Zhi-Hua Zhou

Multi-layered representation is believed to be the key ingredient of deep neural networks especially in cognitive tasks like computer vision.

Representation Learning

Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application

1 code implementation2 Mar 2018 Yujing Hu, Qing Da, An-Xiang Zeng, Yang Yu, Yinghui Xu

For better utilizing the correlation between different ranking steps, in this paper, we propose to use reinforcement learning (RL) to learn an optimal ranking policy which maximizes the expected accumulative rewards in a search session.

Decision Making Learning-To-Rank +1

Tunneling Neural Perception and Logic Reasoning through Abductive Learning

1 code implementation4 Feb 2018 Wang-Zhou Dai, Qiu-Ling Xu, Yang Yu, Zhi-Hua Zhou

Perception and reasoning are basic human abilities that are seamlessly connected as part of human intelligence.

ZOOpt: Toolbox for Derivative-Free Optimization

3 code implementations31 Dec 2017 Yu-Ren Liu, Yi-Qi Hu, Hong Qian, Chao Qian, Yang Yu

Recent advances in derivative-free optimization allow efficient approximation of the global-optimal solutions of sophisticated functions, such as functions with many local optima, non-differentiable and non-continuous functions.

BIG-bench Machine Learning Distributed Optimization

Subset Selection under Noise

no code implementations NeurIPS 2017 Chao Qian, Jing-Cheng Shi, Yang Yu, Ke Tang, Zhi-Hua Zhou

The problem of selecting the best $k$-element subset from a universe is involved in many applications.

Maximizing Submodular or Monotone Approximately Submodular Functions by Multi-objective Evolutionary Algorithms

no code implementations20 Nov 2017 Chao Qian, Yang Yu, Ke Tang, Xin Yao, Zhi-Hua Zhou

To provide a general theoretical explanation of the behavior of EAs, it is desirable to study their performance on general classes of combinatorial optimization problems.

Combinatorial Optimization Evolutionary Algorithms

Open-Category Classification by Adversarial Sample Generation

no code implementations24 May 2017 Yang Yu, Wei-Yang Qu, Nan Li, Zimin Guo

ASG generates positive and negative samples of seen categories in an unsupervised manner via an adversarial learning strategy.

Classification General Classification

End-to-End Answer Chunk Extraction and Ranking for Reading Comprehension

no code implementations31 Oct 2016 Yang Yu, Wei Zhang, Kazi Hasan, Mo Yu, Bing Xiang, Bo-Wen Zhou

This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that is able to extract and rank a set of answer candidates from a given document to answer questions.

Question Answering Reading Comprehension

A Lower Bound Analysis of Population-based Evolutionary Algorithms for Pseudo-Boolean Functions

no code implementations10 Jun 2016 Chao Qian, Yang Yu, Zhi-Hua Zhou

Our results imply that the increase of population size, while usually desired in practice, bears the risk of increasing the lower bound of the running time and thus should be carefully considered.

Evolutionary Algorithms

Subset Selection by Pareto Optimization

no code implementations NeurIPS 2015 Chao Qian, Yang Yu, Zhi-Hua Zhou

Selecting the optimal subset from a large set of variables is a fundamental problem in various learning tasks such as feature selection, sparse regression, dictionary learning, etc.

Dictionary Learning feature selection +1

Empirical Study on Deep Learning Models for Question Answering

no code implementations26 Oct 2015 Yang Yu, Wei Zhang, Chung-Wei Hang, Bing Xiang, Bo-Wen Zhou

In this paper, we explore deep learning models with a memory component or an attention mechanism for the question answering task.

Machine Translation Question Answering +1

Structured Memory for Neural Turing Machines

no code implementations14 Oct 2015 Wei Zhang, Yang Yu, Bo-Wen Zhou

Neural Turing Machines (NTM) contain a memory component that simulates "working memory" in the brain to store and retrieve information, easing the learning of simple algorithms.

Recognizing Extended Spatiotemporal Expressions by Actively Trained Average Perceptron Ensembles

no code implementations19 Aug 2015 Wei Zhang, Yang Yu, Osho Gupta, Judith Gelernter

We collected and annotated a data set by querying a commercial web search API with spatiotemporal expressions that were missed by state-of-the-art parsers.

Active Learning Ensemble Learning +1

The Sampling-and-Learning Framework: A Statistical View of Evolutionary Algorithms

no code implementations24 Jan 2014 Yang Yu, Hong Qian

By summarizing a large range of EAs into the sampling-and-learning framework, we show that the framework directly admits a general analysis on the probable-absolute-approximate (PAA) query complexity.

Binary Classification Evolutionary Algorithms +2

Analyzing Evolutionary Optimization in Noisy Environments

no code implementations20 Nov 2013 Chao Qian, Yang Yu, Zhi-Hua Zhou

On a representative problem where the noise has a strong negative effect, we examine two commonly employed mechanisms in EAs dealing with noise, the re-evaluation and the threshold selection strategies.

Evolutionary Algorithms
