Search Results for author: Jing Yu

Found 42 papers, 18 papers with code

Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation

no code implementations 31 Mar 2024 Yitian Tao, Liyan Ma, Jing Yu, Han Zhang

To ensure the semantic consistency of the retrieved cross-modal prior knowledge, a cross-modal semantic alignment module (SAM) is proposed.

InternLM2 Technical Report

1 code implementation 26 Mar 2024 Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

4k Long-Context Understanding

Scientific Large Language Models: A Survey on Biological & Chemical Domains

1 code implementation 26 Jan 2024 Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Renjun Xu, Hongyang Chen, Xiaohui Fan, Huabin Xing, Huajun Chen

Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence.

Binary Linear Tree Commitment-based Ownership Protection for Distributed Machine Learning

no code implementations 11 Jan 2024 Tianxiu Xie, Keke Gai, Jing Yu, Liehuang Zhu

Distributed machine learning enables parallel training of extensive datasets by delegating computing tasks across multiple workers.

Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service

1 code implementation 10 Nov 2023 Yuanmin Tang, Jing Yu, Keke Gai, Xiangyan Qu, Yue Hu, Gang Xiong, Qi Wu

Our extensive experiments on various datasets indicate that the proposed watermarking approach is effective and safe for verifying the copyright of VLPs for multi-modal EaaS and robust against model extraction attacks.

Model extraction

VFedMH: Vertical Federated Learning for Training Multiple Heterogeneous Models

no code implementations 20 Oct 2023 Shuo Wang, Keke Gai, Jing Yu, Liehuang Zhu, Kim-Kwang Raymond Choo, Bin Xiao

Then the passive party, who owns only features of the sample, injects the blinding factor into the local embedding and sends it to the active party.

Vertical Federated Learning

Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval

1 code implementation 28 Sep 2023 Yuanmin Tang, Jing Yu, Keke Gai, Jiamin Zhuang, Gang Xiong, Yue Hu, Qi Wu

Unlike the Composed Image Retrieval task, which requires expensive labels for training task-specific models, Zero-Shot Composed Image Retrieval (ZS-CIR) involves diverse tasks with a broad range of visual content manipulation intents that could be related to domain, scene, object, and attribute.

Attribute Image Retrieval +4

Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search

1 code implementation 28 Sep 2023 Yuanmin Tang, Jing Yu, Keke Gai, Yujing Wang, Yue Hu, Gang Xiong, Qi Wu

Conventional research mainly models the implicit correlations between images and texts for query-ads matching, ignoring the alignment of detailed product information and resulting in suboptimal search performance. In this work, we propose a simple alignment network for explicitly mapping fine-grained visual parts in ads images to the corresponding text, which leverages the co-occurrence structure consistency between vision and language spaces without requiring expensive labeled training data.

Image-text matching Natural Language Queries

Learning the Uncertainty Sets for Control Dynamics via Set Membership: A Non-Asymptotic Analysis

no code implementations 26 Sep 2023 YingYing Li, Jing Yu, Lauren Conger, Adam Wierman

Further, this result is applied to robust adaptive model predictive control with uncertainty sets updated by set membership.

Model Predictive Control

Towards Fast and Accurate Image-Text Retrieval with Self-Supervised Fine-Grained Alignment

1 code implementation 27 Aug 2023 Jiamin Zhuang, Jing Yu, Yang Ding, Xiangyan Qu, Yue Hu

Image-text retrieval requires the system to bridge the heterogeneous gap between vision and language for accurate retrieval while keeping the network lightweight enough for efficient retrieval.

Contrastive Learning Retrieval +1

A Comprehensive Survey for Evaluation Methodologies of AI-Generated Music

no code implementations 26 Aug 2023 Zeyu Xiong, Weitao Wang, Jing Yu, Yue Lin, Ziyan Wang

In recent years, AI-generated music has made significant progress, with several models performing well in multimodal and complex musical genres and scenes.

Online learning for robust voltage control under uncertain grid topology

1 code implementation 29 Jun 2023 Christopher Yeh, Jing Yu, Yuanyuan Shi, Adam Wierman

In this work, we combine a nested convex body chasing algorithm with a robust predictive controller to achieve provably finite-time convergence to safe voltage limits in the online setting where there is uncertainty in both the network topology and the load and generation variations.

Convolution-enhanced Evolving Attention Networks

1 code implementation 16 Dec 2022 Yujing Wang, Yaming Yang, Zhuo Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

To the best of our knowledge, this is the first work that explicitly models the layer-wise evolution of attention maps.

Image Classification Machine Translation +3

On Infinite-horizon System Level Synthesis Problems

no code implementations 28 Oct 2022 Olle Kjellqvist, Jing Yu

System level synthesis is a promising approach that formulates structured optimal controller synthesis problems as convex problems.

Differentiable Bilevel Programming for Stackelberg Congestion Games

1 code implementation 15 Sep 2022 Jiayang Li, Jing Yu, Qianni Wang, Boyi Liu, Zhaoran Wang, Yu Marco Nie

A Stackelberg congestion game (SCG) is a bilevel program in which a leader aims to maximize their own gain by anticipating and manipulating the equilibrium state at which followers settle by playing a congestion game.

Robust Online Voltage Control with an Unknown Grid Topology

1 code implementation 29 Jun 2022 Christopher Yeh, Jing Yu, Yuanyuan Shi, Adam Wierman

Voltage control generally requires accurate information about the grid's topology in order to guarantee network stability.

MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering

1 code implementation CVPR 2022 Yang Ding, Jing Yu, Bang Liu, Yue Hu, Mingxin Cui, Qi Wu

Knowledge-based visual question answering requires the ability to associate external knowledge for open-ended cross-modal scene understanding.

Implicit Relations Question Answering +2

Online Adversarial Stabilization of Unknown Networked Systems

no code implementations 5 Mar 2022 Jing Yu, Dimitar Ho, Adam Wierman

We investigate the problem of stabilizing an unknown networked linear system under communication constraints and adversarial disturbances.

ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification

1 code implementation 13 Feb 2022 Xinjie Lin, Gang Xiong, Gaopeng Gou, Zhen Li, Junzheng Shi, Jing Yu

In this paper, we propose a new traffic representation model called Encrypted Traffic Bidirectional Encoder Representations from Transformer (ET-BERT), which pre-trains deep contextualized datagram-level representation from large-scale unlabeled data.

Classification Management +1

Coarse-to-Careful: Seeking Semantic-related Knowledge for Open-domain Commonsense Question Answering

no code implementations 4 Jul 2021 Luxi Xing, Yue Hu, Jing Yu, Yuqiang Xie, Wei Peng

It is common to use external knowledge to help machines answer questions that need background commonsense, but unrestricted knowledge introduces noisy and misleading information.

Question Answering

MCR-Net: A Multi-Step Co-Interactive Relation Network for Unanswerable Questions on Machine Reading Comprehension

no code implementations 8 Mar 2021 Wei Peng, Yue Hu, Jing Yu, Luxi Xing, Yuqiang Xie, Zihao Zhu, Yajing Sun

Most existing systems design a simple classifier to determine answerability implicitly, without explicitly modeling the mutual interaction and relation between the question and passage, leading to poor performance in identifying unanswerable questions.

Machine Reading Comprehension Question Answering +2

Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees

1 code implementation EACL 2021 Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong

Pre-trained language models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information.

Natural Language Understanding

One-photon Solutions to Multiqubit Multimode quantum Rabi model

no code implementations 22 Feb 2021 Jie Peng, Juncong Zheng, Jing Yu, Pinghua Tang, G. Alvarado Barrios, Jianxin Zhong, Enrique Solano, F. Albarran-Arriagada, Lucas Lamata

General solutions to the quantum Rabi model involve subspaces with unbounded number of photons.

Quantum Physics Optics

Evolving Attention with Residual Convolutions

2 code implementations 20 Feb 2021 Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

In this paper, we propose a novel and generic mechanism based on evolving attention to improve the performance of transformers.

Image Classification Machine Translation +2

Predictive Attention Transformer: Improving Transformer with Attention Map Prediction

no code implementations 1 Jan 2021 Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Yunhai Tong

Instead, we model their dependencies via a chain of prediction models that take previous attention maps as input to predict the attention maps of a new layer through convolutional neural networks.

Machine Translation

Bi-directional Cognitive Thinking Network for Machine Reading Comprehension

no code implementations COLING 2020 Wei Peng, Yue Hu, Luxi Xing, Yuqiang Xie, Jing Yu, Yajing Sun, Xiangpeng Wei

We propose a novel Bi-directional Cognitive Knowledge Framework (BCKF) for reading comprehension from the perspective of complementary learning systems theory.

Machine Reading Comprehension

End-to-End Learning and Intervention in Games

no code implementations NeurIPS 2020 Jiayang Li, Jing Yu, Yu Nie, Zhaoran Wang

In this paper, we provide a unified framework for learning and intervention in games.

Bi-directional Cognitive Thinking Network for Machine Reading Comprehension

no code implementations 20 Oct 2020 Wei Peng, Yue Hu, Luxi Xing, Yuqiang Xie, Jing Yu, Yajing Sun, Xiangpeng Wei

We propose a novel Bi-directional Cognitive Knowledge Framework (BCKF) for reading comprehension from the perspective of complementary learning systems theory.

Machine Reading Comprehension

Localized and Distributed H2 State Feedback Control

no code implementations 6 Oct 2020 Jing Yu, Yuh-Shyang Wang, James Anderson

Distributed linear control design is crucial for large-scale cyber-physical systems.

CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation

1 code implementation 16 Sep 2020 Jing Yu, Yuan Chai, Yujing Wang, Yue Hu, Qi Wu

We first build a cognitive structure CogTree to organize the relationships based on the prediction of a biased SGG model.

Ranked #2 on Scene Graph Generation on Visual Genome (mean Recall @20 metric)

Graph Generation Unbiased Scene Graph Generation

Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering

no code implementations 31 Aug 2020 Jing Yu, Zihao Zhu, Yujing Wang, Weifeng Zhang, Yue Hu, Jianlong Tan

Finally, we apply graph neural networks to infer the globally optimal answer by jointly considering all the concepts.

Knowledge Graphs Question Answering +1

KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue

no code implementations 11 Aug 2020 Xiaoze Jiang, Siyi Du, Zengchang Qin, Yajing Sun, Jing Yu

Visual dialogue is a challenging task that needs to extract implicit information from both visual (image) and textual (dialogue history) contexts.

Information Retrieval Retrieval

DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

4 code implementations 7 Jul 2020 Xiaoze Jiang, Jing Yu, Yajing Sun, Zengchang Qin, Zihao Zhu, Yue Hu, Qi Wu

The ability to generate detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation.

Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering

no code implementations 16 Jun 2020 Zihao Zhu, Jing Yu, Yujing Wang, Yajing Sun, Yue Hu, Qi Wu

In this paper, we depict an image by a multi-modal heterogeneous graph, which contains multiple layers of information corresponding to the visual, semantic and factual features.

Question Answering Visual Question Answering

DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue

1 code implementation 17 Nov 2019 Xiaoze Jiang, Jing Yu, Zengchang Qin, Yingying Zhuang, Xingxing Zhang, Yue Hu, Qi Wu

More importantly, we can tell which modality (visual or semantic) has more contribution in answering the current question by visualizing the gate values.

feature selection Question Answering +2

Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering

no code implementations 23 Dec 2018 Zhuoqian Yang, Zengchang Qin, Jing Yu, Yue Hu

Upon the constructed graph, we propose a Scene Graph Convolutional Network (SceneGCN) to jointly reason the object properties and relational semantics for the correct answer.

Cross-Modal Information Retrieval Information Retrieval +2

Dual Reweighted Lp-Norm Minimization for Salt-and-pepper Noise Removal

no code implementations 22 Nov 2018 Huiwen Dong, Jing Yu, Chuangbai Xiao

The robust principal component analysis (RPCA), which aims to estimate underlying low-rank and sparse structures from the degraded observation data, has found wide applications in computer vision.

Salt-And-Pepper Noise Removal

Edge-Based Blur Kernel Estimation Using Sparse Representation and Self-Similarity

no code implementations 17 Nov 2018 Jing Yu, Zhenchun Chang, Chuangbai Xiao

Blind image deconvolution is the problem of recovering the latent image from the only observed blurry image when the blur kernel is unknown.

Deblurring Image Deconvolution

Textual Relationship Modeling for Cross-Modal Information Retrieval

1 code implementation 31 Oct 2018 Jing Yu, Chenghao Yang, Zengchang Qin, Zhuoqian Yang, Yue Hu, Yanbing Liu

A joint neural model is proposed to learn feature representation individually in each modality.

Multimedia
