Search Results for author: Wei zhang

Found 521 papers, 159 papers with code

Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

no code implementations ECNLP (ACL) 2022 Zheng Liu, Wei zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder

Recently, semantic search has been successfully applied to E-commerce product search, and the learned semantic space for query and product encoding is expected to generalize well to unseen queries or products.

text similarity

HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction

no code implementations ECCV 2020 Tianjiao Li, Jun Liu, Wei zhang, Ling-Yu Duan

In this paper, we propose a novel Hardness-AwaRe Discrimination Network (HARD-Net) to specifically investigate the relationships between similar activity pairs that are hard to discriminate.

Activity Prediction Skeleton Based Action Recognition

Unsupervised Multi-View CNN for Salient View Selection of 3D Objects and Scenes

1 code implementation ECCV 2020 Ran Song, Wei zhang, Yitian Zhao, Yonghuai Liu

We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition that we name view-object consistency, which states that a 3D object and its projected 2D views always belong to the same object class.

Object

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

no code implementations 14 Apr 2024 Lewei Yao, Renjie Pi, Jianhua Han, Xiaodan Liang, Hang Xu, Wei zhang, Zhenguo Li, Dan Xu

This is followed by a fine-tuning stage that leverages a small number of high-resolution samples to further enhance detection performance.

Dense Captioning Language Modelling +4

Fast Gradient Computation for Gromov-Wasserstein Distance

no code implementations 13 Apr 2024 Wei zhang, ZiHao Wang, Jie Fan, Hao Wu, Yong Zhang

In this way, the original computational bottleneck is broken and the new entropic solution can be obtained with total quadratic time, which is almost optimal complexity.

JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models

no code implementations 12 Apr 2024 Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Minfeng Zhu, Wei zhang, Wei Chen

Addressing these concerns necessitates a comprehensive analysis of jailbreak prompts to evaluate LLMs' defensive capabilities and identify potential weaknesses.

Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS

1 code implementation 9 Apr 2024 Afzal Ahmad, Linfeng Du, Zhiyao Xie, Wei zhang

We present a technique that allows searching for training proxies that reduce the cost of benchmark construction by significant margins, making it possible to construct realistic NAS benchmarks for large-scale datasets.

Benchmarking Neural Architecture Search

Map Optical Properties to Subwavelength Structures Directly via a Diffusion Model

no code implementations 9 Apr 2024 Shijie Rao, Kaiyu Cui, Yidong Huang, Jiawei Yang, YaLi Li, Shengjin Wang, Xue Feng, Fang Liu, Wei zhang

The inverse design methods proposed for these subwavelength structures are vital to the development of new photonic devices.

A diffusion MRI tractography atlas for concurrent white matter mapping across Eastern and Western populations

no code implementations 6 Apr 2024 Yijie Li, Wei zhang, Ye Wu, Li Yin, Ce Zhu, Yuqian Chen, Suheyla Cetin-Karayumak, Kang Ik K Cho, Leo R. Zekelman, Jarrett Rushmore, Yogesh Rathi, Nikos Makris, Lauren J. O'Donnell, Fan Zhang

However, a comprehensive investigation into WM fiber tracts between Eastern and Western populations is challenged due to the lack of a cross-population WM atlas and the large site-specific variability of dMRI data.

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

no code implementations 26 Mar 2024 Jiacheng Zhang, Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

Additionally, we present a DepthGradient Projection (DGP) module to mitigate optimization conflicts caused by noisy depth supervision of pseudo-labels, effectively decoupling the depth gradient and removing conflicting gradients.

Monocular 3D Object Detection object-detection +1

Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection

1 code implementation ICCV 2023 Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, YingYing Li, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

To tackle the confirmation bias from incorrect pseudo labels of minority classes, the class-rebalancing sampling module resamples unlabeled data following the guidance of the gradient-based reweighting module.

object-detection Object Detection +1

SFOD: Spiking Fusion Object Detector

1 code implementation 22 Mar 2024 Yimeng Fan, Wei zhang, Changsong Liu, Mingyang Li, Wenrui Lu

Thereby, we establish state-of-the-art classification results based on SNNs, achieving 93.7% accuracy on the NCAR dataset.

Object object-detection +1

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model

no code implementations 18 Mar 2024 Runhui Huang, Kaixin Cai, Jianhua Han, Xiaodan Liang, Renjing Pei, Guansong Lu, Songcen Xu, Wei zhang, Hang Xu

Specifically, an inter-layer attention module is designed to encourage information exchange and learning between layers, while a text-guided intra-layer attention module incorporates layer-specific prompts to direct the specific-content generation for each layer.

Image Generation Style Transfer

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation

no code implementations 18 Mar 2024 Haochen Jiang, Yueming Xu, Yihan Zeng, Hang Xu, Wei zhang, Jianfeng Feng, Li Zhang

We model the geometric structure of the scene with occupancy representation and distill the pre-trained open vocabulary model into a 3D language field via volume rendering for zero-shot inference.

3D Reconstruction 3D Scene Reconstruction +3

Affective Behaviour Analysis via Integrating Multi-Modal Knowledge

no code implementations 16 Mar 2024 Wei zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tiancheng Guo, Xin Yu

Affective Behavior Analysis aims to make technology emotionally smart, creating a world where devices can understand and react to our emotions as humans do.

OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework

no code implementations 13 Mar 2024 Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei zhang, Wenqiang Zhang

Contemporary Video Object Segmentation (VOS) approaches typically consist of stages for feature extraction, matching, memory management, and multiple-object aggregation.

Management Semantic Segmentation +2

Query-guided Prototype Evolution Network for Few-Shot Segmentation

no code implementations 11 Mar 2024 Runmin Cong, Hang Xiong, Jinpeng Chen, Wei zhang, Qingming Huang, Yao Zhao

To address this, we present the Query-guided Prototype Evolution Network (QPENet), a new method that integrates query features into the generation process of foreground and background prototypes, thereby yielding customized prototypes attuned to specific queries.

Segmentation

ClickVOS: Click Video Object Segmentation

no code implementations 10 Mar 2024 Pinxue Guo, Lingyi Hong, Xinyu Zhou, Shuyong Gao, Wanyun Li, Jinglun Li, Zhaoyu Chen, Xiaoqiang Li, Wei zhang, Wenqiang Zhang

To address these limitations, we propose the setting named Click Video Object Segmentation (ClickVOS) which segments objects of interest across the whole video according to a single click per object in the first frame.

Object Segmentation +3

Aligning Large Language Models for Controllable Recommendations

no code implementations 8 Mar 2024 Wensheng Lu, Jianxun Lian, Wei zhang, Guanghua Li, Mingyang Zhou, Hao Liao, Xing Xie

Inspired by the exceptional general intelligence of Large Language Models (LLMs), researchers have begun to explore their application in pioneering the next generation of recommender systems - systems that are conversational, explainable, and controllable.

Recommendation Systems

SMAUG: A Sliding Multidimensional Task Window-Based MARL Framework for Adaptive Real-Time Subtask Recognition

no code implementations 4 Mar 2024 Wenjing Zhang, Wei zhang

Instead of making behavioral decisions directly from the exponentially expanding joint observational-action space, subtask-based multi-agent reinforcement learning (MARL) methods enable agents to learn how to tackle different subtasks.

Hierarchical Reinforcement Learning Multi-agent Reinforcement Learning +4

StaPep: an open-source tool for the structure prediction and feature extraction of hydrocarbon-stapled peptides

1 code implementation 28 Feb 2024 Zhe Wang, Jianping Wu, Mengjun Zheng, Chenchen Geng, Borui Zhen, Wei zhang, Hui Wu, Zhengyang Xu, Gang Xu, Si Chen, Xiang Li

Many tools exist for extracting structural and physicochemical descriptors from linear peptides to predict their properties, but similar tools for hydrocarbon-stapled peptides are lacking. Here, we present StaPep, a Python-based toolkit designed for generating 2D/3D structures and calculating 21 distinct features for hydrocarbon-stapled peptides. The current version supports hydrocarbon-stapled peptides containing 2 non-standard amino acids (norleucine and 2-aminoisobutyric acid) and 6 nonnatural anchoring residues (S3, S5, S8, R3, R5 and R8). We then established a hand-curated dataset of 201 hydrocarbon-stapled peptides and 384 linear peptides with sequence information and experimental membrane permeability, to showcase StaPep's application in artificial intelligence projects. A machine learning-based predictor utilizing the above calculated features was developed, with an AUC of 0.85, for identifying cell-penetrating hydrocarbon-stapled peptides. StaPep's pipeline spans data retrieval, cleaning, structure generation, molecular feature calculation, and machine learning model construction for hydrocarbon-stapled peptides. The source code and dataset are freely available on GitHub: https://github.com/dahuilangda/stapep_package.

Retrieval

Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging

1 code implementation 28 Feb 2024 Wei zhang, Hongcheng Guo, Anjie Le, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Shi Xu, Runqiang Zang, Liangfan Zheng, Bo Zhang

Log parsing, which entails transforming raw log messages into structured templates, constitutes a critical phase in the automation of log analytics.

Log Parsing

Don't Forget Your Reward Values: Language Model Alignment via Value-based Calibration

1 code implementation 25 Feb 2024 Xin Mao, Feng-Lin Li, Huimin Xu, Wei zhang, Anh Tuan Luu

While Reinforcement Learning from Human Feedback (RLHF) significantly enhances the generation quality of Large Language Models (LLMs), recent studies have raised concerns regarding the complexity and instability associated with the Proximal Policy Optimization (PPO) algorithm, proposing a series of order-based calibration methods as viable alternatives.

Language Modelling

Gaussian Process Neural Additive Models

1 code implementation 19 Feb 2024 Wei zhang, Brian Barr, John Paisley

Deep neural networks have revolutionized many fields, but their black-box nature also occasionally prevents their wider adoption in fields such as healthcare and finance, where interpretable and explainable models are required.

Additive models Explainable Models

Do Large Language Models Understand Logic or Just Mimick Context?

no code implementations 19 Feb 2024 Junbing Yan, Chengyu Wang, Jun Huang, Wei zhang

Over the past few years, large language models (LLMs) have received extensive attention for their abilities, performing exceptionally well in complicated scenarios such as logical reasoning and symbolic inference.

counterfactual In-Context Learning +1

Pattern-wise Transparent Sequential Recommendation

no code implementations 18 Feb 2024 Kun Ma, Cong Xu, Zeyuan Chen, Wei zhang

However, achieving both model transparency and recommendation performance simultaneously is challenging, especially for models that take the entire sequence of items as input without screening.

Decision Making Sequential Recommendation

Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach

2 code implementations 13 Feb 2024 Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei zhang, Li Zhang

Instead, our work establishes a unified representation of both types of data domain by projecting both Euclidean and non-Euclidean data into an integer series called RoadNet Sequence.

Understanding the Role of Cross-Entropy Loss in Fairly Evaluating Large Language Model-based Recommendation

no code implementations 9 Feb 2024 Cong Xu, Zhangchi Zhu, Jun Wang, Jianyong Wang, Wei zhang

Large language models (LLMs) have gained much attention in the recommendation community; some studies have observed that LLMs, fine-tuned by the cross-entropy loss with a full softmax, could achieve state-of-the-art performance already.

Language Modelling Large Language Model

EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain

1 code implementation 30 Jan 2024 Wei zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao

Multi-modal large language models (MLLMs) have demonstrated remarkable success in vision and visual-language tasks within the natural image domain.

Image Comprehension Instruction Following +2

Fortifying Ethical Boundaries in AI: Advanced Strategies for Enhancing Security in Large Language Models

no code implementations 27 Jan 2024 Yunhong He, Jianling Qiu, Wei zhang, Zhengqing Yuan

Recent advancements in large language models (LLMs) have significantly enhanced capabilities in natural language processing and artificial intelligence.

Question Answering Text Generation

Contrastive Learning with Negative Sampling Correction

no code implementations 13 Jan 2024 Lu Wang, Chao Du, Pu Zhao, Chuan Luo, Zhangchi Zhu, Bo Qiao, Wei zhang, QIngwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

To correct the negative sampling bias, we propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL).

Contrastive Learning Data Augmentation +2

Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

no code implementations 9 Jan 2024 Xuzheng Yu, Chen Jiang, Wei zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important.

Representation Learning Scene Recognition

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models

1 code implementation 2 Jan 2024 Xinpeng Ding, Jinahua Han, Hang Xu, Xiaodan Liang, Wei zhang, Xiaomeng Li

BEV-InMLLM integrates multi-view, spatial awareness, and temporal semantics to enhance MLLMs' capabilities on NuInstruct tasks.

Autonomous Driving

Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach

no code implementations 2 Jan 2024 Prince Aboagye, Yan Zheng, Junpeng Wang, Uday Singh Saini, Xin Dai, Michael Yeh, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Liang Wang, Wei zhang

The emergence of pre-trained models has significantly impacted fields ranging from Natural Language Processing (NLP) and Computer Vision to relational datasets.

Enhancing dysarthria speech feature representation with empirical mode decomposition and Walsh-Hadamard transform

no code implementations 30 Dec 2023 Ting Zhu, Shufei Duan, Camille Dingam, HuiZhi Liang, Wei zhang

This algorithm effectively addresses the challenges of the imbalanced dataset and non-linearity in dysarthric speech and simultaneously provides a robust representation of the local pathological features of the vocal folds and tracts.

imbalanced classification

Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems

1 code implementation 30 Dec 2023 Junhao Shen, Hong Qian, Wei zhang, Aimin Zhou

The SCD framework incorporates the symbolic tree to explicably represent the complicated student-exercise interaction function, and utilizes gradient-based optimization methods to effectively learn the student and exercise parameters.

Attribute cognitive diagnosis

PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion

no code implementations 27 Dec 2023 Guansong Lu, Yuanfan Guo, Jianhua Han, Minzhe Niu, Yihan Zeng, Songcen Xu, Zeyi Huang, Zhao Zhong, Wei zhang, Hang Xu

Current large-scale diffusion models represent a giant leap forward in conditional image synthesis, capable of interpreting diverse cues like text, human poses, and edges.

Computational Efficiency Denoising +1

SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields

no code implementations 26 Dec 2023 Kaichen Zhou, Lanqing Hong, Enze Xie, Yongxin Yang, Zhenguo Li, Wei zhang

Although significant progress has been made in the field of 2D-based interactive editing, fine-grained 3D-based interactive editing remains relatively unexplored.

Interactive Segmentation Segmentation

DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

no code implementations 19 Dec 2023 Yuang Liu, Jing Wang, Qiang Zhou, Fan Wang, Jun Wang, Wei zhang

Numerous self-supervised learning paradigms, such as contrastive learning and masked image modeling, have been proposed to acquire powerful and general representations from unlabeled data.

Contrastive Learning Model Compression +1

Design, construction and evaluation of emotional multimodal pathological speech database

no code implementations 14 Dec 2023 Ting Zhu, Shufei Duan, HuiZhi Liang, Wei zhang

Automatic recognition was tested on speech and glottal data, with an average accuracy of 78% for controls and 60% for patients on audio, versus 51% for controls and 38% for patients on glottal data, indicating an influence of the disease on emotional expression.

Native Language Identification with Large Language Models

no code implementations 13 Dec 2023 Wei zhang, Alexandre Salle

We present the first experiments on Native Language Identification (NLI) using LLMs such as GPT-4.

Language Acquisition Native Language Identification

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

1 code implementation 5 Dec 2023 Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei zhang, LiMin Wang

Now text-to-image foundation models are widely applied to various downstream image synthesis tasks, such as controllable image generation and image editing, while downstream video synthesis tasks are less explored for several reasons.

Image Generation Model Selection +3

Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning

1 code implementation 1 Dec 2023 Jiajun Cui, Minghe Yu, Bo Jiang, Aimin Zhou, Jianyong Wang, Wei zhang

Knowledge tracing (KT) plays a crucial role in computer-aided education and intelligent tutoring systems, aiming to assess students' knowledge proficiency by predicting their future performance on new questions based on their past response records.

counterfactual Counterfactual Reasoning +1

Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper

no code implementations 22 Nov 2023 Chengyu Wang, Junbing Yan, Wei zhang, Jun Huang

This paper delves into the pressing need in Parameter-Efficient Fine-Tuning (PEFT) for Large Language Models (LLMs).

Model Compression Position

Soft Random Sampling: A Theoretical and Empirical Analysis

no code implementations 21 Nov 2023 Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei zhang, George Saon, Brian Kingsbury

Soft random sampling (SRS) is a simple yet effective approach for efficient training of large-scale deep neural networks when dealing with massive data.

Automatic Speech Recognition speech-recognition +1

Multi-Resolution Planar Region Extraction for Uneven Terrains

no code implementations 21 Nov 2023 Yinghan Sun, Linfang Zheng, Hua Chen, Wei zhang

This paper studies the problem of extracting planar regions in uneven terrains from unordered point cloud measurements.

Computational Efficiency

MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations

1 code implementation 16 Nov 2023 Zhenglai Li, Chang Tang, Xinwang Liu, Changdong Li, Xianju Li, Wei zhang

How to capture the semantic variations associated with the changed and unchanged regions from the patch-level annotations to obtain promising change results is the critical challenge for the weakly supervised change detection task.

Change Detection

Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta

no code implementations 16 Nov 2023 Wei zhang, Dai Li, Chen Liang, Fang Zhou, Zhongke Zhang, Xuewei Wang, Ru Li, Yi Zhou, Yaning Huang, Dong Liang, Kai Wang, Zhangyuan Wang, Zhengxing Chen, Min Li, Fenggang Wu, Minghai Chen, Huayu Li, Yunnan Wu, Zhan Shu, Mindi Yuan, Sri Reddy

To address these challenges, we present Scaling User Modeling (SUM), a framework widely deployed in Meta's ads ranking system, designed to facilitate efficient and scalable sharing of online user representation across hundreds of ads models.

Representation Learning

From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models

no code implementations 12 Nov 2023 Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei zhang

Reasoning is a distinctive human capacity, enabling us to address complex problems by breaking them down into a series of manageable cognitive steps.

Language Modelling Logical Reasoning

VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis

no code implementations 9 Nov 2023 Sen Wang, Wei zhang, Stefano Gasperini, Shun-Cheng Wu, Nassir Navab

Creating high-quality view synthesis is essential for immersive applications but continues to be problematic, particularly in indoor environments and for real-time deployment.

From Input to Output: A Multi-layer Knowledge Distillation Framework for Compressing Recommendation Models

no code implementations 8 Nov 2023 Zhangchi Zhu, Wei zhang

In this paper, we decompose recommendation models into three layers, i.e., the input layer, the intermediate layer, and the output layer, and address deficiencies layer by layer.

Knowledge Distillation

Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation

1 code implementation 7 Nov 2023 Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Wei zhang, Si Qin, Saravan Rajmohan, QIngwei Lin, Dongmei Zhang

To address these limitations, we introduce a novel thought prompting approach called "Everything of Thoughts" (XoT) to defy the law of the "Penrose triangle" of existing thought paradigms.

Decision Making

Temporal Treasure Hunt: Content-based Time Series Retrieval System for Discovering Insights

no code implementations 5 Nov 2023 Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Yujie Fan, Vivian Lai, Junpeng Wang, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei zhang

To facilitate this investigation, we introduce a CTSR benchmark dataset that comprises time series data from a variety of domains, such as motion, power demand, and traffic.

Retrieval Time Series +1

Ego-Network Transformer for Subsequence Classification in Time Series Data

no code implementations 5 Nov 2023 Chin-Chia Michael Yeh, Huiyuan Chen, Yujie Fan, Xin Dai, Yan Zheng, Vivian Lai, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn Keogh

The ego-networks of all subsequences collectively form a time series subsequence graph, and we introduce an algorithm to efficiently construct this graph.

Time Series Time Series Classification

Time Series Synthesis Using the Matrix Profile for Anonymization

no code implementations 5 Nov 2023 Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Huiyuan Chen, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn Keogh

As a result, unmodified data mining tools can obtain near-identical performance on the synthesized time series as on the original time series.

Time Series

Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning

no code implementations 2 Nov 2023 Yiran Li, Junpeng Wang, Prince Aboagye, Michael Yeh, Yan Zheng, Liang Wang, Wei zhang, Kwan-Liu Ma

On the one hand, by visually examining the captions automatically generated from language-image models for an image dataset, we gain deeper insights into the semantic underpinnings of the visual contents, unearthing data biases that may be entrenched within the dataset.

Caption Generation Efficient Exploration +1

Lightweight super resolution network for point cloud geometry compression

1 code implementation 2 Nov 2023 Wei zhang, Dingquan Li, Ge Li, Wen Gao

This paper presents an approach for compressing point cloud geometry by leveraging a lightweight super-resolution network.

Point cloud reconstruction Point Cloud Super Resolution +1

BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities

1 code implementation 23 Oct 2023 Binyu Zhao, Wei zhang, Zhaonian Zou

Collaborative perception enables agents to share complementary perceptual information with nearby agents.

Autonomous Driving

Learning Interpretable Rules for Scalable Data Representation and Classification

1 code implementation 22 Oct 2023 Zhuo Wang, Wei zhang, Ning Liu, Jianyong Wang

Rule-based models, e.g., decision trees, are widely used in scenarios demanding high model interpretability for their transparent inner structures and good model expressivity.

Classification

Parallel compressive super-resolution imaging with wide field-of-view based on physics enhanced network

no code implementations 20 Oct 2023 Xiao-Peng Jin, An-Dong Xiong, Wei zhang, Xiao-Qing Wang, Fan Liu, Chang-Heng Li, Xu-Ri Yao, Xue-Feng Liu, Qing Zhao

By training the network with the prior OTF of an arbitrary 128x128-pixel region and fine-tuning the network with other OTFs within rest regions of FOV, we realize both mask optimization and super-resolution imaging with up to 1020x1500 wide FOV.

Super-Resolution

Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks

no code implementations 15 Oct 2023 Ziqiang Li, Pengfei Xia, Hong Sun, Yueqi Zeng, Wei zhang, Bin Li

In this study, we focus on improving the poisoning efficiency of backdoor attacks from the sample selection perspective.

Audio Classification Image Classification +2

HI-SLAM: Monocular Real-time Dense Mapping with Hybrid Implicit Fields

no code implementations 7 Oct 2023 Wei zhang, Tiecheng Sun, Sen Wang, Qing Cheng, Norbert Haala

For global consistency, we propose an efficient Sim(3)-based pose graph bundle adjustment (PGBA) approach to run online loop closing and mitigate the pose and scale drift.

Simultaneous Localization and Mapping

An Efficient Content-based Time Series Retrieval System

no code implementations 5 Oct 2023 Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Junpeng Wang, Vivian Lai, Yujie Fan, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei zhang, Jeff M. Phillips

A Content-based Time Series Retrieval (CTSR) system is an information retrieval system that lets users interact with time series from multiple domains, such as finance, healthcare, and manufacturing.

Information Retrieval Retrieval +1

Toward a Foundation Model for Time Series Data

no code implementations 5 Oct 2023 Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Audrey Der, Vivian Lai, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei zhang

A foundation model is a machine learning model trained on a large and diverse set of data, typically using self-supervised learning-based pre-training techniques, that can be adapted to various downstream tasks.

Self-Supervised Learning Time Series

Feature Interaction Aware Automated Data Representation Transformation

1 code implementation 29 Sep 2023 Ehtesamul Azim, Dongjie Wang, Kunpeng Liu, Wei zhang, Yanjie Fu

Creating an effective representation space is crucial for mitigating the curse of dimensionality, enhancing model generalization, addressing data sparsity, and leveraging classical models more effectively.

Automated Feature Engineering Decision Making +4

Revealing the Power of Spatial-Temporal Masked Autoencoders in Multivariate Time Series Forecasting

no code implementations 26 Sep 2023 Jiarui Sun, Yujie Fan, Chin-Chia Michael Yeh, Wei zhang, Girish Chowdhary

To address these issues, we propose Spatial-Temporal Masked Autoencoders (STMAE), an MTS forecasting framework that leverages masked autoencoders to enhance the performance of spatial-temporal baseline models.

Multivariate Time Series Forecasting Time Series

Graph-enhanced Optimizers for Structure-aware Recommendation Embedding Evolution

no code implementations 24 Sep 2023 Cong Xu, Jun Wang, Jianyong Wang, Wei zhang

Embeddings play a critical role in modern recommender systems because they are virtual representations of real-world entities and the foundation for subsequent decision models.

Recommendation Systems

PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

1 code implementation 21 Sep 2023 Shilin Yan, Xiaohao Xu, Renrui Zhang, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei zhang

Our dataset poses new challenges in panoramic VOS and we hope that our PanoVOS can advance the development of panoramic segmentation/tracking.

Autonomous Driving Segmentation +4

Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval

no code implementations 20 Sep 2023 Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu

SSAN is based on two newly proposed modules in video retrieval: (1) An efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, (2) A robust Similarity Pattern Detection (SPD) module for temporal alignment.

Retrieval Video Retrieval

Multi-view Fuzzy Representation Learning with Rules based Model

2 code implementations 20 Sep 2023 Wei zhang, Zhaohong Deng, Te Zhang, Kup-Sze Choi, Shitong Wang

Second, a new regularization method based on L_{2,1}-norm regression is proposed to mine the consistency information between views, while the geometric structure of the data is preserved through the Laplacian graph.

Representation Learning

An Empirical Study of Attention Networks for Semantic Segmentation

no code implementations 19 Sep 2023 Hao Guo, Hongbiao Si, Guilin Jiang, Wei zhang, Zhiyan Liu, Xuanyi Zhu, xulong Zhang, Yang Liu

Moreover, various methods use attention in semantic segmentation, but a conclusive comparison of these methods is lacking.

Segmentation Semantic Segmentation

SoccerNet 2023 Challenges Results

2 code implementations 12 Sep 2023 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.

Action Spotting Camera Calibration +3

HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving

no code implementations11 Sep 2023 Xinpeng Ding, Jianhua Han, Hang Xu, Wei zhang, Xiaomeng Li

For the first time, we leverage singular multimodal large language models (MLLMs) to consolidate multiple autonomous driving tasks from videos, i.e., the Risk Object Localization and Intention and Suggestion Prediction (ROLISP) task.

Autonomous Driving Object Localization

Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation

no code implementations7 Sep 2023 Jiaxi Gu, Shicong Wang, Haoyu Zhao, Tianyi Lu, Xing Zhang, Zuxuan Wu, Songcen Xu, Wei zhang, Yu-Gang Jiang, Hang Xu

Conditioned on an initial video clip with a small number of frames, additional frames are iteratively generated by reusing the original latent features and following the previous diffusion process.

Action Recognition Denoising +3

Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models

1 code implementation27 Aug 2023 Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, Qizhi Pei, Jie Shao, Wei zhang

Generative pre-trained transformer (GPT) models have revolutionized the field of natural language processing (NLP) with remarkable performance in various tasks and also extend their power to multimodal domains.

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks

1 code implementation22 Aug 2023 Xi Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen Fang, Wei zhang, Tong Geng, Omer Khan, Caiwen Ding

Utilizing these principles, we formulated a kernel for sparse matrix multiplication (SpMM) in GCNs that employs block-level partitioning and combined warp strategy.

Computational Efficiency

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability

no code implementations ICCV 2023 Runhui Huang, Jianhua Han, Guansong Lu, Xiaodan Liang, Yihan Zeng, Wei zhang, Hang Xu

DiffDis first formulates the image-text discriminative problem as a generative diffusion process of the text embedding from the text encoder conditioned on the image.

Image Generation Zero-Shot Learning

Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection

2 code implementations17 Aug 2023 Runmin Cong, Hongyu Liu, Chen Zhang, Wei zhang, Feng Zheng, Ran Song, Sam Kwong

By integrating complementary information from RGB image and depth map, the ability of salient object detection (SOD) for complex and challenging scenes can be improved.

object-detection RGB-D Salient Object Detection +1

Frequency Perception Network for Camouflaged Object Detection

2 code implementations17 Aug 2023 Runmin Cong, Mengyao Sun, Sanyi Zhang, Xiaofei Zhou, Wei zhang, Yao Zhao

Camouflaged object detection (COD) aims to accurately detect objects hidden in the surrounding environment.

Object object-detection +1

SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection

1 code implementation17 Aug 2023 Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei zhang, Yao Zhao, Sam Kwong

Despite significant progress in shadow detection, current methods still struggle with the adverse impact of background color, which may lead to errors when shadows are present on complex backgrounds.

Disentanglement Shadow Detection

Beyond Semantics: Learning a Behavior Augmented Relevance Model with Self-supervised Learning

1 code implementation10 Aug 2023 Zeyuan Chen, Wei Chen, Jia Xu, Zhongyi Liu, Wei zhang

Drawing inspiration from this, we devise a novel Behavior Augmented Relevance Learning model for Alipay Search (BARL-ASe) that leverages neighbor queries of target item and neighbor items of target query to complement target query-item semantic matching.

Self-Supervised Learning Semantic Similarity +1

Gaussian-based Probabilistic Deep Supervision Network for Noise-Resistant QoS Prediction

no code implementations3 Aug 2023 Ziliang Wang, Xiaohong Zhang, Sheng Huang, Wei zhang, Dan Yang, Meng Yan

Quality of Service (QoS) prediction is an essential task in recommendation systems, where accurately predicting unknown QoS values can improve user satisfaction.

Recommendation Systems

Dynamic Token-Pass Transformers for Semantic Segmentation

no code implementations3 Aug 2023 Yuang Liu, Qiang Zhou, Jing Wang, Fan Wang, Jun Wang, Wei zhang

Vision transformers (ViT) usually extract features via forwarding all the tokens in the self-attention layers from top to toe.

Segmentation Semantic Segmentation

Knowledge-aware Collaborative Filtering with Pre-trained Language Model for Personalized Review-based Rating Prediction

1 code implementation2 Aug 2023 Quanxiu Wang, Xinlei Cao, Jianyong Wang, Wei zhang

For the first issue, to utilize rich knowledge, KCF-PLM develops a transformer network to model the interactions of the extracted aspects w.r.t.

Collaborative Filtering Language Modelling

EmbeddingTree: Hierarchical Exploration of Entity Features in Embedding

no code implementations2 Aug 2023 Yan Zheng, Junpeng Wang, Chin-Chia Michael Yeh, Yujie Fan, Huiyuan Chen, Liang Wang, Wei zhang

The tool helps users discover nuanced features of data entities, perform feature denoising/injecting in embedding training, and generate embeddings for unseen entities.

Denoising

Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

1 code implementation1 Aug 2023 Zhangchi Zhu, Lu Wang, Pu Zhao, Chao Du, Wei zhang, Hang Dong, Bo Qiao, QIngwei Lin, Saravan Rajmohan, Dongmei Zhang

To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first.

Learning and Evaluating Human Preferences for Conversational Head Generation

no code implementations20 Jul 2023 Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao, Tao Mei

In this paper, we propose a novel learning-based evaluation metric named Preference Score (PS) for fitting human preference according to the quantitative evaluations across different dimensions.

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

3 code implementations CVPR 2023 Jiacheng Zhang, Xiangru Lin, Wei zhang, Kuo Wang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

Specifically, we propose a Stage-wise Hybrid Matching strategy that combines the one-to-many assignment and one-to-one assignment strategies to improve the training efficiency of the first stage and thus provide high-quality pseudo labels for the training of the second stage.

Object object-detection +3

Visual Analytics For Machine Learning: A Data Perspective Survey

no code implementations15 Jul 2023 Junpeng Wang, Shixia Liu, Wei zhang

The past decade has witnessed a plethora of works that leverage the power of visualization (VIS) to interpret machine learning (ML) models.

Contrastive Graph Pooling for Explainable Classification of Brain Networks

1 code implementation7 Jul 2023 Jiaxing Xu, Qingtian Bian, Xinhang Li, Aihu Zhang, Yiping Ke, Miao Qiao, Wei zhang, Wei Khang Jeremy Sim, Balázs Gulyás

Our contributions underscore the potential of ContrastPool for advancing the understanding of brain networks and neurodegenerative conditions.

Classification

CityTrack: Improving City-Scale Multi-Camera Multi-Target Tracking by Location-Aware Tracking and Box-Grained Matching

no code implementations6 Jul 2023 Jincheng Lu, Xipeng Yang, Jin Ye, Yifu Zhang, Zhikang Zou, Wei zhang, Xiao Tan

Targets in urban traffic scenes often undergo occlusion, illumination changes, and perspective changes, making it difficult to associate targets across different cameras accurately.

Interactive Conversational Head Generation

no code implementations5 Jul 2023 Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao

Based on ViCo and ViCo-X, we define three novel tasks targeting the interaction modeling during the face-to-face conversation: 1) responsive listening head generation making listeners respond actively to the speaker with non-verbal signals, 2) expressive talking head generation guiding speakers to be aware of listeners' behaviors, and 3) conversational head generation to integrate the talking/listening ability in one interlocutor.

Sentence Talking Head Generation

Understanding recent deep-learning techniques for identifying collective variables of molecular dynamics

no code implementations1 Jul 2023 Wei zhang, Christof Schütte

High-dimensional metastable molecular systems can often be characterised by a few features of the system, i.e. collective variables (CVs).

Deep Equilibrium Multimodal Fusion

no code implementations29 Jun 2023 Jinhong Ni, Yalong Bai, Wei zhang, Ting Yao, Tao Mei

Multimodal fusion integrates the complementary information present in multiple modalities and has gained much attention recently.

Visual Question Answering (VQA)

A Theory of Complex Adaptive Learning Behavior in Complex Adaptive Systems and a Non-Localized Wave Equation in Quantum Mechanics

no code implementations27 Jun 2023 Leilei Shi, Xinshuai Guo, Jiuchang Wei, Wei zhang, Guocheng Wang, Bing-Hong Wang

Keywords: complex adaptive systems, complex adaptive learning, universal law, non-localized wave equation, interactively coherent entanglement, interactively coherent adaptation PACS: 89.75.-k (Complex Systems); 89.65.Gh (Economics, Econophysics, Financial Markets, Business and Management); 03.65.Ud (Entanglement and Quantum Nonlocality)

A Collaborative Transfer Learning Framework for Cross-domain Recommendation

no code implementations26 Jun 2023 Wei zhang, Pengye Zhang, Bo Zhang, Xingxing Wang, Dong Wang

The disadvantage of the former is that the data from other domains is not utilized by a single domain model, while the latter leverages all the data from different domains, but the fine-tuned model of transfer learning may trap the model in a local optimum of the source domain, making it difficult to fit the target domain.

Click-Through Rate Prediction Recommendation Systems +1

FlowFace++: Explicit Semantic Flow-supervised End-to-End Face Swapping

no code implementations22 Jun 2023 Yu Zhang, Hao Zeng, Bowen Ma, Wei zhang, Zhimeng Zhang, Yu Ding, Tangjie Lv, Changjie Fan

The discriminator is shape-aware and relies on a semantic flow-guided operation to explicitly calculate the shape discrepancies between the target and source faces, thus optimizing the face swapping network to generate highly realistic results.

Face Swapping

Visual-Aware Text-to-Speech

no code implementations21 Jun 2023 Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao, Tao Mei

Dynamically synthesizing talking speech that actively responds to a listening head is critical during the face-to-face interaction.

Speech Synthesis

CMLM-CSE: Based on Conditional MLM Contrastive Learning for Sentence Embeddings

no code implementations16 Jun 2023 Wei zhang, Xu Chen

Traditional comparative learning sentence embedding directly uses the encoder to extract sentence features, and then passes in the comparative loss function for learning.

Contrastive Learning Language Modelling +3

PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators

1 code implementation15 Jun 2023 Runmin Cong, Wenyu Yang, Wei zhang, Chongyi Li, Chun-Le Guo, Qingming Huang, Sam Kwong

Among existing UIE methods, Generative Adversarial Networks (GANs) based methods perform well in visual aesthetics, while the physical model-based methods have better scene adaptability.

Quantization UIE

ScrollTimes: Tracing the Provenance of Paintings as a Window into History

no code implementations15 Jun 2023 Wei zhang, Wong Kam-Kwai, Yitian Chen, Ailing Jia, Luwei Wang, Jian-Wei Zhang, Lechao Cheng, Huamin Qu, Wei Chen

The study of cultural artifact provenance, tracing ownership and preservation, holds significant importance in archaeology and art history.

Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation

1 code implementation14 Jun 2023 Xiao He, Chang Tang, Xinwang Liu, Wei zhang, Kun Sun, Jiangfeng Xu

S2ADet comprises a hyperspectral information decoupling (HID) module, a two-stream feature extraction network, and a one-stage detection head.

Object object-detection +1

Approximate Maximum-Likelihood RIS-Aided Positioning

no code implementations13 Jun 2023 Wei zhang, Zhenni Wang, Wee Peng Tay

In this paper, we develop a RIS-aided positioning framework to locate a UE in environments where the LOS path may or may not be available.

E2E-LOAD: End-to-End Long-form Online Action Detection

1 code implementation ICCV 2023 Shuqiang Cao, Weixin Luo, Bairui Wang, Wei zhang, Lin Ma

Furthermore, we propose a novel and efficient inference mechanism that accelerates heavy spatial-temporal exploration.

Online Action Detection

NFTVis: Visual Analysis of NFT Performance

no code implementations5 Jun 2023 Fan Yan, Xumeng Wang, Ketian Mao, Wei zhang, Wei Chen

A non-fungible token (NFT) is a data unit stored on the blockchain.

Time Series

PDT: Pretrained Dual Transformers for Time-aware Bipartite Graphs

no code implementations2 Jun 2023 Xin Dai, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Chin-Chia Michael Yeh, Junpeng Wang, Liang Wang, Yan Zheng, Prince Osei Aboagye, Wei zhang

Pre-training on large models is prevalent and emerging with the ever-growing user-generated content in many machine learning application categories.

Contrastive Learning

Can LLMs like GPT-4 outperform traditional AI tools in dementia diagnosis? Maybe, but not today

no code implementations2 Jun 2023 Zhuo Wang, Rongzhen Li, Bowen Dong, Jie Wang, Xiuxing Li, Ning Liu, Chenhui Mao, Wei zhang, Liling Dong, Jing Gao, Jianyong Wang

In this paper, we explore the potential of LLMs such as GPT-4 to outperform traditional AI tools in dementia diagnosis.

Hybrid Driven Learning for Channel Estimation in Intelligent Reflecting Surface Aided Millimeter Wave Communications

no code implementations30 May 2023 Shuntian Zheng, Sheng Wu, Chunxiao Jiang, Wei zhang, Xiaojun Jing

Intelligent reflecting surfaces (IRS) have been proposed in millimeter wave (mmWave) and terahertz (THz) systems to achieve both coverage and capacity enhancement, where the design of hybrid precoders, combiners, and the IRS typically relies on channel state information.

Denoising

MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition

1 code implementation ICCV 2023 Tianlun Zheng, Zhineng Chen, Bingchen Huang, Wei zhang, Yu-Gang Jiang

In this paper, we propose the Incremental MLTR (IMLTR) task in the context of incremental learning (IL), where different languages are introduced in batches.

Continual Learning Incremental Learning +2

MolXPT: Wrapping Molecules with Text for Generative Pre-training

no code implementations18 May 2023 Zequn Liu, Wei zhang, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Ming Zhang, Tie-Yan Liu

Considering that text is the most important record for scientific discovery, in this paper, we propose MolXPT, a unified language model of text and molecules pre-trained on SMILES (a sequence representation of molecules) wrapped by text.

Language Modelling Molecular Property Prediction +3

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

3 code implementations28 Apr 2023 Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao

This strategy effectively alleviates the interference between the two tasks of image-text alignment and instruction following and achieves strong multi-modal reasoning with only a small-scale image-text and instruction dataset.

Instruction Following Optical Character Recognition (OCR) +7

SACANet: scene-aware class attention network for semantic segmentation of remote sensing images

1 code implementation22 Apr 2023 Xiaowen Ma, Rui Che, Tingfeng Hong, Mengting Ma, Ziyan Zhao, Tian Feng, Wei zhang

In this paper, we integrate both scene-aware and class attentions to propose a scene-aware class attention network (SACANet) for semantic segmentation of remote sensing images.

Semantic Segmentation

STNet: Spatial and Temporal feature fusion network for change detection in remote sensing images

no code implementations22 Apr 2023 Xiaowen Ma, Jiawei Yang, Tingfeng Hong, Mengting Ma, Ziyan Zhao, Tian Feng, Wei zhang

As an important task in remote sensing image analysis, remote sensing change detection (RSCD) aims to identify changes of interest in a region from spatially co-registered multi-temporal remote sensing images, so as to monitor the local development.

Binary Classification Change Detection

Network Pruning Spaces

no code implementations19 Apr 2023 Xuanyu He, Yu-I Yang, Ran Song, Jiachen Pu, Conggang Hu, Feijun Jiang, Wei zhang, Huanghao Ding

Statistically, the structure of a winning subnetwork guarantees an approximately optimal ratio in this regime.

Network Pruning

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment

no code implementations CVPR 2023 Lewei Yao, Jianhua Han, Xiaodan Liang, Dan Xu, Wei zhang, Zhenguo Li, Hang Xu

This paper presents DetCLIPv2, an efficient and scalable training framework that incorporates large-scale image-text pairs to achieve open-vocabulary object detection (OVD).

Language Modelling object-detection +1

RSPT: Reconstruct Surroundings and Predict Trajectories for Generalizable Active Object Tracking

no code implementations7 Apr 2023 Fangwei Zhong, Xiao Bi, Yudi Zhang, Wei zhang, Yizhou Wang

However, building a generalizable active tracker that works robustly across different scenarios remains a challenge, especially in unstructured environments with cluttered obstacles and diverse layouts.

Autonomous Driving Object Tracking

HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation

1 code implementation CVPR 2023 Linfang Zheng, Chen Wang, Yinghan Sun, Esha Dasgupta, Hua Chen, Ales Leonardis, Wei zhang, Hyung Jin Chang

In this paper, we focus on the problem of category-level object pose estimation, which is challenging due to the large intra-category shape variation.

Pose Estimation Translation

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box

no code implementations27 Mar 2023 Yifu Zhang, Xinggang Wang, Xiaoqing Ye, Wei zhang, Jincheng Lu, Xiao Tan, Errui Ding, Peize Sun, Jingdong Wang

We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes, which alleviates the problems of object missing and fragmented trajectories.

3D Multi-Object Tracking motion prediction +1

Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection

1 code implementation CVPR 2023 Chang Liu, Weiming Zhang, Xiangru Lin, Wei zhang, Xiao Tan, Junyu Han, Xiaomao Li, Errui Ding, Jingdong Wang

It employs a "divide-and-conquer" strategy and separately exploits positives for the classification and localization task, which is more robust to the assignment ambiguity.

Dense Object Detection Object +3

How Does Attention Work in Vision Transformers? A Visual Analytics Attempt

no code implementations24 Mar 2023 Yiran Li, Junpeng Wang, Xin Dai, Liang Wang, Chin-Chia Michael Yeh, Yan Zheng, Wei zhang, Kwan-Liu Ma

Multi-head self-attentions are then applied to the sequence to learn the attention between patches.

Multi-modal Facial Affective Analysis based on Masked Autoencoder

no code implementations20 Mar 2023 Wei zhang, Bowen Ma, Feng Qiu, Yu Ding

The CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW) is dedicated to providing high-quality and large-scale Aff-wild2 for the recognition of commonly used emotion representations, such as Action Units (AU), basic expression categories (EXPR), and Valence-Arousal (VA).

SpatialFormer: Semantic and Target Aware Attentions for Few-Shot Learning

1 code implementation15 Mar 2023 Jinxiang Lai, Siqian Yang, Wenlong Wu, Tao Wu, Guannan Jiang, Xi Wang, Jun Liu, Bin-Bin Gao, Wei zhang, Yuan Xie, Chengjie Wang

Then we derive two specific attention modules, named SpatialFormer Semantic Attention (SFSA) and SpatialFormer Target Attention (SFTA), to enhance the target object regions while reducing the background distraction.

Few-Shot Learning

LoG-CAN: local-global Class-aware Network for semantic segmentation of remote sensing images

1 code implementation14 Mar 2023 Xiaowen Ma, Mengting Ma, Chenlu Hu, Zhiyuan Song, Ziyan Zhao, Tian Feng, Wei zhang

We present LoG-CAN, a multi-scale semantic segmentation network with a global class-aware (GCA) module and local class-aware (LCA) modules for remote sensing images.

Segmentation Semantic Segmentation

CapDet: Unifying Dense Captioning and Open-World Detection Pretraining

no code implementations CVPR 2023 Yanxin Long, Youpeng Wen, Jianhua Han, Hang Xu, Pengzhen Ren, Wei zhang, Shen Zhao, Xiaodan Liang

Besides, our CapDet also achieves state-of-the-art performance on dense captioning tasks, e.g., 15.44% mAP on VG V1.2 and 13.98% on the VG-COCO dataset.

Dense Captioning

Joint Task and Data Oriented Semantic Communications: A Deep Separate Source-channel Coding Scheme

no code implementations27 Feb 2023 Jianhao Huang, Dongxu Li, Chuan Huang, Xiaoqi Qin, Wei zhang

This paper proposes a deep separate source-channel coding (DSSCC) framework for the joint task and data oriented semantic communications (JTD-SC) and utilizes the variational autoencoder approach to solve the rate-distortion problem with semantic distortion.

Bayesian Inference Data Compression

Entity-Level Text-Guided Image Manipulation

1 code implementation22 Feb 2023 Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation

Simulation-to-reality UAV Fault Diagnosis with Deep Learning

no code implementations9 Feb 2023 Wei zhang, Junjie Tong, Fang Liao, Yunfeng Zhang

Accurate diagnosis of propeller faults is crucial for ensuring the safe and efficient operation of quadrotors.

Domain Adaptation

RIS-Position and Orientation Estimation in MIMO-OFDM Systems with Practical Scatterers

no code implementations9 Feb 2023 Sheng Hong, Minghui Li, Cunhua Pan, Marco Di Renzo, Wei zhang, Lajos Hanzo

A two-step positioning scheme is exploited, where the channel parameters are first acquired, and the position-related parameters are then estimated.

Position

Language-Driven Anchors for Zero-Shot Adversarial Robustness

1 code implementation30 Jan 2023 Xiao Li, Wei zhang, Yining Liu, Zhanhao Hu, Bo Zhang, Xiaolin Hu

Previous research mainly focuses on improving adversarial robustness in the fully supervised setting, leaving the challenging domain of zero-shot adversarial robustness an open question.

Adversarial Defense Adversarial Robustness +3

Deep-learning-based on-chip rapid spectral imaging with high spatial resolution

no code implementations16 Jan 2023 Jiawei Yang, Kaiyu Cui, Yidong Huang, Wei zhang, Xue Feng, Fang Liu

Spectral imaging extends the concept of traditional color cameras to capture images across multiple spectral channels and has broad application prospects.

Autonomous Driving Metamerism +1

EPR-Net: Constructing non-equilibrium potential landscape via a variational force projection formulation

1 code implementation5 Jan 2023 Yue Zhao, Wei zhang, Tiejun Li

We present EPR-Net, a novel and effective deep learning approach that tackles a crucial challenge in biophysics: constructing potential landscapes for high-dimensional non-equilibrium steady-state (NESS) systems.

Dimensionality Reduction

Machine Learning for Large-Scale Optimization in 6G Wireless Networks

no code implementations3 Jan 2023 Yandong Shi, Lixiang Lian, Yuanming Shi, Zixin Wang, Yong Zhou, Liqun Fu, Lin Bai, Jun Zhang, Wei zhang

The sixth generation (6G) wireless systems are envisioned to enable the paradigm shift from "connected things" to "connected intelligence", featured by ultra high density, large-scale, dynamic heterogeneity, diversified functional requirements and machine learning capabilities, which leads to a growing need for highly efficient intelligent algorithms.

Computational Efficiency Distributed Optimization +2

CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision

no code implementations ICCV 2023 Shuo Li, Yue He, Weiming Zhang, Wei zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang

Current state-of-the-art semi-supervised semantic segmentation (SSSS) methods typically adopt pseudo labeling and consistency regularization between multiple learners with different perturbations.

Semi-Supervised Semantic Segmentation

Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach

no code implementations ICCV 2023 Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei zhang, Li Zhang

The extraction of road network is essential for the generation of high-definition maps since it enables the precise localization of road landmarks and their interconnections.

Data-free Knowledge Distillation for Fine-grained Visual Categorization

1 code implementation ICCV 2023 Renrong Shao, Wei zhang, Jianhua Yin, Jun Wang

Our approach utilizes an adversarial distillation framework with attention generator, mixed high-order attention distillation, and semantic feature contrast learning.

Data-free Knowledge Distillation Fine-Grained Visual Categorization +1

WaterMask: Instance Segmentation for Underwater Imagery

1 code implementation ICCV 2023 Shijie Lian, Hua Li, Runmin Cong, Suqi Li, Wei zhang, Sam Kwong

Underwater image instance segmentation is a fundamental and critical step in underwater image analysis and understanding.

2D Object Detection Graph Attention +3

A Deep Learning Method for Real-time Bias Correction of Wind Field Forecasts in the Western North Pacific

no code implementations29 Dec 2022 Wei zhang, Yueyue Jiang, Junyu Dong, Xiaojiang Song, Renbo Pang, Boyu Guoan, Hui Yu

In this study, we developed the Multi-Task-Double Encoder Trajectory Gated Recurrent Unit (MT-DETrajGRU) model, which uses an improved double-encoder forecaster architecture to model the spatiotemporal sequence of the U and V components of the wind field; we designed a multi-task learning loss function to correct wind speed and wind direction simultaneously using only one model.

Multi-Task Learning

Circular Accessible Depth: A Robust Traversability Representation for UGV Navigation

no code implementations28 Dec 2022 Shikuan Xie, Ran Song, Yuenan Zhao, Xueqin Huang, Yibin Li, Wei zhang

In this paper, we present the Circular Accessible Depth (CAD), a robust traversability representation for an unmanned ground vehicle (UGV) to learn traversability in various scenarios containing irregular obstacles.

Semantic optical fiber communication system

no code implementations27 Dec 2022 Zhenming Yu, Hongyu Huang, Liming Cheng, Wei zhang, Yueqiu Mu, Kun Xu

The current optical communication systems minimize bit or symbol errors without considering the semantic meaning behind digital bits, thus transmitting a lot of unnecessary information.

Differentiating Student Feedbacks for Knowledge Tracing

no code implementations16 Dec 2022 Jiajun Cui, Wei zhang

In computer-aided education and intelligent tutoring systems, knowledge tracing (KT) raises attention due to the development of data-driven learning methods, which aims to predict students' future performance given their past question response sequences to trace their knowledge states.

Knowledge Tracing

Adaptive Low-Precision Training for Embeddings in Click-Through Rate Prediction

no code implementations12 Dec 2022 Shiwei Li, Huifeng Guo, Lu Hou, Wei zhang, Xing Tang, Ruiming Tang, Rui Zhang, Ruixuan Li

To this end, we formulate a novel quantization training paradigm to compress the embeddings from the training stage, termed low-precision training (LPT).

Click-Through Rate Prediction Quantization

Low-rank Tensor Assisted K-space Generative Model for Parallel Imaging Reconstruction

no code implementations11 Dec 2022 Wei zhang, Zengwei Xiao, Hui Tao, Minghui Zhang, Xiaoling Xu, Qiegen Liu

Although recent deep learning methods, especially generative models, have shown good performance in fast magnetic resonance imaging, there is still much room for improvement in high-dimensional generation.

G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

1 code implementation7 Dec 2022 Zhongwei Wan, Yichun Yin, Wei zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu

Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora.

General Knowledge Language Modelling +3

FlowFace: Semantic Flow-guided Shape-aware Face Swapping

no code implementations6 Dec 2022 Hao Zeng, Wei zhang, Changjie Fan, Tangjie Lv, Suzhen Wang, Zhimeng Zhang, Bowen Ma, Lincheng Li, Yu Ding, Xin Yu

Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping.

Face Swapping

Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces

no code implementations AMTA 2022 Prince O Aboagye, Yan Zheng, Michael Yeh, Junpeng Wang, Zhongfang Zhuang, Huiyuan Chen, Liang Wang, Wei zhang, Jeff Phillips

Optimal Transport (OT) provides a useful geometric framework to estimate the permutation matrix under unsupervised cross-lingual word embedding (CLWE) models that pose the alignment task as a Wasserstein-Procrustes problem.

Bilingual Lexicon Induction Quantization

3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation

no code implementations2 Dec 2022 Zutao Jiang, Guansong Lu, Xiaodan Liang, Jihua Zhu, Wei zhang, Xiaojun Chang, Hang Xu

Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a text-to-views generation module and a views-to-3D generation module.

3D Generation Contrastive Learning +2

Global Meets Local: Effective Multi-Label Image Classification via Category-Aware Weak Supervision

no code implementations23 Nov 2022 Jiawei Zhan, Jun Liu, Wei Tang, Guannan Jiang, Xi Wang, Bin-Bin Gao, Tianliang Zhang, Wenlong Wu, Wei zhang, Chengjie Wang, Yuan Xie

This paper builds a unified framework to perform effective noisy-proposal suppression and to interact between global and local features for robust feature learning.

Feature Correlation Multi-Label Image Classification

RIS-Assisted Self-Interference Mitigation for In-Band Full-Duplex Transceivers

no code implementations22 Nov 2022 Wei zhang, Yi Jiang, Bin Zhou

The wireless in-band full-duplex (IBFD) technology can in theory double the system capacity over the conventional frequency division duplex (FDD) or time-division duplex (TDD) alternatives.

Quantization

Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions

no code implementations5 Nov 2022 Wei zhang, Yanjun Han, Zhengyuan Zhou, Aaron Flores, Tsachy Weissman

In the past four years, a particularly important development in the digital advertising industry is the shift from second-price auctions to first-price auctions for online display ads.

Marketing

Rethinking the Metric in Few-shot Learning: From an Adaptive Multi-Distance Perspective

no code implementations 2 Nov 2022 Jinxiang Lai, Siqian Yang, Guannan Jiang, Xi Wang, Yuxi Li, Zihui Jia, Xiaochen Chen, Jun Liu, Bin-Bin Gao, Wei zhang, Yuan Xie, Chengjie Wang

In this paper, for the first time, we investigate the contributions of different distance metrics, and propose an adaptive fusion scheme, bringing significant improvements in few-shot classification.

Few-Shot Learning

Facial Action Unit Detection and Intensity Estimation from Self-supervised Representation

no code implementations 28 Oct 2022 Bowen Ma, Rudong An, Wei zhang, Yu Ding, Zeng Zhao, Rongsheng Zhang, Tangjie Lv, Changjie Fan, Zhipeng Hu

As a fine-grained and local expression behavior measurement, facial action unit (FAU) analysis (e.g., detection and intensity estimation) has been documented for its time-consuming, labor-intensive, and error-prone annotation.

Action Unit Detection Facial Action Unit Detection

Global-to-local Expression-aware Embeddings for Facial Action Unit Detection

no code implementations 27 Oct 2022 Rudong An, Wei zhang, Hao Zeng, Wei Chen, Zhigang Deng, Yu Ding

Then, AU feature maps and their corresponding AU masks are multiplied to generate AU masked features focusing on local facial regions.

Action Unit Detection Facial Action Unit Detection

Facial Action Units Detection Aided by Global-Local Expression Embedding

no code implementations 25 Oct 2022 Zhipeng Hu, Wei zhang, Lincheng Li, Yu Ding, Wei Chen, Zhigang Deng, Xin Yu

We find that AUs and facial expressions are highly associated, and existing facial expression datasets often contain a large number of identities.

3D Face Reconstruction

Learning Point-Language Hierarchical Alignment for 3D Visual Grounding

1 code implementation 22 Oct 2022 Jiaming Chen, Weixin Luo, Ran Song, Xiaolin Wei, Lin Ma, Wei zhang

This paper presents a novel hierarchical alignment model (HAM) that learns multi-granularity visual and linguistic representations in an end-to-end manner.

Sentence Visual Grounding +1

Slippage-robust Gaze Tracking for Near-eye Display

no code implementations 20 Oct 2022 Wei zhang, Jiaxi Cao, Xiang Wang, Enqi Tian, Bin Li

In recent years, head-mounted near-eye display devices have become the key hardware foundation for virtual reality and augmented reality.

ISTA-Inspired Network for Image Super-Resolution

no code implementations 14 Oct 2022 Yuqing Liu, Wei zhang, Weifeng Sun, Zhikai Yu, Jianfeng Wei, Shengquan Li

Inspired by the mathematical analysis, the ISTA block is developed to conduct the optimization in an end-to-end manner.

Image Super-Resolution

Repainting and Imitating Learning for Lane Detection

no code implementations 11 Oct 2022 Yue He, Minyue Jiang, Xiaoqing Ye, Liang Du, Zhikang Zou, Wei zhang, Xiao Tan, Errui Ding

In this paper, we aim to find an enhanced feature space where the lane features are distinctive while maintaining a similar distribution of lanes in the wild.

Lane Detection

SoccerNet 2022 Challenges Results

7 code implementations 5 Oct 2022 Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Collaboration of Pre-trained Models Makes Better Few-shot Learner

no code implementations 25 Sep 2022 Renrui Zhang, Bohao Li, Wei zhang, Hao Dong, Hongsheng Li, Peng Gao, Yu Qiao

In this paper, we propose CoMo, a Collaboration of pre-trained Models that incorporates diverse prior knowledge from various pre-training paradigms for better few-shot learning.

Few-Shot Learning Representation Learning

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection

no code implementations 20 Sep 2022 Lewei Yao, Jianhua Han, Youpeng Wen, Xiaodan Liang, Dan Xu, Wei zhang, Zhenguo Li, Chunjing Xu, Hang Xu

We further design a concept dictionary (with descriptions) from various online sources and detection datasets to provide prior knowledge for each concept.

object-detection Open World Object Detection

Provably Uncertainty-Guided Universal Domain Adaptation

no code implementations 19 Sep 2022 Yifan Wang, Lin Zhang, Ran Song, Paul L. Rosin, Yibin Li, Wei zhang

It fully utilizes the relationship between a target sample and its neighbors in the source domain to avoid the influence of domain misalignment.

Universal Domain Adaptation Unsupervised Domain Adaptation

SENDER: SEmi-Nonlinear Deep Efficient Reconstructor for Extraction Canonical, Meta, and Sub Functional Connectivity in the Human Brain

no code implementations 12 Sep 2022 Wei zhang, Yu Bao

Deep Linear and Nonlinear learning methods have already been vital machine learning methods for investigating the hierarchical features such as functional connectivity in the human brain via functional Magnetic Resonance signals; however, there are three major shortcomings: 1).

A detail-enhanced sampling strategy in Hadamard single-pixel imaging

no code implementations 9 Sep 2022 Yan Cai, Shijian Li, Wei zhang, Hao Wu, Xu-Ri Yao, Qing Zhao

Hadamard single-pixel imaging (HSI) is an appealing imaging technique due to its features of low hardware complexity and industrial cost.

Image Reconstruction

Dual Representation Learning for One-Step Clustering of Multi-View Data

1 code implementation 30 Aug 2022 Wei zhang, Zhaohong Deng, Kup-Sze Choi, Jun Wang, Shitong Wang

Meanwhile, to make the representation learning more specific to the clustering task, a one-step learning framework is proposed to integrate representation learning and clustering partition as a whole.

Clustering Representation Learning

Robustness to Unbounded Smoothness of Generalized SignSGD

no code implementations 23 Aug 2022 Michael Crawshaw, Mingrui Liu, Francesco Orabona, Wei zhang, Zhenxun Zhuang

We also compare these algorithms with popular optimizers on a set of deep learning tasks, observing that we can match the performance of Adam while beating the others.
