no code implementations • 9 Apr 2024 • Aimin Tang, Xudong Wang, J. Andrew Zhang
In this article, we provide an overview of full-duplex ISAC (FD-ISAC), where a full-duplex radio is used for both wireless sensing and full-duplex communications in B5G/6G networks, with a focus on the fundamental interference management problem in such networks.
1 code implementation • 5 Feb 2024 • Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla, Rohit Girdhar, Ishan Misra
Text-to-image diffusion models produce high quality images but do not offer control over individual instances in the image.
Ranked #3 on Conditional Text-to-Image Synthesis on COCO-MIG
1 code implementation • 25 Jan 2024 • Letian Fu, Long Lian, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala, Trevor Darrell, Alexei A. Efros, Ken Goldberg
In this work, we re-examine inter-patch dependencies in the decoding mechanism of masked autoencoders (MAE).
1 code implementation • 28 Dec 2023 • Dantong Niu, Xudong Wang, Xinyang Han, Long Lian, Roei Herzig, Trevor Darrell
Several unsupervised image segmentation approaches have been proposed which eliminate the need for dense manually-annotated segmentation masks; current models separately handle either semantic segmentation (e.g., STEGO) or class-agnostic instance segmentation (e.g., CutLER), but not both (i.e., panoptic segmentation).
Ranked #1 on Unsupervised Panoptic Segmentation on COCO val2017
no code implementations • 24 Dec 2023 • Chun-Hsiao Yeh, Xudong Wang, Stella X. Yu, Charles Hill, Zackery Steck, Scott Kangas, Aaron Reite
Deep learning has had remarkable success at analyzing handheld imagery such as consumer photos due to the availability of large-scale human annotations (e.g., ImageNet).
no code implementations • 13 Dec 2023 • Tsung-Han Wu, Giscard Biamby, David Chan, Lisa Dunlap, Ritwik Gupta, Xudong Wang, Joseph E. Gonzalez, Trevor Darrell
Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary language grounding and segmentation but can suffer under false premises when queries imply the existence of something that is not actually present in the image.
no code implementations • 15 Nov 2023 • Xudong Wang, Li Niu, Junyan Cao, Yan Hong, Liqing Zhang
In this work, we employ adversarial learning to bridge the domain gap between the foreground and background feature maps.
no code implementations • 8 Oct 2023 • Zhifeng Hu, Chong Han, Xudong Wang
Furthermore, a DRL-based resource allocation algorithm is developed to realize long-term RE maximization and fast recovery from broken links.
1 code implementation • 28 Aug 2023 • Xudong Wang, Ishan Misra, Ziyun Zeng, Rohit Girdhar, Trevor Darrell
Existing approaches to unsupervised video instance segmentation typically rely on motion estimates and experience difficulties tracking small or divergent motions.
1 code implementation • NeurIPS 2023 • Xudong Wang, Shufan Li, Konstantinos Kallidromitis, Yusuke Kato, Kazuki Kozuka, Trevor Darrell
Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions.
Ranked #1 on Image Segmentation on Pascal Panoptic Parts
no code implementations • 18 Jun 2023 • Guangbu Liu, Tong Zhang, Xudong Wang, Wenting Zhao, Chuanwei Zhou, Zhen Cui
Instead of plainly using a base graph dictionary, we propose variational graph dictionary adaptation (VGDA), which generates a personalized dictionary (named the adapted graph dictionary) tailored to each input graph.
1 code implementation • ICCV 2023 • Li Lyna Zhang, Xudong Wang, Jiahang Xu, Quanlu Zhang, Yujing Wang, Yuqing Yang, Ningxin Zheng, Ting Cao, Mao Yang
The combination of Neural Architecture Search (NAS) and quantization has proven successful in automatically designing low-FLOPs INT8 quantized neural networks (QNN).
2 code implementations • CVPR 2023 • Xudong Wang, Rohit Girdhar, Stella X. Yu, Ishan Misra
We propose Cut-and-LEaRn (CutLER), a simple approach for training unsupervised object detection and segmentation models.
Ranked #1 on Unsupervised Instance Segmentation on UVO
no code implementations • 14 Sep 2022 • Zesong Qiu, Yuwei Li, Dongming He, Qixuan Zhang, Longwen Zhang, Yinghao Zhang, Jingya Wang, Lan Xu, Xudong Wang, Yuyao Zhang, Jingyi Yu
Named after the fossils of one of the oldest known human ancestors, our LUCY dataset contains high-quality Computed Tomography (CT) scans of the complete human head before and after orthognathic surgeries, critical for evaluating surgery results.
no code implementations • 10 Aug 2022 • Kaitao Song, Teng Wan, Bixia Wang, Huiqiang Jiang, Luna Qiu, Jiahang Xu, Liping Jiang, Qun Lou, Yuqing Yang, Dongsheng Li, Xudong Wang, Lili Qiu
Specifically, we first pre-train an encoder-decoder framework with an automatic speech recognition (ASR) objective using a speech-to-text dataset, and then fine-tune the ASR encoder on the cleft palate dataset for hypernasality estimation.
Automatic Speech Recognition (ASR) +2
1 code implementation • CVPR 2022 • Tsung-Wei Ke, Jyh-Jing Hwang, Yunhui Guo, Xudong Wang, Stella X. Yu
We enforce spatial consistency of grouping and bootstrap feature learning with co-segmentation among multiple views of the same image, and enforce semantic consistency across the grouping hierarchy with clustering transformers between coarse- and fine-grained features.
1 code implementation • CVPR 2022 • Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu
Our key insight is that pseudo-labels are naturally imbalanced due to intrinsic data similarity, even when a model is trained on balanced source data and evaluated on balanced target data.
Ranked #1 on Few-Shot Image Classification on ImageNet - 0-Shot (using extra training data)
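The insight above — that pseudo-labels skew even when the data are balanced — can be illustrated with a toy nearest-prototype example. The setup below is purely illustrative (made-up Gaussians and prototypes), not the paper's method:

```python
import numpy as np

# Two perfectly balanced classes that differ only in intrinsic spread.
rng = np.random.default_rng(0)
tight = rng.normal(-2.0, 0.5, 1000)   # class 0: compact cluster
loose = rng.normal(+2.0, 3.0, 1000)   # class 1: diffuse cluster
x = np.concatenate([tight, loose])    # ground truth is 1000 / 1000

# Pseudo-label each point by its nearest class prototype.
protos = np.array([-2.0, 2.0])
pseudo = np.abs(x[:, None] - protos[None]).argmin(axis=1)
counts = np.bincount(pseudo, minlength=2)
print(counts)  # class 0 absorbs strays from the diffuse class
```

Despite balanced ground truth, the compact class collects many strays from the diffuse one, so the pseudo-label distribution comes out imbalanced.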
no code implementations • 8 Oct 2021 • Xudong Wang, Luis Miranda-Moreno, Lijun Sun
We treat the raw data with anomalies as a multivariate time series matrix (location $\times$ time) and assume the denoised matrix has a low-rank structure.
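The low-rank assumption above can be sketched generically with truncated SVD — this is a minimal stand-in for low-rank denoising, not the paper's exact algorithm, and all sizes here are made up:

```python
import numpy as np

# A rank-r "clean" traffic matrix (location x time) plus sparse anomalies.
rng = np.random.default_rng(0)
n_loc, n_time, r = 30, 200, 3
clean = rng.standard_normal((n_loc, r)) @ rng.standard_normal((r, n_time))

corrupted = clean.copy()
idx = rng.choice(clean.size, size=30, replace=False)   # sparse anomaly positions
corrupted.flat[idx] += rng.standard_normal(30) * 5     # large spikes

# Denoise by keeping only the top-r singular components.
U, s, Vt = np.linalg.svd(corrupted, full_matrices=False)
denoised = (U[:, :r] * s[:r]) @ Vt[:r]

err_before = np.linalg.norm(corrupted - clean)
err_after = np.linalg.norm(denoised - clean)
```

Because the anomalies are sparse and the signal subspace dominates, truncating to rank r removes most of the anomaly energy.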
1 code implementation • 6 Oct 2021 • Xudong Wang, Long Lian, Stella X. Yu
Intuitively, no matter what the downstream task is, instances to be labeled must be representative and diverse: The former would facilitate label propagation to unlabeled data, whereas the latter would ensure coverage of the entire dataset.
Active Learning Semi-Supervised Image Classification (Cold Start)
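The representative-and-diverse intuition above can be sketched with a generic clustering heuristic: label the point nearest each cluster center, so picks are representative (near centers) and diverse (one per cluster). This is a hedged illustration, not the paper's algorithm; the function name and data are made up:

```python
import numpy as np

def select_to_label(feats, k, iters=10):
    # Farthest-point init spreads the starting centers (diversity).
    centers = [feats[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(feats - c, axis=1) for c in centers], axis=0)
        centers.append(feats[d.argmax()])
    centers = np.array(centers, dtype=float)
    # A few k-means steps pull centers to dense regions (representativeness).
    for _ in range(iters):
        assign = np.linalg.norm(feats[:, None] - centers[None], axis=-1).argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = feats[assign == j].mean(0)
    # Label the point closest to each final center.
    d = np.linalg.norm(feats[:, None] - centers[None], axis=-1)
    return sorted(set(d.argmin(axis=0)))

rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(loc, 0.1, (50, 2))
                   for loc in ([0, 0], [5, 5], [0, 5])])
picked = select_to_label(feats, k=3)
```

With three well-separated clusters, the three picks land in three different clusters, covering the dataset with a tiny labeling budget.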
no code implementations • 29 Sep 2021 • Xiaochen Zhou, Yuchuan Tian, Xudong Wang
Moreover, to prevent the compact model from forgetting the source-data knowledge during distillation, a collaborative knowledge distillation (Co-KD) method is developed that unifies the source data on the server and the target data on the edge device to train the compact model.
1 code implementation • CVPR 2022 • Yunhui Guo, Xudong Wang, Yubei Chen, Stella X. Yu
Hyperbolic space can naturally embed hierarchies, unlike Euclidean space.
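A small numeric illustration of that claim, using the standard Poincaré-ball distance (the points are made up): hyperbolic distance blows up near the boundary, which is where a hierarchy's exponentially growing leaf set can be placed.

```python
import numpy as np

def poincare_dist(u, v):
    # Standard distance in the Poincare ball model of hyperbolic space.
    u, v = np.asarray(u, float), np.asarray(v, float)
    sq = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq / denom)

root = [0.0, 0.0]   # near the origin: coarse level of a hierarchy
child = [0.5, 0.0]  # intermediate level
leaf = [0.9, 0.0]   # near the boundary: fine level

# The child->leaf step is Euclidean-shorter (0.4 vs 0.5) yet
# hyperbolically longer, leaving exponentially more room near the boundary.
print(poincare_dist(root, child), poincare_dist(child, leaf))
```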
no code implementations • 7 Jun 2021 • Yu Zhang, Guoming Tang, Qianyi Huang, Yi Wang, Xudong Wang, Jiadong Lou
Non-intrusive load monitoring (NILM) disaggregates a household's total electricity consumption into the energy usage of individual appliances, greatly cutting the cost of fine-grained household load monitoring.
1 code implementation • 21 May 2021 • Xudong Wang, Yuankai Wu, Dingyi Zhuang, Lijun Sun
This paper studies the traffic state estimation (TSE) problem using sparse observations from mobile sensors.
no code implementations • CVPR 2021 • Xudong Wang, Long Lian, Stella X. Yu
Existing methods focus on training an RL policy that is universal to changing visual domains, whereas we focus on extracting visual foreground that is universal, feeding clean invariant vision to the RL policy learner.
no code implementations • 1 Jan 2021 • Xiaochen Zhou, Xudong Wang
Theoretical analysis shows: 1) DSPGD with CEM converges with an $O(1/T)$ rate, where $T$ is the number of iterations; 2) the communication cost of DSPGD with CEM is unrelated to the number of data samples; 3) the clustering loss of the federated kernel $k$-means can approach that of the centralized kernel $k$-means.
2 code implementations • ICLR 2021 • Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella X. Yu
We take a dynamic view of the training data and provide a principled model bias and variance analysis as the training data fluctuates: Existing long-tail classifiers invariably increase the model variance and the head-tail model bias gap remains large, due to more and larger confusion with hard negatives for the tail.
Ranked #22 on Long-tail Learning on iNaturalist 2018
1 code implementation • 25 Sep 2020 • Xudong Wang, Stella X. Yu
The concept of TBC can also be extended to group convolution and fully connected layers, and can be applied to various backbone networks and attention modules.
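The tied-block idea applied to a fully connected layer can be sketched as follows — the block count, shapes, and function name are illustrative assumptions, not the paper's code: the input features are split into B equal blocks and one small weight matrix is reused across all blocks, shrinking the parameter count.

```python
import numpy as np

def tied_block_fc(x, W, B):
    # Split features into B blocks and apply the SAME weight matrix to each.
    n, c = x.shape
    blocks = x.reshape(n, B, c // B)   # (n, B, c/B)
    out = blocks @ W                   # W is shared across all B blocks
    return out.reshape(n, -1)

x = np.random.default_rng(0).standard_normal((4, 8))
B = 2
W = np.random.default_rng(1).standard_normal((8 // B, 3))  # one shared weight
y = tied_block_fc(x, W, B)
print(y.shape)  # (4, 6): B blocks of 3 outputs each, computed with tied weights
```

An untied layer mapping the same 8 inputs to 6 outputs would need a full 8x6 weight; tying across the B blocks stores only a (c/B)x(out/B)-sized matrix.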
2 code implementations • CVPR 2021 • Xudong Wang, Ziwei Liu, Stella X. Yu
Unsupervised feature learning has made great strides with contrastive learning based on instance discrimination and invariant mapping, as benchmarked on curated class-balanced datasets.
Contrastive Learning Semi-Supervised Image Classification +2
no code implementations • 4 Apr 2020 • Xudong Wang, Shizhong Han, Yunqiang Chen, Dashan Gao, Nuno Vasconcelos
A volumetric attention (VA) module is proposed for 3D medical image segmentation and detection.
1 code implementation • 30 Jan 2020 • Liang Zhang, Xudong Wang, Hongsheng Li, Guangming Zhu, Peiyi Shen, Ping Li, Xiaoyuan Lu, Syed Afaq Ali Shah, Mohammed Bennamoun
To solve the problems mentioned above, we propose a novel graph self-adaptive pooling method with two objectives: (1) to construct a reasonable pooled graph topology, both the structure and the feature information of the graph are considered, adding veracity and objectivity to node selection; and (2) to make the pooled nodes carry sufficient graph information, node features are aggregated before the unimportant nodes are discarded, so the selected nodes retain information from their neighbors, enhancing the use of features from unselected nodes.
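The aggregate-before-discard step described above can be sketched generically — the scoring and aggregation operators below are deliberately simple stand-ins, not the paper's exact formulation:

```python
import numpy as np

def self_adaptive_pool(A, X, keep):
    # Aggregate each node's feature with its neighbors' BEFORE pooling,
    # so surviving nodes carry information from dropped neighbors.
    deg = A.sum(1, keepdims=True) + 1.0
    agg = (X + A @ X) / deg               # mean over node + neighbors
    score = np.linalg.norm(agg, axis=1)   # toy importance score
    idx = np.argsort(score)[-keep:]       # keep the top-k nodes
    return agg[idx], A[np.ix_(idx, idx)]

# A 4-node undirected graph and 2-d node features.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], float)
X = np.arange(8, dtype=float).reshape(4, 2)
Xp, Ap = self_adaptive_pool(A, X, keep=2)
print(Xp.shape, Ap.shape)  # (2, 2) (2, 2)
```

Because aggregation happens first, even features of nodes that are later discarded contribute to the pooled representation.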
1 code implementation • CVPR 2019 • Xudong Wang, Zhaowei Cai, Dashan Gao, Nuno Vasconcelos
Experiments on a newly established universal object detection benchmark of 11 diverse datasets show that the proposed detector outperforms a bank of individual detectors, a multi-domain detector, and a baseline universal detector, with only a 1.3x parameter increase over a single-domain baseline detector.
no code implementations • CVPR 2018 • Bo Liu, Xudong Wang, Mandar Dixit, Roland Kwitt, Nuno Vasconcelos
A new architecture, denoted the FeATure TransfEr Network (FATTEN), is proposed for the modeling of feature trajectories induced by variations of object pose.