no code implementations • 23 Apr 2024 • Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma
Existing solutions for 3D semantic occupancy prediction typically treat the task as a one-shot 3D voxel-wise segmentation perception problem.
no code implementations • 15 Apr 2024 • Pin Tang, Zhongdao Wang, Guoqing Wang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma
Vision-based perception for autonomous driving requires an explicit modeling of a 3D space, where 2D latent representations are mapped and subsequent 3D operators are applied.
no code implementations • 13 Mar 2024 • Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang
Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision.
no code implementations • 7 Mar 2024 • Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li
In this paper, we introduce PixArt-Σ, a Diffusion Transformer (DiT) model capable of directly generating images at 4K resolution.
no code implementations • 28 Jan 2024 • Zhenyu Wang, Enze Xie, Aoxue Li, Zhongdao Wang, Xihui Liu, Zhenguo Li
Given a complex text prompt containing multiple concepts including objects, attributes, and relationships, the LLM agent initially decomposes it, which entails the extraction of individual objects, their associated attributes, and the prediction of a coherent scene layout.
2 code implementations • 30 Sep 2023 • Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li
We hope PixArt-α will provide new insights to the AIGC community and startups to accelerate building their own high-quality yet low-cost generative models from scratch.
1 code implementation • ICCV 2023 • Zhaopeng Dou, Zhongdao Wang, YaLi Li, Shengjin Wang
To overcome the barriers of data and annotation, we propose to utilize large-scale unsupervised data for training.
Generalizable Person Re-identification • Representation Learning
1 code implementation • 19 Apr 2023 • Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo
These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.
no code implementations • ICCV 2023 • Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo
These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.
no code implementations • 7 Nov 2022 • Zhongdao Wang, Zhaopeng Dou, Jingwei Zhang, Liang Zheng, Yifan Sun, YaLi Li, Shengjin Wang
In this paper, we are interested in learning a generalizable person re-identification (re-ID) representation from unlabeled videos.
Domain Generalization • Generalizable Person Re-identification
1 code implementation • 24 Oct 2022 • Zhaopeng Dou, Zhongdao Wang, Weihua Chen, YaLi Li, Shengjin Wang
(3) the data uncertainty and the model uncertainty are jointly learned in a unified network, and they serve as two fundamental criteria for the reliability assessment: if a probe is high-quality (low data uncertainty) and the model is confident in the prediction of the probe (low model uncertainty), the final ranking will be assessed as reliable.
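The reliability criterion described above can be sketched as a simple two-condition check. This is only an illustration of the stated rule, not the paper's implementation: the function name, the scalar uncertainty inputs, and both threshold values are hypothetical, and how the uncertainties are actually estimated is not covered here.

```python
def ranking_is_reliable(data_uncertainty, model_uncertainty,
                        data_threshold=0.5, model_threshold=0.5):
    """Return True when both uncertainties are low.

    Assumption: uncertainties are non-negative scalars and "low" means
    "below a threshold". The thresholds are hypothetical placeholders.
    """
    probe_is_high_quality = data_uncertainty < data_threshold    # low data uncertainty
    model_is_confident = model_uncertainty < model_threshold     # low model uncertainty
    return probe_is_high_quality and model_is_confident


# A clean probe with a confident prediction is judged reliable;
# a noisy probe is not, even if the model is confident.
clean_probe = ranking_is_reliable(0.1, 0.2)
noisy_probe = ranking_is_reliable(0.9, 0.2)
```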
1 code implementation • 20 Oct 2022 • Xin Liu, Zhongdao Wang, YaLi Li, Shengjin Wang
To cope with this issue, we propose Maximum Entropy Coding (MEC), a more principled objective that explicitly optimizes on the structure of the representation, so that the learned representation is less biased and thus generalizes better to unseen downstream tasks.
no code implementations • 14 Dec 2021 • Yunzhong Hou, Zhongdao Wang, Shengjin Wang, Liang Zheng
In this paper, we design experiments to verify such misfit between global re-ID feature distances and local matching in tracking, and propose a simple yet effective approach to adapt affinity estimations to corresponding matching scopes in MTMCT.
1 code implementation • 3 Dec 2021 • Yuchi Liu, Zhongdao Wang, Tom Gedeon, Liang Zheng
To this end, we develop a protocol to automatically synthesize large scale MiE training data that allow us to train improved recognition models for real-world test data.
1 code implementation • NeurIPS 2021 • Zhongdao Wang, Hengshuang Zhao, Ya-Li Li, Shengjin Wang, Philip H. S. Torr, Luca Bertinetto
We show how most tracking tasks can be solved within this framework, and that the same appearance model can be successfully used to obtain results that are competitive against specialised methods for most of the tasks considered.
Ranked #2 on Video Object Segmentation on DAVIS 2017 (mIoU metric)
Multi-Object Tracking • Multi-Object Tracking and Segmentation
no code implementations • 30 Jun 2021 • Yuchi Liu, Zhongdao Wang, Xiangxin Zhou, Liang Zheng
We show that compared with real data, association knowledge obtained from synthetic data can achieve very similar performance on real-world test sets without domain adaptation techniques.
no code implementations • ECCV 2020 • Zhongdao Wang, Jingwei Zhang, Liang Zheng, Yixuan Liu, Yifan Sun, Ya-Li Li, Shengjin Wang
This paper proposes a self-supervised learning method for the person re-identification (re-ID) problem, where existing unsupervised methods usually rely on pseudo labels, such as those from video tracklets or clustering.
12 code implementations • CVPR 2020 • Yifan Sun, Changmao Cheng, Yuhan Zhang, Chi Zhang, Liang Zheng, Zhongdao Wang, Yichen Wei
This paper provides a pair similarity optimization viewpoint on deep feature learning, aiming to maximize the within-class similarity $s_p$ and minimize the between-class similarity $s_n$.
Ranked #1 on Face Verification on IJB-C (training dataset metric)
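To make the pair similarity viewpoint concrete, the sketch below shows one generic way to write an objective that rewards high within-class similarity $s_p$ and low between-class similarity $s_n$. It is a plain soft-margin form chosen for illustration, not necessarily the loss the paper proposes; the function name and the `gamma` and `margin` parameters are assumptions.

```python
import math


def pair_similarity_loss(s_p, s_n, gamma=1.0, margin=0.0):
    """Illustrative objective: log(1 + sum_i sum_j exp(gamma * (s_n[j] - s_p[i] + margin))).

    s_p: within-class (positive) similarity scores
    s_n: between-class (negative) similarity scores
    The loss shrinks as every s_p grows and every s_n shrinks.
    gamma and margin are hypothetical hyperparameters.
    """
    total = sum(
        math.exp(gamma * (sn - sp + margin))
        for sp in s_p
        for sn in s_n
    )
    return math.log1p(total)


# Well-separated similarities give a lower loss than poorly separated ones.
well_separated = pair_similarity_loss([0.9], [0.1])
poorly_separated = pair_similarity_loss([0.1], [0.9])
```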
1 code implementation • 27 Nov 2019 • Yunzhong Hou, Liang Zheng, Zhongdao Wang, Shengjin Wang
Due to the continuity of target trajectories, tracking systems usually restrict their data association within a local neighborhood.
12 code implementations • ECCV 2020 • Zhongdao Wang, Liang Zheng, Yixuan Liu, Ya-Li Li, Shengjin Wang
In this paper, we propose an MOT system that allows target detection and appearance embedding to be learned in a shared model.
Ranked #4 on Multi-Object Tracking on HiEve
no code implementations • 4 Aug 2019 • Lanqing He, Zhongdao Wang, Ya-Li Li, Shengjin Wang
The softmax loss and its variants are widely used as objectives for embedding learning, especially in applications like face recognition.
4 code implementations • CVPR 2019 • Zhongdao Wang, Liang Zheng, Ya-Li Li, Shengjin Wang
The key idea is that we find the local context in the feature space around an instance (face) contains rich information about the linkage relationship between this instance and its neighbors.
no code implementations • 31 Oct 2018 • Zhongdao Wang, Liang Zheng, Shengjin Wang
That is to say, for some queries, a feature may be neither discriminative nor complementary to existing ones, while for other queries, the feature suffices.
no code implementations • ICCV 2017 • Zhongdao Wang, Luming Tang, Xihui Liu, Zhuliang Yao, Shuai Yi, Jing Shao, Junjie Yan, Shengjin Wang, Hongsheng Li, Xiaogang Wang
In our vehicle ReID framework, an orientation invariant feature embedding module and a spatial-temporal regularization module are proposed.