no code implementations • 4 Feb 2024 • Xinyuan Chen, Liangyuan Hu, Fan Li
To enhance the traditional parametric g-formula approach, we developed a more adaptable Bayesian g-formula estimator.
1 code implementation • 17 Jan 2024 • Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang
More importantly, Vlogger can generate over 5-minute vlogs from open-world descriptions, without loss of video coherence on script and actor.
2 code implementations • 5 Jan 2024 • Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Ziwei Liu, Yuan-Fang Li, Cunjian Chen, Yu Qiao
We propose a novel Latent Diffusion Transformer, namely Latte, for video generation.
1 code implementation • 19 Dec 2023 • Lingjun Zhang, Xinyuan Chen, Yaohui Wang, Yue Lu, Yu Qiao
To tackle this problem, we propose Diff-Text, which is a training-free scene text generation framework for any language.
no code implementations • 11 Dec 2023 • Zehuan Huang, Hao Wen, Junting Dong, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai, Lu Sheng
Generating multiview images from a single view facilitates the rapid generation of a 3D mesh conditioned on a single image.
1 code implementation • 29 Nov 2023 • Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, LiMin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
We will open-source VBench, including all prompts, evaluation methods, generated videos, and human preference annotations, and also include more video generation models in VBench to drive forward the field of video generation.
1 code implementation • 23 Nov 2023 • YuFei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot, Bihan Wen
Extensive experiments conducted on synthetic and real-world datasets demonstrate that the proposed method can achieve comparable or even superior performance compared to both previous SOTA methods and the teacher model, in just one sampling step, resulting in a remarkable up to x10 speedup for inference.
no code implementations • 31 Oct 2023 • Xinyuan Chen, Yaohui Wang, Lingjun Zhang, Shaobin Zhuang, Xin Ma, Jiashuo Yu, Yali Wang, Dahua Lin, Yu Qiao, Ziwei Liu
The goal is to generate high-quality long videos with smooth and creative transitions between scenes and varying lengths of shot-level videos.
1 code implementation • 11 Oct 2023 • Bo Peng, Xinyuan Chen, Yaohui Wang, Chaochao Lu, Yu Qiao
In this work, we introduce ConditionVideo, a training-free approach to text-to-video generation based on the provided condition, video, and input text, by leveraging the power of off-the-shelf text-to-image generation methods (e. g., Stable Diffusion).
2 code implementations • 26 Sep 2023 • Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu
To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model.
Ranked #4 on Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset (using extra training data)
no code implementations • 20 Sep 2023 • Zizhang Wu, Xinyuan Chen, Fan Song, Yuanzhu Gan, Tianhao Xu, Jian Pu, Rui Tang
In this paper, wepresent the Parking Pedestrian Dataset (PPD), a large-scale fisheye dataset to support research dealing with real-world pedestrians, especially with occlusions and diverse postures.
1 code implementation • 13 Jul 2023 • Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Conghui He, Ping Luo, Ziwei Liu, Yali Wang, LiMin Wang, Yu Qiao
Specifically, we utilize a multi-scale approach to generate video-related descriptions.
no code implementations • 25 Jun 2023 • Yangchen Xie, Xinyuan Chen, Hongjian Zhan, Palaiahankote Shivakum, Bing Yin, Cong Liu, Yue Lu
A large number of annotated training images is crucial for training successful scene text recognition models.
5 code implementations • 6 May 2023 • Yaohui Wang, Xin Ma, Xinyuan Chen, Antitza Dantcheva, Bo Dai, Yu Qiao
Our key idea is to represent motion as a sequence of flow maps in the generation process, which inherently isolate motion from appearance.
1 code implementation • 2 May 2023 • Jiashuo Yu, Yaohui Wang, Xinyuan Chen, Xiao Sun, Yu Qiao
To this end, we present Long-Term Rhythmic Video Soundtracker (LORIS), a novel framework to synthesize long-term conditional waveforms.
no code implementations • 24 Apr 2023 • Zeyu Lu, Chengyue Wu, Xinyuan Chen, Yaohui Wang, Lei Bai, Yu Qiao, Xihui Liu
To mitigate those limitations, we propose Hierarchical Diffusion Autoencoders (HDAE) that exploit the fine-grained-to-abstract and lowlevel-to-high-level feature hierarchy for the latent space of diffusion models.
1 code implementation • 30 Dec 2022 • Xinyuan Chen, Yangchen Xie, Li Sun, Yue Lu
Moreover, we introduce contrastive self-supervised learning to learn a robust style representation for fonts by understanding the similarity and dissimilarities of fonts.
1 code implementation • 12 Dec 2022 • Haibin He, Xinyuan Chen, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao, Yu Qiao
Specifically, a large stroke-wise dataset is constructed, and a stroke-wise diffusion model is proposed to preserve the structure and the completion of each generated character.
no code implementations • 8 Dec 2022 • Zizhang Wu, Xinyuan Chen, Jizheng Wang, Xiaoquan Wang, Yuanzhu Gan, Muqing Fang, Tianhao Xu
Obtaining the position of ego-vehicle is a crucial prerequisite for automatic control and path planning in the field of autonomous driving.
1 code implementation • 1 Aug 2022 • Xinyue Zhou, Mingyu Yin, Xinyuan Chen, Li Sun, Changxin Gao, Qingli Li
In this paper, we propose a cross attention based style distribution module that computes between the source semantic styles and target pose for pose transfer.
1 code implementation • CVPR 2021 • Yangchen Xie, Xinyuan Chen, Li Sun, Yue Lu
Font generation is a challenging problem especially for some writing systems that consist of a large number of characters and has attracted a lot of attention in recent years.
2 code implementations • 4 Apr 2019 • Xinyuan Chen, Chang Xu, Xiaokang Yang, Li Song, DaCheng Tao
We propose adversarial gated networks (Gated GAN) to transfer multiple styles in a single model.
no code implementations • ECCV 2018 • Xinyuan Chen, Chang Xu, Xiaokang Yang, DaCheng Tao
This paper studies the object transfiguration problem in wild images.
no code implementations • 5 May 2017 • Minne Li, Zhaoning Zhang, Hao Yu, Xinyuan Chen, Dongsheng Li
S-OHEM exploits OHEM with stratified sampling, a widely-adopted sampling technique, to choose the training examples according to this influence during hard example mining, and thus enhance the performance of object detectors.