no code implementations • 12 Mar 2024 • Jiuniu Wang, Zehua Du, Yuyuan Zhao, Bo Yuan, Kexiang Wang, Jian Liang, Yaxi Zhao, Yihen Lu, Gengliang Li, Junlong Gao, Xin Tu, Zhenyu Guo
In the Horizontal Layer, we introduce a novel RAG-based evolutionary system that optimizes both the overall video generation workflow and the individual steps within it.
1 code implementation • 3 Feb 2024 • Wenjia Xu, Jiuniu Wang, Zhiwei Wei, Mugen Peng, Yirong Wu
Moreover, pioneering ZSL models use convolutional neural networks pre-trained on ImageNet, which focus on the main objects appearing in each image, neglecting the background context that also matters in RS scene classification.
no code implementations • 12 Oct 2023 • Haoling Li, Jiuniu Wang, Zhiwei Wei, Wenjia Xu
Our GLVL network is a two-stage visual localization approach that combines a large-scale retrieval module, which finds regions similar to the UAV flight scene, with a fine-grained matching module that localizes the precise UAV coordinates, enabling real-time and accurate localization.
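The retrieve-then-match idea can be illustrated with a minimal sketch. This is not the paper's actual code: the descriptors, the cosine measure, and the match threshold are all illustrative stand-ins for the learned retrieval and matching modules.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def coarse_retrieve(query_desc, region_descs, top_k=2):
    """Stage 1: rank candidate map regions by global-descriptor similarity."""
    ranked = sorted(region_descs.items(),
                    key=lambda kv: cosine(query_desc, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

def fine_match(query_kps, region_kps, thresh=0.9):
    """Stage 2: count local keypoint correspondences (similarity proxy)."""
    return sum(1 for q in query_kps
               if any(cosine(q, r) > thresh for r in region_kps))

def localize(query_desc, query_kps, regions):
    """Pick the candidate region with the most fine-grained matches."""
    candidates = coarse_retrieve(query_desc,
                                 {n: r["desc"] for n, r in regions.items()})
    return max(candidates, key=lambda n: fine_match(query_kps, regions[n]["kps"]))
```

The two-stage split keeps the expensive local matching restricted to a handful of retrieved candidates, which is what makes real-time operation plausible.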
3 code implementations • 12 Aug 2023 • Jiuniu Wang, Hangjie Yuan, Dayou Chen, Yingya Zhang, Xiang Wang, Shiwei Zhang
This paper introduces ModelScopeT2V, a text-to-video synthesis model that evolves from a text-to-image synthesis model (i.e., Stable Diffusion).
Ranked #8 on Text-to-Video Generation on MSR-VTT
4 code implementations • NeurIPS 2023 • Xiang Wang, Hangjie Yuan, Shiwei Zhang, Dayou Chen, Jiuniu Wang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou
The pursuit of controllability as a higher standard of visual content creation has yielded remarkable progress in customizable image synthesis.
Ranked #5 on Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset (using extra training data)
no code implementations • 8 Aug 2022 • Youyuan Zhang, Jiuniu Wang, Hao Wu, Wenjia Xu
Image captioning models are usually trained according to human annotated ground-truth captions, which could generate accurate but generic captions.
no code implementations • 18 Jul 2022 • Wenjia Xu, Jiuniu Wang, Yirong Wu
In this paper, we propose a Multi-dimension Feature Learning Model (MDFL) using high-dimensional GBD data in conjunction with RS images for urban region function recognition.
no code implementations • 8 Apr 2022 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
First, we propose a distinctiveness metric -- between-set CIDEr (CIDErBtw) to evaluate the distinctiveness of a caption with respect to those of similar images.
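The between-set idea can be sketched as follows. Here a simple n-gram cosine overlap stands in for the full CIDEr computation (which uses TF-IDF-weighted n-grams up to 4); the function names and the proxy score are assumptions for illustration, not the paper's implementation. A caption that closely resembles the references of similar images gets a high between-set score, i.e., it is less distinctive.

```python
import math
from collections import Counter

def ngram_cosine(cap1, cap2, n=1):
    """Toy proxy for CIDEr: cosine similarity over n-gram count vectors."""
    def grams(text):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    g1, g2 = grams(cap1), grams(cap2)
    dot = sum(g1[g] * g2[g] for g in g1)
    norm = (math.sqrt(sum(v * v for v in g1.values())) *
            math.sqrt(sum(v * v for v in g2.values())))
    return dot / norm if norm else 0.0

def cider_btw(caption, similar_image_captions):
    """Between-set score: mean similarity of a caption to the reference
    captions of visually similar images. Lower means more distinctive."""
    scores = [ngram_cosine(caption, ref) for ref in similar_image_captions]
    return sum(scores) / len(scores)
```

A generic caption scores high against similar images' references, while a caption mentioning image-specific details scores low, which matches the intended use of the metric as a distinctiveness signal.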
no code implementations • 4 Apr 2022 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
While a visual-semantic embedding layer learns global features, local features are learned through an attribute prototype network that simultaneously regresses and decorrelates attributes from intermediate features.
Ranked #5 on GZSL Video Classification on ActivityNet-GZSL (main)
1 code implementation • CVPR 2022 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
Our model visually divides a set of images from seen classes into clusters of local image regions according to their visual similarity, and further imposes their class discrimination and semantic relatedness.
no code implementations • 20 Aug 2021 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
In particular, we propose a group-based memory attention (GMA) module, which stores object features that are unique among the image group (i.e., with low similarity to objects in other images).
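The uniqueness test behind this idea can be sketched in a few lines. This is a hypothetical simplification, not the GMA module itself: it keeps an object feature only when its best cosine match against objects from the other images in the group falls below a threshold, which is the "low similarity to objects in other images" criterion stated above.

```python
import math

def _cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def unique_object_features(own_feats, other_image_feats, threshold=0.8):
    """Keep features whose max similarity to any object appearing in the
    other images of the group is below the threshold (illustrative proxy
    for what a GMA-style memory would store)."""
    kept = []
    for f in own_feats:
        max_sim = max((_cosine(f, g)
                       for feats in other_image_feats for g in feats),
                      default=0.0)
        if max_sim < threshold:
            kept.append(f)
    return kept
```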
no code implementations • 29 Sep 2020 • Wenjia Xu, Jiuniu Wang, Yang Wang, Guangluan Xu, Wei Dai, Yirong Wu
We generate attribute-based textual explanations for the network and ground the attributes on the image to show visual explanations.
1 code implementation • 2 Sep 2020 • Jiuniu Wang, Wenjia Xu, Xingyu Fu, Guangluan Xu, Yirong Wu
Under such circumstances, how to make full use of the information captured by word embeddings requires more in-depth research.
1 code implementation • 2 Sep 2020 • Jiuniu Wang, Wenjia Xu, Xingyu Fu, Yang Wei, Li Jin, Ziyan Chen, Guangluan Xu, Yirong Wu
This model enhances the question answering system in the multi-document scenario from three aspects: model structure, optimization goal, and training method, corresponding to Multilayer Attention (MA), Cross Evidence (CE), and Adversarial Training (AT) respectively.
no code implementations • NeurIPS 2020 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
As an additional benefit, our model points to the visual evidence of the attributes in an image, e.g., for the CUB dataset, confirming the improved attribute localization ability of our image representation.
no code implementations • ECCV 2020 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
A wide range of image captioning models has been developed, achieving significant improvement based on popular metrics, such as BLEU, CIDEr, and SPICE.
no code implementations • 3 Sep 2018 • Jiuniu Wang, Xingyu Fu, Guangluan Xu, Yirong Wu, Ziyan Chen, Yang Wei, Li Jin
Meanwhile, we construct A3Net for the WebQA dataset.