Search Results for author: Dong Shen

Found 9 papers, 4 papers with code

ContentCTR: Frame-level Live Streaming Click-Through Rate Prediction with Multimodal Transformer

no code implementations • 26 Jun 2023 • Jiaxin Deng, Dong Shen, Shiyao Wang, Xiangyu Wu, Fan Yang, Guorui Zhou, Gaofeng Meng

However, most previous works treat the live stream as a whole item and explore the Click-Through Rate (CTR) prediction framework at the item level, neglecting the dynamic changes that occur even within the same live room.

Click-Through Rate Prediction, Dynamic Time Warping, +1

Generation-Guided Multi-Level Unified Network for Video Grounding

no code implementations • 14 Mar 2023 • Xing Cheng, Xiangyu Wu, Dong Shen, Hezheng Lin, Fan Yang

Video grounding aims to locate the timestamps best matching the query description within an untrimmed video.

Video Grounding

A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset

no code implementations • 19 Nov 2022 • Jiaxin Deng, Dong Shen, Haojie Pan, Xiangyu Wu, Ximan Liu, Gaofeng Meng, Fan Yang, Size Li, Ruiji Fu, Zhongyuan Wang

Furthermore, based on this dataset, we propose an end-to-end model that jointly optimizes the video understanding objective with knowledge graph embedding, which can not only better inject factual knowledge into video understanding but also generate effective multi-modal entity embeddings for the knowledge graph (KG).

Common Sense Reasoning, Knowledge Graph Embedding, +4

Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

2 code implementations • 9 Sep 2021 • Xing Cheng, Hezheng Lin, Xiangyu Wu, Fan Yang, Dong Shen

In this paper, we propose a multi-stream Corpus Alignment network with single gate Mixture-of-Experts (CAMoE) and a novel Dual Softmax Loss (DSL) to address the two kinds of heterogeneity.

Ranked #9 on Video Retrieval on MSVD (using extra training data)

Retrieval, Text Retrieval, +1

CAT: Cross Attention in Vision Transformer

1 code implementation • 10 Jun 2021 • Hezheng Lin, Xing Cheng, Xiangyu Wu, Fan Yang, Dong Shen, Zhongyuan Wang, Qing Song, Wei Yuan

In this paper, we propose a new attention mechanism in Transformer termed Cross Attention, which alternates between attention within each image patch, rather than over the whole image, to capture local information, and attention between image patches divided from single-channel feature maps to capture global information.

ES-Net: Erasing Salient Parts to Learn More in Re-Identification

no code implementations • 10 Mar 2021 • Dong Shen, Shuai Zhao, Jinming Hu, Hao Feng, Deng Cai, Xiaofei He

In this paper, we propose a novel network, Erasing-Salient Net (ES-Net), to learn comprehensive features by erasing the salient areas in an image.

Complementary Pseudo Labels For Unsupervised Domain Adaptation On Person Re-identification

no code implementations • 29 Jan 2021 • Hao Feng, Minghao Chen, Jinming Hu, Dong Shen, Haifeng Liu, Deng Cai

In this paper, to complement these low-recall neighbor pseudo labels, we propose a joint learning framework to learn better feature embeddings via high-precision neighbor pseudo labels and high-recall group pseudo labels.

Person Re-Identification, Unsupervised Domain Adaptation

Progressive Transfer Learning

1 code implementation • 7 Aug 2019 • Zhengxu Yu, Dong Shen, Zhongming Jin, Jianqiang Huang, Deng Cai, Xian-Sheng Hua

Model fine-tuning is a widely used transfer learning approach in person re-identification (ReID) applications, which fine-tunes a pre-trained feature extraction model for the target scenario instead of training a model from scratch.

Image Classification, Person Re-Identification, +1
