no code implementations • 21 Feb 2023 • Zeyu Xiong, Daizong Liu, Pan Zhou, Jiahao Zhu
Temporal sentence grounding (TSG) aims to localize the temporal segment that is semantically aligned with a natural language query in an untrimmed video. Most existing methods extract frame-grained or object-grained features with a 3D ConvNet or a detection network under a conventional TSG framework, and thus fail to capture the subtle differences between frames or to model the spatio-temporal behavior of core persons/objects.
no code implementations • 2 Jan 2023 • Jiahao Zhu, Daizong Liu, Pan Zhou, Xing Di, Yu Cheng, Song Yang, Wenzheng Xu, Zichuan Xu, Yao Wan, Lichao Sun, Zeyu Xiong
All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning.
no code implementations • 18 May 2022 • Jiahao Zhu, Huajun Zhou, Zixuan Chen, Yi Zhou, Xiaohua Xie
3D deep models that consume point clouds have achieved strong results in computer vision applications.
no code implementations • 19 May 2021 • Hua Zheng, Jiahao Zhu, Wei Xie, Judy Zhong
We developed a machine learning algorithm, based on deep reinforcement learning (RL), for continuous management of the oxygen flow rate for critically ill patients under intensive care. The algorithm can identify the optimal personalized oxygen flow rate, with strong potential to reduce the mortality rate relative to current clinical practice.