no code implementations • ICCV 2023 • Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He
In this paper, we show that recent advances in video representation learning and pre-trained vision-language models allow for substantial improvements in self-supervised video object localization.
1 code implementation • ICCV 2023 • Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tianjun Xiao
Unsupervised object-centric learning methods allow the partitioning of scenes into entities without additional localization information and are excellent candidates for reducing the annotation burden of multiple-object tracking (MOT) pipelines.
1 code implementation • 11 Jul 2023 • Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
Medical visual question answering (VQA) is a challenging task that requires answering clinical questions about a given medical image by taking both visual and language information into consideration.
Ranked #1 on Medical Visual Question Answering on PathVQA
no code implementations • 12 Mar 2023 • Yi Wang, Jiaze Wang, Jinpeng Li, Zixu Zhao, Guangyong Chen, Anfeng Liu, Pheng-Ann Heng
With Point-MAE as our baseline, our model surpasses previous methods by a significant margin, achieving 86.3% accuracy on ScanObjectNN and 94.1% accuracy on ModelNet40.
no code implementations • 20 Jul 2022 • Yang Yu, Zixu Zhao, Yueming Jin, Guangyong Chen, Qi Dou, Pheng-Ann Heng
Concretely, for trustworthy representation learning, we propose to incorporate pseudo labels to guide the pair selection, obtaining more reliable representation pairs for pixel contrast.
1 code implementation • 29 Mar 2022 • Yueming Jin, Yang Yu, Cheng Chen, Zixu Zhao, Pheng-Ann Heng, Danail Stoyanov
Automatic surgical scene segmentation is fundamental for facilitating cognitive intelligence in the modern operating theatre.
no code implementations • 17 Feb 2022 • Zixu Zhao, Yueming Jin, Pheng-Ann Heng
Specifically, we introduce the prior query, which is encoded with previous temporal knowledge, to transfer tracking signals to current instances via identity matching.
no code implementations • ICCV 2021 • Zixu Zhao, Yueming Jin, Pheng-Ann Heng
This paper presents a self-supervised method for learning reliable visual correspondence from unlabeled videos.
1 code implementation • 30 Mar 2021 • Yueming Jin, Yonghao Long, Cheng Chen, Zixu Zhao, Qi Dou, Pheng-Ann Heng
In this paper, we propose a novel end-to-end temporal memory relation network (TMRNet) for relating long-range and multi-scale temporal patterns to augment the present features.
no code implementations • 24 Mar 2021 • Zixu Zhao, Yueming Jin, Bo Lu, Chi-Fai Ng, Qi Dou, Yun-hui Liu, Pheng-Ann Heng
To greatly increase the label efficiency, we explore a new problem, i.e., adaptive instrument segmentation, which is to effectively adapt one source model to new robotic surgical videos from multiple target domains, given only the annotated instruments in the first frame.
no code implementations • 18 Mar 2021 • Xiaojie Gao, Yueming Jin, Zixu Zhao, Qi Dou, Pheng-Ann Heng
Predicting future frames for robotic surgical video is an interesting, important yet extremely challenging problem, given that the operative tasks may have complex dynamics.
1 code implementation • 6 Jul 2020 • Zixu Zhao, Yueming Jin, Xiaojie Gao, Qi Dou, Pheng-Ann Heng
Considering the fast instrument motion, we further introduce a flow compensator to estimate intermediate motion within continuous frames, with a novel cycle learning strategy.
no code implementations • 3 May 2019 • Zixu Zhao, Huangjing Lin, Hao Chen, Pheng-Ann Heng
Automatic detection of cancer metastasis from whole slide images (WSIs) is a crucial step for subsequent patient staging and prognosis.
no code implementations • 24 Apr 2018 • Zixu Zhao
By comparing the segmentation results and their corresponding FI values, this novel method produces a machine-vision-based index that best fits the FI.
no code implementations • 23 Apr 2018 • Fouad Amer, Zixu Zhao, Siwei Tang, Wilfredo Torres
By matching the ORB feature of the tags with their corresponding features in the scene, it is then possible to localize the position of these tags both in point clouds constructed by ORB-SLAM2 and OpenSfM.
no code implementations • 22 Jan 2018 • Qianye Yang, Nannan Li, Zixu Zhao, Xingyu Fan, Eric I-Chao Chang, Yan Xu
Based on our proposed framework, we first propose a method for cross-modality registration that fuses the deformation fields to exploit the cross-modality information from the translated modalities.