Search Results for author: Huan Zhou

Found 11 papers, 2 papers with code

Unimodal and Crossmodal Refinement Network for Multimodal Sequence Fusion

no code implementations • EMNLP 2021 • Xiaobao Guo, Adams Kong, Huan Zhou, Xianfeng Wang, Min Wang

Specifically, to improve unimodal representations, a unimodal refinement module is designed to refine modality-specific learning by iteratively updating the distribution with transformer-based attention layers.

Representation Learning

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

no code implementations • 6 May 2024 • Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

Accents represent deviations from standard pronunciation norms, and the multi-task learning framework for simultaneous ASR and accent recognition (AR) has effectively addressed the multi-accent scenarios, making it a prominent solution.

Automatic Speech Recognition (ASR), +4 more

Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder

no code implementations • 8 Apr 2024 • He Wang, Pengcheng Guo, Xucheng Wan, Huan Zhou, Lei Xie

Automatic lip-reading (ALR) aims to automatically transcribe spoken content from a speaker's silent lip motion captured in video.

Lipreading, Lip Reading, +1 more

Exploiting Low-level Representations for Ultra-Fast Road Segmentation

1 code implementation • 4 Feb 2024 • Huan Zhou, Feng Xue, Yucong Li, Shi Gong, Yiqun Li, Yu Zhou

The spatial detail branch is first designed to extract low-level feature representations of the road using the first stage of ResNet-18.

Road Segmentation

X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion

no code implementations • 9 Mar 2023 • Kai Liu, Ziqing Du, Xucheng Wan, Huan Zhou

To mitigate the speaker confusion (SC) issue, we reformulate the training objective and propose two novel loss schemes that explore a reconstruction-improvement metric defined at the small-chunk level and leverage the distribution information associated with that metric.

Speech Extraction

Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings

no code implementations • 16 Jan 2023 • Kai Liu, Xucheng Wan, Ziqing Du, Huan Zhou

As a practical alternative to speech separation, target speaker extraction (TSE) aims to extract the speech of the desired speaker using an additional speaker cue extracted from that speaker.

Speaker Verification, Speech Separation, +1 more

CGI-Stereo: Accurate and Real-Time Stereo Matching via Context and Geometry Interaction

1 code implementation • 7 Jan 2023 • Gangwei Xu, Huan Zhou, Xin Yang

In this paper, we propose CGI-Stereo, a novel neural network architecture that can concurrently achieve real-time performance, competitive accuracy, and strong generalization ability.

Stereo Matching

Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

no code implementations • 24 Sep 2022 • Xucheng Wan, Kai Liu, Ziqing Du, Huan Zhou

To validate the effectiveness of our proposed model, extensive experiments are conducted on the DNS2020 dataset.

Speech Enhancement

Joint Speech Activity and Overlap Detection with Multi-Exit Architecture

no code implementations • 24 Sep 2022 • Ziqing Du, Kai Liu, Xucheng Wan, Huan Zhou

Overlapped speech detection (OSD) is critical for speech applications in multi-party conversation scenarios.

Action Detection, Activity Detection, +1 more

Breast Cancer Molecular Subtypes Prediction on Pathological Images with Discriminative Patch Selecting and Multi-Instance Learning

no code implementations • 15 Mar 2022 • Hong Liu, Wen-Dong Xu, Zi-Hao Shang, Xiang-Dong Wang, Hai-Yan Zhou, Ke-Wen Ma, Huan Zhou, Jia-Lin Qi, Jia-Rui Jiang, Li-Lan Tan, Hui-Min Zeng, Hui-Juan Cai, Kuan-Song Wang, Yue-Liang Qian

A weakly supervised learning framework based on discriminative patch selecting and multi-instance learning was proposed for breast cancer molecular subtype prediction from H&E WSIs.

Weakly-supervised Learning, Whole Slide Images

Container Orchestration on HPC Systems

no code implementations • 16 Dec 2020 • Naweiluo Zhou, Yiannis Georgiou, Li Zhong, Huan Zhou, Marcin Pospieszny

Containerisation demonstrates its efficiency in application deployment in cloud computing.

Distributed, Parallel, and Cluster Computing
