Search Results for author: Yike Zhang

Found 11 papers, 1 paper with code

Monocular Microscope to CT Registration using Pose Estimation of the Incus for Augmented Reality Cochlear Implant Surgery

no code implementations • 12 Mar 2024 • Yike Zhang, Eduardo Davalos, Dingjie Su, Ange Lou, Jack H. Noble

For those experiencing severe-to-profound sensorineural hearing loss, the cochlear implant (CI) is the preferred treatment.

Anatomy, Computed Tomography (CT), +1

SAMSNeRF: Segment Anything Model (SAM) Guides Dynamic Surgical Scene Reconstruction by Neural Radiance Field (NeRF)

no code implementations • 22 Aug 2023 • Ange Lou, Yamin Li, Xing Yao, Yike Zhang, Jack Noble

The accurate reconstruction of surgical scenes from surgical videos is critical for various applications, including intraoperative navigation and image-guided robotic surgery automation.

Depth Estimation, Position

Self-supervised Registration and Segmentation of the Ossicles with A Single Ground Truth Label

no code implementations • 15 Feb 2023 • Yike Zhang, Jack Noble

AI-assisted surgeries have drawn the attention of the medical image research community due to their real-world impact on improving surgery success rates.

Image Segmentation, Segmentation, +1

Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer

no code implementations • 17 Jan 2023 • Zhanheng Yang, Sining Sun, Xiong Wang, Yike Zhang, Long Ma, Lei Xie

In this paper, we propose an efficient approach to obtain a high quality contextual list for a unified streaming/non-streaming based E2E model.

Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR

no code implementations • 3 Jul 2022 • Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma

Then, during the training of the conversational ASR system, the extractor will be frozen to extract the textual representation of preceding speech, while such representation is used as context fed to the ASR decoder through attention mechanism.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1

A practical framework for multi-domain speech recognition and an instance sampling method to neural language modeling

no code implementations • 9 Mar 2022 • Yike Zhang, Xiaobing Feng, Yi Liu, Songjun Cao, Long Ma

Automatic speech recognition (ASR) systems used on smart phones or vehicles are usually required to process speech queries from very different domains.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4

Improving CTC-based speech recognition via knowledge transferring from pre-trained language models

1 code implementation • 22 Feb 2022 • Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang

Recently, end-to-end automatic speech recognition models based on connectionist temporal classification (CTC) have achieved impressive results, especially when fine-tuned from wav2vec2.0 models.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +3

Conversational Speech Recognition By Learning Conversation-level Characteristics

no code implementations • 16 Feb 2022 • Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma

Conversational automatic speech recognition (ASR) is a task to recognize conversational speech including multiple speakers.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model

no code implementations • 14 Dec 2021 • Keqi Deng, Songjun Cao, Yike Zhang, Long Ma

In our framework, the encoder is initialized with a pretrained AM (wav2vec2.0).

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Improving Streaming Transformer Based ASR Under a Framework of Self-supervised Learning

no code implementations • 15 Sep 2021 • Songjun Cao, Yueteng Kang, Yanzhe Fu, Xiaoshuo Xu, Sining Sun, Yike Zhang, Long Ma

Under such a framework, the neural network is usually pre-trained with massive unlabeled data and then fine-tuned with limited labeled data.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Improving Speech Recognition Accuracy of Local POI Using Geographical Models

no code implementations • 7 Jul 2021 • Songjun Cao, Yike Zhang, Xiaobing Feng, Long Ma

Secondly, a group of geo-specific language models (Geo-LMs) are integrated into our speech recognition system to improve recognition accuracy of long tail and homophone POI.

speech-recognition, Speech Recognition
