no code implementations • 24 Mar 2024 • Bolin Ni, Hongbo Zhao, Chenghao Zhang, Ke Hu, Gaofeng Meng, Zhaoxiang Zhang, Shiming Xiang
Existing methods commonly utilize one-hot labels and randomly initialize the classifier head.
no code implementations • 23 Jan 2024 • W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath
In the era of large models, the autoregressive nature of decoding often makes latency a significant serving bottleneck.
1 code implementation • 12 Dec 2023 • Ke Hu, Weidong Qiu, Peng Tang
Our comprehensive analysis reveals that FNR-FL not only accelerates convergence but also significantly surpasses other contemporary federated learning algorithms in test accuracy, particularly under feature distribution skew scenarios.
no code implementations • 11 Aug 2023 • Cal Peyser, Zhong Meng, Ke Hu, Rohit Prabhavalkar, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho
The last year has seen astonishing progress in text-prompted image generation premised on the idea of a cross-modal representation space in which the text and image domains are represented jointly.
no code implementations • 25 May 2023 • Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Francoise Beaufays
We evaluate the proposed model on a set of 12 languages, and achieve an average 11.9% relative improvement in WER over the baseline.
no code implementations • 23 Mar 2023 • Sepand Mavandadi, Tara N. Sainath, Ke Hu, Zelin Wu
We propose a new two-pass E2E speech recognition model that improves ASR performance by training on a combination of paired data and unpaired text data.
no code implementations • 2 Mar 2023 • Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu
We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages.
Automatic Speech Recognition (ASR) +3
no code implementations • 17 Feb 2023 • Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman
In this work, we propose to train a single multilingual language model (LM) for shallow fusion in multiple languages.
Automatic Speech Recognition (ASR) +2
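Shallow fusion, in its generic form, rescores ASR hypotheses by interpolating the ASR score with an external LM score in log space. The sketch below is a minimal illustration of that standard recipe, not the paper's multilingual implementation; the function names and the 0.3 weight are assumptions for the example.

```python
def shallow_fusion_score(asr_logprobs, lm_logprobs, lm_weight=0.3):
    """Fuse per-token ASR and LM log-probabilities for one hypothesis.

    Generic shallow fusion: score = log P_asr + lambda * log P_lm.
    """
    assert len(asr_logprobs) == len(lm_logprobs)
    return sum(a + lm_weight * l for a, l in zip(asr_logprobs, lm_logprobs))


def rerank(hypotheses, lm_weight=0.3):
    """Return the hypothesis text with the best fused score.

    `hypotheses` is a list of (text, asr_logprobs, lm_logprobs) tuples,
    e.g. the n-best list from a first-pass decoder.
    """
    return max(
        hypotheses,
        key=lambda h: shallow_fusion_score(h[1], h[2], lm_weight),
    )[0]
```

With `lm_weight=0`, the ranking reduces to the ASR model's own scores; a single multilingual LM, as proposed here, would supply `lm_logprobs` for every language from one model rather than one LM per language.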
no code implementations • 11 Oct 2022 • Ke Hu, Bo Li, Tara N. Sainath
In this work, we investigate second-pass deliberation for multilingual speech recognition.
Automatic Speech Recognition (ASR) +2
no code implementations • 29 Jun 2022 • Ke Hu, Tara N. Sainath, Yanzhang He, Rohit Prabhavalkar, Trevor Strohman, Sepand Mavandadi, Weiran Wang
Text-only and semi-supervised training based on audio-only data has gained popularity recently due to the wide availability of unlabeled text and speech data.
no code implementations • 5 May 2022 • Xin Chen, Qingtao Tang, Ke Hu, Yue Xu, Shihang Qiu, Jia Cheng, Jun Lei
At Meituan, one of the largest e-commerce platforms in China, an item is typically displayed with its image, and whether a user clicks the item is usually influenced by that image. This implies that users' image behaviors are helpful for understanding their visual preferences and improving the accuracy of CTR prediction.
no code implementations • 15 Apr 2022 • Weiran Wang, Ke Hu, Tara N. Sainath
We propose a streaming non-autoregressive (non-AR) decoding algorithm to deliberate the hypothesis alignment of a streaming RNN-T model.
no code implementations • 18 Jan 2022 • Ke Hu, Yi Qi, Jianqiang Huang, Jia Cheng, Jun Lei
To address this problem, we formulate CTR prediction as a continual learning task and propose COLF, a hybrid COntinual Learning Framework for CTR prediction. COLF has a memory-based modular architecture designed to adapt, learn, and give predictions continuously when faced with non-stationary, drifting click data streams.
1 code implementation • 25 Nov 2021 • Jin Xu, Mingjian Chen, Jianqiang Huang, Xingyuan Tang, Ke Hu, Jian Li, Jia Cheng, Jun Lei
Graph Neural Networks (GNNs) have become increasingly popular and achieved impressive results in many graph-based applications.
no code implementations • 6 Jul 2021 • Huaju Liang, Hongyang Bai, Ke Hu, Xinbo Lv
This paper proposes an artificial neural network to determine orientation using polarized skylight.
1 code implementation • 10 Jun 2021 • Jianqiang Huang, Ke Hu, Qingtao Tang, Mingjian Chen, Yi Qi, Jia Cheng, Jun Lei
Click-through rate (CTR) prediction plays an important role in online advertising and recommender systems.
no code implementations • 11 Mar 2021 • David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara N. Sainath, Ian McGraw
We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +2
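The core difficulty named in this abstract is that confidences come out per subword while users care about words. A common baseline (an assumption here, not necessarily the paper's estimator) aggregates the subword confidences within each word, e.g. by minimum or product:

```python
def word_confidence(subword_probs, word_ends, agg="min"):
    """Aggregate subword-level confidences into word-level confidences.

    `subword_probs` gives one confidence per subword token; `word_ends`
    marks, for each subword, whether it is the last piece of a word.
    Two simple baselines: the minimum subword confidence per word, or
    the product of the word's subword confidences.
    """
    words, current = [], []
    for p, is_end in zip(subword_probs, word_ends):
        current.append(p)
        if is_end:
            if agg == "min":
                words.append(min(current))
            else:  # product of the word's subword confidences
                prod = 1.0
                for q in current:
                    prod *= q
                words.append(prod)
            current = []
    return words
```

For example, the word "hello" tokenized as "he" + "llo" with confidences 0.9 and 0.8 gets word confidence 0.8 under `min` aggregation. Learned estimators, as studied in the paper, replace these fixed rules with a model trained on word-level correctness labels.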
no code implementations • 27 Jan 2021 • Ke Hu, Ruoming Pang, Tara N. Sainath, Trevor Strohman
In this work, we explore using transformer layers instead of long-short term memory (LSTM) layers for deliberation rescoring.
no code implementations • 4 Nov 2020 • Xun Yuan, Ke Hu, Song Chen
We design Gaussian loss for the training process of SobelNet to detect corner points as keypoints.
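A common way to realize a "Gaussian loss" for keypoint detection (an assumed generic formulation, not necessarily SobelNet's exact one) is to regress a predicted heatmap against a target that places a Gaussian bump at each corner location:

```python
import numpy as np


def gaussian_target(shape, keypoints, sigma=2.0):
    """Build a target heatmap with a Gaussian bump at each keypoint.

    `shape` is (H, W); `keypoints` is a list of (row, col) corner
    locations. Overlapping bumps keep the per-pixel maximum.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    target = np.zeros(shape, dtype=np.float64)
    for r, c in keypoints:
        bump = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma**2))
        target = np.maximum(target, bump)
    return target


def gaussian_loss(pred_heatmap, keypoints, sigma=2.0):
    """Mean-squared error between a predicted heatmap and the Gaussian target."""
    target = gaussian_target(pred_heatmap.shape, keypoints, sigma)
    return float(np.mean((pred_heatmap - target) ** 2))
```

The soft Gaussian target, as opposed to a single hot pixel per corner, gives the network a smooth training signal around each keypoint.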
no code implementations • 13 Aug 2020 • Shaojin Ding, Ye Jia, Ke Hu, Quan Wang
In this paper, we propose Textual Echo Cancellation (TEC), a framework for cancelling the text-to-speech (TTS) playback echo from overlapping speech recordings.
no code implementations • 28 Mar 2020 • Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao
Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking.
no code implementations • 17 Mar 2020 • Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar
End-to-end (E2E) models have made rapid progress in automatic speech recognition (ASR) and perform competitively relative to conventional models.
Automatic Speech Recognition (ASR) +4
no code implementations • 21 Jun 2019 • Ke Hu, Antoine Bruguier, Tara N. Sainath, Rohit Prabhavalkar, Golan Pundak
Contextual automatic speech recognition, i.e., biasing recognition towards a given context (e.g., a user's playlists or contacts), is challenging in end-to-end (E2E) models.
Automatic Speech Recognition (ASR) +1
no code implementations • 17 Jun 2019 • Ke Hu, Hasim Sak, Hank Liao
In this work, we apply the domain adversarial network to encourage the shared layers of a multilingual model to learn language-invariant features.
Automatic Speech Recognition (ASR) +2
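Domain-adversarial training of the kind described here is typically built on a gradient reversal layer: the forward pass is the identity, while the backward pass negates (and scales) the gradient flowing from the domain classifier, pushing the shared layers toward features the classifier cannot use. This is a minimal manual-backprop sketch of that standard mechanism, not the paper's full multilingual model.

```python
import numpy as np


class GradientReversal:
    """Gradient reversal layer for domain-adversarial training.

    Placed between a shared encoder and a domain (here: language)
    classifier. Forward is the identity; backward multiplies the
    incoming gradient by -lambda, so gradient descent on the
    classifier's loss becomes gradient *ascent* for the encoder,
    encouraging language-invariant shared features.
    """

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        # Identity: the classifier sees the encoder features unchanged.
        return x

    def backward(self, grad_output):
        # Reverse and scale the gradient headed back into the encoder.
        return -self.lam * grad_output
```

In a full model, only the language-classifier branch passes through this layer; the ASR task branch backpropagates into the shared layers normally.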
1 code implementation • WS 2018 • Antonio Toral, Sheila Castilho, Ke Hu, Andy Way
We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context.