Search Results for author: Tian Tan

Found 11 papers, 4 papers with code

SALMONN: Towards Generic Hearing Abilities for Large Language Models

1 code implementation 20 Oct 2023 Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world, which refers to the perception and understanding of general auditory information consisting of at least three types of sounds: speech, audio events, and music.

Audio captioning Automatic Speech Recognition +10

Synthetic IMU Datasets and Protocols Can Simplify Fall Detection Experiments and Optimize Sensor Configuration

no code implementations 16 Oct 2023 Jie Tang, Bin He, Junkai Xu, Tian Tan, Zhipeng Wang, Yanmin Zhou, Shuo Jiang

The proposed method simplifies fall detection data acquisition experiments, provides a novel venue for generating low-cost synthetic data in scenarios where acquiring data for machine learning is challenging, and paves the way for customizing machine learning configurations.

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

2 code implementations 9 Oct 2023 Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Audio-visual large language models (LLMs) have drawn significant attention, yet the fine-grained combination of both input streams is rather under-explored, which is challenging but necessary for LLMs to understand general video inputs.

Question Answering Video Question Answering

Connecting Speech Encoder and Large Language Model for ASR

no code implementations 25 Sep 2023 Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Q-Former-based LLMs can generalise well to out-of-domain datasets, where 12% relative WER reductions over the Whisper baseline ASR model were achieved on the Eval2000 test set without using any in-domain training data from Switchboard.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

no code implementations 14 Sep 2023 Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen

In spite of the excellent strides made by end-to-end (E2E) models in speech recognition in recent years, named entity recognition is still challenging but critical for semantic understanding.

Language Modelling named-entity-recognition +3

Multi-Modality Deep Network for Extreme Learned Image Compression

no code implementations 26 Apr 2023 Xuhao Jiang, Weimin Tan, Tian Tan, Bo Yan, Liquan Shen

Image-based single-modality compression learning approaches have demonstrated exceptionally powerful encoding and decoding capabilities in the past few years, but suffer from blur and severe semantics loss at extremely low bitrates.

Image Compression

Success-Rate Targeted Reinforcement Learning by Disorientation Penalty

no code implementations 1 Jan 2021 Haichuan Gao, Zhile Yang, Tian Tan, Feng Chen

Unfortunately, applying traditional Bellman updates to value function learning can be problematic for learning undiscounted return, and is thus not suitable for optimizing success rate.

Decision Making Q-Learning +2

An Investigation on Deep Learning with Beta Stabilizer

no code implementations 31 Jul 2020 Qi Liu, Tian Tan, Kai Yu

It is concluded that beta stabilizer parameters can reduce sensitivity to the learning rate while achieving almost the same performance on DNNs with the ReLU activation function and on LSTMs.

Handwriting Recognition speech-recognition +1

Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning

1 code implementation NeurIPS 2020 Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen

In this paper, we show that this problem can be effectively alleviated by restricting the high-level action space from the whole goal space to a $k$-step adjacent region of the current state using an adjacency constraint.

Continuous Control Hierarchical Reinforcement Learning +2
