1 code implementation • 29 Mar 2024 • Yafeng Chen, Siqi Zheng, Hui Wang, Luyao Cheng, Tinglong Zhu, Changhe Song, Rongjie Huang, Ziyang Ma, Qian Chen, Shiliang Zhang, Xihao Li
This paper introduces 3D-Speaker-Toolkit, an open source toolkit for multi-modal speaker verification and diarization.
1 code implementation • 10 Mar 2024 • Shiyu Xuan, Shiliang Zhang
In long-tailed recognition, where the number of samples per class is imbalanced, treating the two types of positive samples equally leads to biased optimization of intra-category distance.
no code implementations • 13 Feb 2024 • Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, JiaMing Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen
We found that delicate designs are not necessary: an embarrassingly simple composition of an off-the-shelf speech encoder, an LLM, and a linear projector as the only trainable component is competent for the ASR task.
Automatic Speech Recognition (ASR) +1
no code implementations • 6 Feb 2024 • Songnan Yang, Xiaohui Zhang, Shiliang Zhang, Xuehui Ma, Wenqi Bai, Yushuai Li, TingWen Huang
We integrate the developed mechanism with the TA-LSTM, and calibrate the predicted heading angles to gain resistance against geomagnetic anomalies.
2 code implementations • 23 Dec 2023 • Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen
To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.
no code implementations • 17 Dec 2023 • Daniel Gerbi Duguma, Juliana Zhang, Meysam Aboutalebi, Shiliang Zhang, Catherine Banet, Cato Bjørkli, Chinmayi Baramashetru, Frank Eliassen, HUI ZHANG, Jonathan Muringani, Josef Noll, Knut Inge Fostervold, Lars Böcker, Lee Andrew Bygrave, Matin Bagherpour, Maunya Doroudi Moghadam, Olaf Owe, Poushali Sengupta, Roman Vitenberg, Sabita Maharjan, Thiago Garrett, Yushuai Li, Zhengyu Shan
This manuscript aims to formalize and conclude the discussions initiated during the PriTEM workshop 22-23 March 2023.
2 code implementations • 14 Nov 2023 • Yunfei Chu, Jin Xu, Xiaohuan Zhou, Qian Yang, Shiliang Zhang, Zhijie Yan, Chang Zhou, Jingren Zhou
Recently, instruction-following audio-language models have received broad attention for audio interaction with humans.
Ranked #1 on Acoustic Scene Classification on TUT Acoustic Scenes 2017 (using extra training data)
1 code implementation • 8 Nov 2023 • Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Yukun Ma, Hai Yu, Jiaqing Liu, Chong Zhang
We find that applying the conventional cross-entropy loss on input speech tokens does not consistently improve the ASR performance over the Loss Masking approach.
1 code implementation • 7 Oct 2023 • JiaMing Wang, Zhihao Du, Qian Chen, Yunfei Chu, Zhifu Gao, Zerui Li, Kai Hu, Xiaohuan Zhou, Jin Xu, Ziyang Ma, Wen Wang, Siqi Zheng, Chang Zhou, Zhijie Yan, Shiliang Zhang
In this paper, we propose LauraGPT, a unified GPT model for audio recognition, understanding, and generation.
2 code implementations • 1 Oct 2023 • Shiyu Xuan, Qingpei Guo, Ming Yang, Shiliang Zhang
Specifically, we present a new method for constructing the instruction tuning dataset at a low cost by leveraging annotations in existing datasets.
no code implementations • 26 Sep 2023 • Keyu An, Shiliang Zhang
Recently, self-attention-based transformers and conformers have been introduced as alternatives to RNNs for ASR acoustic modeling.
no code implementations • 19 Sep 2023 • Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen
In this paper, we explored how to boost speech emotion recognition (SER) with a state-of-the-art speech pre-trained model (PTM), data2vec; a text generation technique, GPT-4; and a speech synthesis technique, Azure TTS.
no code implementations • 19 Sep 2023 • Luyao Cheng, Siqi Zheng, Qinglin Zhang, Hui Wang, Yafeng Chen, Qian Chen, Shiliang Zhang
Speaker diarization has gained considerable attention within the speech processing research community.
no code implementations • 14 Sep 2023 • Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen
In spite of the excellent strides made by end-to-end (E2E) models in speech recognition in recent years, named entity recognition is still challenging but critical for semantic understanding.
1 code implementation • 14 Sep 2023 • Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng
We also demonstrate that the pre-trained models are suitable for downstream tasks, including automatic speech recognition and personalized text-to-speech synthesis.
1 code implementation • 14 Aug 2023 • Yu Liang, Shiliang Zhang, YaoWei Wang, Sheng Xiao, Kenli Li, Xiaoyu Wang
As a solution, backward-compatible training can be employed to avoid the necessity of updating old retrieval datasets.
no code implementations • 7 Aug 2023 • Xuehui Ma, Shiliang Zhang, Yushuai Li, Fucai Qian, TingWen Huang
This paper is concerned with the robust tracking control of linear uncertain systems, whose unknown system parameters and disturbances are bounded within ellipsoidal sets.
2 code implementations • 7 Aug 2023 • Xian Shi, Yexin Yang, Zerui Li, Yanni Chen, Zhifu Gao, Shiliang Zhang
It combines the accuracy of AED-based models, the efficiency of NAR models, and an explicit customization capacity with superior performance.
1 code implementation • 5 Aug 2023 • Yafeng Chen, Siqi Zheng, Hui Wang, Luyao Cheng, Qian Chen, Shiliang Zhang
It assigns representation of augmented views of utterances to the same prototypes as the representation of the original view, thereby enabling effective knowledge transfer between the views.
no code implementations • 23 May 2023 • Yuhao Liang, Fan Yu, Yangze Li, Pengcheng Guo, Shiliang Zhang, Qian Chen, Lei Xie
The recently proposed serialized output training (SOT) simplifies multi-talker automatic speech recognition (ASR) by generating speaker transcriptions separated by a special token.
Automatic Speech Recognition (ASR) +2
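The serialization step described in this entry can be illustrated with a minimal sketch. It assumes that utterances are ordered by start time and joined with a separator token; the token name `<sc>`, the function name, and the `(start_time, text)` input format are hypothetical choices for illustration, not details taken from the listing.

```python
def serialize_transcripts(utterances, sep="<sc>"):
    """Join per-speaker transcriptions into one target string.

    `utterances` is a list of (start_time, text) pairs (assumed format).
    Ordering by start time and the "<sc>" token are illustrative assumptions.
    """
    ordered = sorted(utterances, key=lambda u: u[0])
    return f" {sep} ".join(text for _, text in ordered)


if __name__ == "__main__":
    print(serialize_transcripts([(1.2, "fine thanks"), (0.0, "how are you")]))
```

A single-output ASR model can then be trained on such strings, with the separator marking each change of speaker.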
no code implementations • 21 May 2023 • Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai
In addition, a two-pass decoding strategy is proposed to fully leverage the contextual modeling ability, resulting in better recognition performance.
no code implementations • 21 May 2023 • Mohan Shi, Yuchun Shu, Lingyun Zuo, Qian Chen, Shiliang Zhang, Jie Zhang, Li-Rong Dai
For speech interaction, voice activity detection (VAD) is often used as a front-end.
1 code implementation • 19 May 2023 • Keyu An, Xian Shi, Shiliang Zhang
Recently, the recurrent neural network transducer (RNN-T) has gained increasing popularity due to its natural streaming capability as well as its superior performance.
Ranked #9 on Speech Recognition on AISHELL-1
Automatic Speech Recognition (ASR)
1 code implementation • 18 May 2023 • Zhifu Gao, Zerui Li, JiaMing Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang
FunASR offers models trained on large-scale industrial corpora and the ability to deploy them in applications.
Ranked #1 on Speech Recognition on WenetSpeech (using extra training data)
no code implementations • 18 May 2023 • Xian Shi, Haoneng Luo, Zhifu Gao, Shiliang Zhang, Zhijie Yan
Estimating confidence scores for recognition results is a classic task in the ASR field and is of vital importance for many kinds of downstream tasks and training strategies.
1 code implementation • 8 Mar 2023 • JiaMing Wang, Zhihao Du, Shiliang Zhang
Recently, end-to-end neural diarization (EEND) has been introduced and has achieved promising results in speaker-overlapped scenarios.
Ranked #1 on Speaker Diarization on CALLHOME
1 code implementation • 29 Jan 2023 • Xian Shi, Yanni Chen, Shiliang Zhang, Zhijie Yan
Conventional ASR systems use frame-level phoneme posteriors to conduct forced alignment (FA) and provide timestamps, while end-to-end ASR systems, especially AED-based ones, lack this ability.
no code implementations • CVPR 2023 • Zhanzhou Feng, Shiliang Zhang
The accuracy of partitioned parts is on par with the capability of the pre-trained model, leading to evolved mask patterns at different training stages.
1 code implementation • ICCV 2023 • Dongkai Wang, Shiliang Zhang
This pipeline needs to transform each relative rotation matrix into a global rotation matrix to articulate the canonical mesh, and suffers from accumulated errors along the kinematics chain.
1 code implementation • 29 Nov 2022 • Xiaohuan Zhou, JiaMing Wang, Zeyu Cui, Shiliang Zhang, Zhijie Yan, Jingren Zhou, Chang Zhou
Therefore, we propose to introduce the phoneme modality into pre-training, which can help capture modality-invariant information between Mandarin speech and text.
Ranked #2 on Speech Recognition on AISHELL-1
Automatic Speech Recognition (ASR) +2
no code implementations • 27 Nov 2022 • Rinyoichi Takezoe, Xu Liu, Shunan Mao, Marco Tianyu Chen, Zhanpeng Feng, Shiliang Zhang, Xiaoyu Wang
As an important data selection schema, active learning emerges as the essential component when iterating an Artificial Intelligence (AI) model.
1 code implementation • ICCV 2023 • Ruihan Xu, Haokui Zhang, Wenze Hu, Shiliang Zhang, Xiaoyu Wang
Specifically, we propose a new convolutional neural network, ParCNetV2, that extends position-aware circular convolution (ParCNet) with oversized convolutions and bifurcate gate units to enhance attention.
no code implementations • 1 Nov 2022 • Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Qian Chen, Shiliang Zhang, Li-Rong Dai
Speaker-attributed automatic speech recognition (SA-ASR) in multi-party meeting scenarios is one of the most valuable and challenging ASR tasks.
Automatic Speech Recognition (ASR) +4
1 code implementation • 27 Jul 2022 • Zhanpeng Feng, Shiliang Zhang, Rinyoichi Takezoe, Wenze Hu, Manmohan Chandraker, Li-Jia Li, Vijay K. Narayanan, Xiaoyu Wang
To facilitate research in this field, this paper contributes an active learning benchmark framework named ALBench for evaluating active learning in object detection.
2 code implementations • 16 Jun 2022 • Zhifu Gao, Shiliang Zhang, Ian McLoughlin, Zhijie Yan
However, due to an independence assumption within the output tokens, performance of single-step NAR is inferior to that of AR models, especially with a large-scale corpus.
no code implementations • 31 Mar 2022 • Fan Yu, Zhihao Du, Shiliang Zhang, Yuxiao Lin, Lei Xie
Therefore, we propose the second approach, WD-SOT, to address alignment errors by introducing a word-level diarization model, which can get rid of such timestamp alignment dependency.
Automatic Speech Recognition (ASR) +2
1 code implementation • 18 Mar 2022 • Zhihao Du, Shiliang Zhang, Siqi Zheng, Zhijie Yan
Through this formulation, we propose the speaker embedding-aware neural diarization (SEND) framework, where a speech encoder, a speaker encoder, two similarity scorers, and a post-processing network are jointly optimized to predict the encoded labels according to the similarities between speech features and speaker embeddings.
Ranked #1 on Speaker Diarization on AliMeeting
2 code implementations • 16 Mar 2022 • Shiliang Zhang, Dyako Fatih, Fahmi Abdulqadir, Tobias Schwarz, Xuehui Ma
Compared with its original version, the extended VED (eVED) dataset is enhanced with accurate vehicle trip GPS coordinates, serving as a basis to associate the VED trip records with external information, e.g., road speed limits and intersections, from accessible map services to accumulate attributes that are essential in analyzing vehicle energy consumption.
no code implementations • 16 Feb 2022 • Yi Ren, Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Zhou Zhao
Specifically, we first introduce a word-level prosody encoder, which quantizes the low-frequency band of the speech and compresses prosody attributes in the latent prosody vector (LPV).
no code implementations • 16 Feb 2022 • Shiliang Zhang, Xuehui Ma, Hui Cao, Tengyuan Zhao, Yajie Yu, Zhuzhu Wang
To this end, we design a lightweight approach dedicated to privatizing an image database as a whole and preserving the statistical semantics of the image database to an adjustable level, while making individual images' contributions to such statistics indistinguishable.
1 code implementation • CVPR 2022 • Dongkai Wang, Shiliang Zhang
Instead of relying on person bounding boxes to spatially differentiate persons, CID decouples persons in an image into multiple instance-aware feature maps.
1 code implementation • NeurIPS 2021 • Dongkai Wang, Shiliang Zhang, Gang Hua
Instead of inferring individual keypoints, the Pose-level Inference Network (PINet) directly infers the complete pose cues for a person from his/her visible body parts.
2 code implementations • 28 Nov 2021 • Zhihao Du, Shiliang Zhang, Siqi Zheng, Weilong Huang, Ming Lei
In this paper, we reformulate this task as a single-label prediction problem by encoding the multi-speaker labels with power set.
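A minimal sketch of the power-set idea in this entry: each combination of active speakers in a frame is mapped to one integer class, so a multi-label target becomes a single label. The bit-weight mapping and function names below are assumptions for illustration; the exact encoding used in the paper may differ (for example, by limiting the number of overlapping speakers).

```python
def powerset_encode(active):
    """Map a binary speaker-activity vector to one class index.

    Speaker k contributes 2**k when active (an assumed bit-weight mapping).
    """
    return sum(int(a) << k for k, a in enumerate(active))


def powerset_decode(label, num_speakers):
    """Recover the binary speaker-activity vector from a class index."""
    return [(label >> k) & 1 for k in range(num_speakers)]


if __name__ == "__main__":
    label = powerset_encode([1, 0, 1])   # speakers 0 and 2 overlap
    print(label, powerset_decode(label, 3))
```

With N speakers this yields 2^N possible classes, which is why a cap on simultaneous speakers is a common practical restriction.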
no code implementations • 13 Nov 2021 • Shiliang Zhang
The two components collaborate to enhance learning robustness against data heterogeneities in networks.
1 code implementation • Findings (ACL) 2022 • Linhan Zhang, Qian Chen, Wen Wang, Chong Deng, Shiliang Zhang, Bing Li, Wei Wang, Xin Cao
In this work, we propose a novel unsupervised embedding-based KPE approach, Masked Document Embedding Rank (MDERank), to address this problem by leveraging a mask strategy and ranking candidates by the similarity between embeddings of the source document and the masked document.
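The masking-and-ranking idea in this entry can be sketched as follows. This is a toy illustration, not the authors' implementation: a bag-of-words count vector stands in for a neural document encoder, candidates are single words, and the ranking direction (lower similarity after masking means a more important phrase) is stated here as an assumption. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a neural document encoder: bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Counter returns 0 for missing keys, so absent words contribute nothing.
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(document, candidates, mask_token="[MASK]"):
    """Rank single-word candidates by similarity of the masked document
    to the original; the least similar (most disruptive) comes first."""
    tokens = document.lower().split()
    doc_vec = embed(document)
    scored = []
    for cand in candidates:
        masked = " ".join(mask_token if t == cand.lower() else t for t in tokens)
        scored.append((cand, cosine(doc_vec, embed(masked))))
    return sorted(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    doc = "speech recognition models improve speech quality"
    for cand, sim in rank_candidates(doc, ["quality", "speech"]):
        print(cand, round(sim, 3))
```

In this toy setup, masking a word that occurs more often changes the document vector more, so it ranks higher.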
no code implementations • 9 Sep 2021 • Siqi Zheng, Shiliang Zhang, Weilong Huang, Qian Chen, Hongbin Suo, Ming Lei, Jinwei Feng, Zhijie Yan
We propose BeamTransformer, an efficient architecture to leverage beamformer's edge in spatial filtering and transformer's capability in context sequence modeling.
no code implementations • 30 Jul 2021 • Xiaotian Yu, Hanling Yi, Yi Yu, Ling Xing, Shiliang Zhang, Xiaoyu Wang
There has been a recent surge of research interest in attacking the problem of social relation inference based on images.
2 code implementations • 22 Jul 2021 • Xiao Wang, Xiujun Shu, Shiliang Zhang, Bo Jiang, YaoWei Wang, Yonghong Tian, Feng Wu
The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps respectively.
Ranked #21 on Rgb-T Tracking on RGBT234
2 code implementations • 31 May 2021 • Xiujun Shu, Xiao Wang, Xianghao Zang, Shiliang Zhang, Yuanqi Chen, Ge Li, Qi Tian
We also verified that models pre-trained on LaST can generalize well on existing datasets with short-term and cloth-changing scenarios.
1 code implementation • 11 May 2021 • Xiaobin Liu, Shiliang Zhang
Specifically, given unlabeled training images, we apply teacher networks to extract corresponding features and further construct a teacher graph for each teacher network to describe the similarity relationships among training images.
Contrastive Learning Domain Adaptive Person Re-Identification +2
no code implementations • 2 Apr 2021 • Kuan Zhu, Haiyun Guo, Shiliang Zhang, YaoWei Wang, Gaopan Huang, Honglin Qiao, Jing Liu, Jinqiao Wang, Ming Tang
In this paper, we introduce an alignment scheme in Transformer architecture for the first time and propose the Auto-Aligned Transformer (AAformer) to automatically locate both the human parts and non-human ones at patch-level.
1 code implementation • CVPR 2021 • Shiyu Xuan, Shiliang Zhang
The second stage considers the classification scores of each sample on different cameras as a new feature vector.
Ranked #1 on Person Re-Identification on SYSU-30k (using extra training data)
1 code implementation • IJCV 2021 • Shangzhi Teng, Shiliang Zhang, Qingming Huang, Nicu Sebe
Moreover, our method also achieves competitive performance compared with recent works on existing vehicle ReID datasets including VehicleID, VeRi-776 and VERI-Wild.
1 code implementation • 6 Nov 2020 • Xiaobin Liu, Shiliang Zhang
Extensive experiments on three large-scale datasets, i.e., Market-1501, DukeMTMC-reID, and MSMT17, show that our coupling optimization outperforms state-of-the-art methods by a large margin.
Domain Adaptive Person Re-Identification Transfer Learning +1
no code implementations • ECCV 2020 • Jianing Li, Shiliang Zhang
This paper tackles this challenge through jointly enforcing visual and temporal consistency in the combination of a local one-hot classification and a global multi-class classification.
1 code implementation • 10 Jul 2020 • Jianming Ye, Shiliang Zhang, Jingdong Wang
We observe that this performance gap leads to substantial residuals between intermediate feature maps of BCNN and FCNN.
1 code implementation • 21 May 2020 • Shiliang Zhang, Zhifu Gao, Haoneng Luo, Ming Lei, Jie Gao, Zhijie Yan, Lei Xie
Recently, streaming end-to-end automatic speech recognition (E2E-ASR) has gained more and more attention.
Sound Audio and Speech Processing
no code implementations • 21 May 2020 • Haoneng Luo, Shiliang Zhang, Ming Lei, Lei Xie
Transformer models have been introduced into end-to-end speech recognition with state-of-the-art performance on various tasks owing to their superiority in modeling long-term dependencies.
no code implementations • CVPR 2020 • Dongkai Wang, Shiliang Zhang
Our label prediction and MMCL work iteratively and substantially boost the ReID performance.
Ranked #6 on Unsupervised Domain Adaptation on Duke to MSMT
no code implementations • CVPR 2020 • Yingji Zhong, Xiaoyu Wang, Shiliang Zhang
This paper also contributes a Large-Scale dataset for Person Search in the wild (LSPS), which is by far the largest and the most challenging dataset for person search.
no code implementations • 3 Oct 2019 • Kai Fan, Jiayi Wang, Bo Li, Shiliang Zhang, Boxing Chen, Niyu Ge, Zhijie Yan
The performance of automatic speech recognition (ASR) systems is usually evaluated by word error rate (WER) when manually transcribed data are provided; such data, however, are expensive to obtain in real scenarios.
Automatic Speech Recognition (ASR) +4
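For reference, the word error rate mentioned in this entry is conventionally the word-level edit distance between a hypothesis and a manual transcript, divided by the number of reference words. The sketch below assumes whitespace tokenization and equal costs for substitutions, insertions, and deletions; the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the computation needs a reference transcript, it cannot be applied directly to untranscribed speech, which is the cost this entry refers to.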
no code implementations • ICCV 2019 • Jianing Li, Jingdong Wang, Qi Tian, Wen Gao, Shiliang Zhang
The long-term relations are captured by a temporal self-attention model to alleviate the occlusions and noises in video sequences.
1 code implementation • 24 Jun 2019 • Shunan Mao, Shiliang Zhang, Ming Yang
RIFE adopts two feature extraction streams weighted by a dual-attention block to learn features for low and high resolution images, respectively.
no code implementations • 27 Mar 2019 • Shiliang Zhang, Ming Lei, Zhijie Yan
Results on a 20,000-hour Mandarin speech recognition task show that the proposed spelling correction model can achieve a CER of 3.41%, which is a 22.9% and 53.2% relative improvement over the baseline CTC-based systems decoded with and without a language model, respectively.
2 code implementations • CVPR 2019 • Jianzhong He, Shiliang Zhang, Ming Yang, Yanhu Shan, Tiejun Huang
Exploiting multi-scale representations is critical to improve edge detection for objects at different scales.
Ranked #2 on Edge Detection on BRIND
no code implementations • 19 Nov 2018 • Jianing Li, Shiliang Zhang, Tiejun Huang
A temporal stream in this network is constructed by inserting several Multi-scale 3D (M3D) convolution layers into a 2D CNN network.
no code implementations • 25 Jun 2018 • Xiaobin Liu, Shiliang Zhang, Qingming Huang, Wen Gao
Specifically, in addition to extracting global features, RAM also extracts features from a series of local regions.
1 code implementation • 4 Mar 2018 • Shiliang Zhang, Ming Lei, Zhijie Yan, Li-Rong Dai
In a 20,000-hour Mandarin recognition task, the LFR-trained DFSMN can achieve more than 20% relative improvement compared to the LFR-trained BLSTM.
no code implementations • 26 Feb 2018 • Mengxiao Bi, Heng Lu, Shiliang Zhang, Ming Lei, Zhijie Yan
The Bidirectional LSTM (BLSTM) RNN based speech synthesis system is among the best parametric Text-to-Speech (TTS) systems in terms of the naturalness of generated speech, especially the naturalness in prosody.
no code implementations • 20 Dec 2017 • Jianing Li, Shiliang Zhang, Jingdong Wang, Wen Gao, Qi Tian
This paper mainly establishes a large-scale Long sequence Video database for person re-IDentification (LVreID).
25 code implementations • CVPR 2018 • Longhui Wei, Shiliang Zhang, Wen Gao, Qi Tian
Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network.
Ranked #11 on Unsupervised Person Re-Identification on DukeMTMC-reID (Rank-10 metric)
no code implementations • ICCV 2017 • Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian
Our deep architecture explicitly leverages the human part cues to alleviate the pose variations and learn robust feature representations from both the global image and different local parts.
Ranked #105 on Person Re-Identification on Market-1501
no code implementations • 18 Sep 2017 • Xiaobin Liu, Shiliang Zhang, Tiejun Huang, Qi Tian
To conquer these issues, we propose an End-to-End BoWs (E$^2$BoWs) model based on Deep Convolutional Neural Network (DCNN).
no code implementations • 13 Sep 2017 • Longhui Wei, Shiliang Zhang, Hantao Yao, Wen Gao, Qi Tian
To solve these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively.
Ranked #93 on Person Re-Identification on Market-1501
no code implementations • 4 Jul 2017 • Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR).
no code implementations • 4 Jul 2017 • Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately.
Ranked #97 on Person Re-Identification on Market-1501
1 code implementation • 19 Feb 2017 • Hantao Yao, Feng Dai, Dongming Zhang, Yike Ma, Shiliang Zhang, Yongdong Zhang, Qi Tian
Accordingly, DR$^{2}$-Net consists of two components, i.e., a linear mapping network and a residual network.
no code implementations • 11 Nov 2016 • Dan Liu, Wei. Lin, Shiliang Zhang, Si Wei, Hui Jiang
This paper describes the USTC_NELSLIP systems submitted to the Trilingual Entity Detection and Linking (EDL) track in 2016 TAC Knowledge Base Population (KBP) contests.
no code implementations • 11 May 2016 • Chi Su, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian
And we propose a semi-supervised attribute learning framework which progressively boosts the accuracy of attributes only using a limited number of labeled data.
no code implementations • 28 Dec 2015 • Shiliang Zhang, Cong Liu, Hui Jiang, Si Wei, Li-Rong Dai, Yu Hu
In this paper, we propose a novel neural network structure, namely feedforward sequential memory networks (FSMN), to model long-term dependency in time series without using recurrent feedback.
no code implementations • ICCV 2015 • Chi Su, Fan Yang, Shiliang Zhang, Qi Tian, Larry S. Davis, Wen Gao
Since attributes are generally correlated, we introduce a low rank attribute embedding into the MTL formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered to better describe people.
no code implementations • 9 Oct 2015 • ShiLiang Zhang, Hui Jiang, Si Wei, Li-Rong Dai
We introduce a new structure for memory neural networks, called feedforward sequential memory networks (FSMN), which can learn long-term dependency without using recurrent feedback.
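To make "long-term dependency without recurrent feedback" concrete, here is a minimal sketch of a memory block as a tapped delay line: each output is a weighted sum of the current and previous hidden vectors, so no state is fed back. It assumes the scalar, unidirectional form; the function name and the tap values in the usage example are hypothetical (in a trained model the taps would be learned).

```python
def fsmn_memory(hidden, taps):
    """Compute memory outputs m_t = sum_i taps[i] * hidden[t - i].

    `hidden` is a list of equal-length vectors (one per time step);
    `taps` are scalar coefficients, taps[0] weighting the current step.
    """
    outputs = []
    for t in range(len(hidden)):
        acc = [0.0] * len(hidden[t])
        for i, a in enumerate(taps):
            if t - i < 0:
                break  # no history before the start of the sequence
            for d, value in enumerate(hidden[t - i]):
                acc[d] += a * value
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    print(fsmn_memory([[1.0], [2.0], [3.0]], taps=[1.0, 0.5]))
```

Since each output depends only on a fixed window of inputs, all time steps can be computed in parallel, unlike a recurrent layer.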
1 code implementation • 6 May 2015 • Shiliang Zhang, Hui Jiang, MingBin Xu, JunFeng Hou, Li-Rong Dai
In this paper, we propose the new fixed-size ordinally-forgetting encoding (FOFE) method, which can almost uniquely encode any variable-length sequence of words into a fixed-size representation.
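A minimal sketch of how a variable-length word sequence can be folded into a fixed-size vector, assuming the recursive form z_t = alpha * z_{t-1} + e_t, where e_t is the one-hot vector of the t-th word and alpha is a forgetting factor between 0 and 1. The function name and the default alpha are illustrative assumptions.

```python
def fofe_encode(word_ids, vocab_size, alpha=0.5):
    """Encode a sequence of word indices into one vector of size vocab_size.

    Each step decays the running code by `alpha`, then adds the one-hot
    vector of the current word, so later words carry larger weights.
    """
    z = [0.0] * vocab_size
    for w in word_ids:
        z = [alpha * v for v in z]
        z[w] += 1.0
    return z


if __name__ == "__main__":
    # Vocabulary of three words; the sequence is word 0, word 1, word 0.
    print(fofe_encode([0, 1, 0], vocab_size=3, alpha=0.5))
```

The output length depends only on the vocabulary size, while the position of each word is reflected in the power of alpha attached to it.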
no code implementations • 3 Feb 2015 • Shiliang Zhang, Hui Jiang
As a result, the HOPE framework can be used as a novel tool to probe why and how NNs work, more importantly, to learn NNs in either supervised or unsupervised ways.
Ranked #23 on Image Classification on MNIST