no code implementations • 9 Nov 2023 • Jangwhan Lee, Minsoo Kim, SeungCheol Baek, Seok Joong Hwang, Wonyong Sung, Jungwook Choi
Large Language Models (LLMs) are proficient in natural language processing tasks, but their deployment is often restricted by extensive parameter sizes and computational demands.
1 code implementation • NeurIPS 2023 • Minsoo Kim, Sihwa Lee, Janghwan Lee, Sukjin Hong, Du-Seong Chang, Wonyong Sung, Jungwook Choi
Generative Language Models (GLMs) have shown impressive performance in tasks such as text generation, understanding, and reasoning.
1 code implementation • 23 Feb 2023 • Minsoo Kim, Kyuhong Shim, Seongmin Park, Wonyong Sung, Jungwook Choi
Pre-trained Transformer models such as BERT have shown great success in a wide range of applications, but at the cost of substantial increases in model complexity.
no code implementations • 17 Feb 2023 • Iksoo Choi, Wonyong Sung
As sleep disorders are becoming more prevalent, there is an urgent need to classify sleep stages in a less disturbing way. In particular, sleep-stage classification using simple sensors, such as single-channel electroencephalography (EEG), electrooculography (EOG), electromyography (EMG), or electrocardiography (ECG), has gained substantial interest.
no code implementations • 29 Jan 2023 • Kyuhong Shim, Jungwook Choi, Wonyong Sung
In this paper, we provide a comprehensive study on attention map reuse focusing on its ability to accelerate inference.
no code implementations • 29 Dec 2022 • Chanwoo Kim, Sathish Indurti, Jinhwan Park, Wonyong Sung
In our work, we define a macro-block that contains a large number of units from the input to a Recurrent Neural Network (RNN).
no code implementations • 1 Oct 2022 • Kyuhong Shim, Wonyong Sung
Our analyses show that Transformer and Conformer models benefit from the long-range accessibility of self-attention through input frames.
no code implementations • 19 Mar 2022 • Kyuhong Shim, Wonyong Sung
In particular, SA heads in lower layers capture various phonetic characteristics through the query-key dot product, which is designed to compute the pairwise relationship between frames.
no code implementations • 22 Feb 2022 • Kyuhong Shim, Hyewon Bae, Wonyong Sung
Although the common approach is to use the same tokenization method for external LM as the ASR model, we show that it may not be the best choice for Korean.
Automatic Speech Recognition (ASR) +2
1 code implementation • 2021 IEEE Workshop on Signal Processing Systems (SiPS) 2021 • Seokhyeon Choi, Kyuhong Shim, Jungwook Choi, Wonyong Sung, Byonghyo Shim
We propose TernGEMM, a specialized GEMM library that uses SIMD instructions for Deep Neural Network (DNN) inference with ternary weights and sub-8-bit activations.
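A minimal NumPy sketch of the arithmetic TernGEMM exploits (illustrative only: the function name and data layout are assumptions, and the actual library uses SIMD bitwise kernels rather than NumPy): with ternary weights, every multiply collapses into an add, a subtract, or a skip.

```python
import numpy as np

def ternary_gemm(activations, weights):
    """Multiply int8 activations by ternary weights in {-1, 0, +1}.

    With ternary weights, each multiply reduces to an add, a subtract, or a
    skip, which is the structure a SIMD ternary GEMM kernel exploits. This
    reference version just makes that structure explicit.
    """
    assert set(np.unique(weights)).issubset({-1, 0, 1})
    plus_mask = (weights == 1)    # positions contributing +activation
    minus_mask = (weights == -1)  # positions contributing -activation
    # Accumulate in int32 to avoid overflowing the 8-bit activation range.
    acc_plus = activations.astype(np.int32) @ plus_mask.astype(np.int32)
    acc_minus = activations.astype(np.int32) @ minus_mask.astype(np.int32)
    return acc_plus - acc_minus

# Quick check against an ordinary matrix multiply.
rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(4, 64), dtype=np.int8)
w = rng.integers(-1, 2, size=(64, 16), dtype=np.int8)
assert np.array_equal(ternary_gemm(a, w), a.astype(np.int32) @ w.astype(np.int32))
```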
1 code implementation • 2021 18th International SoC Design Conference (ISOCC) 2021 • Kyuhong Shim, Iksoo Choi, Wonyong Sung, Jungwook Choi
While Transformer-based models have shown impressive language modeling performance, the large computation cost is often prohibitive for practical use.
no code implementations • ICLR 2022 • Kyuhong Shim, Jungwook Choi, Wonyong Sung
Self-attention (SA) is a critical component of Transformer neural networks that have succeeded in automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +1
no code implementations • 30 Sep 2020 • Yoonho Boo, Sungho Shin, Jungwook Choi, Wonyong Sung
In this study, we propose stochastic precision ensemble training for QDNNs (SPEQ).
no code implementations • 5 Sep 2020 • Wonyong Sung, Iksoo Choi, Jinhwan Park, Seokhyun Choi, Sungho Shin
The proposed method is compared with the conventional SGD method and previous weight-noise injection algorithms using convolutional neural networks for image classification.
no code implementations • 31 May 2020 • Yoonho Boo, Sungho Shin, Wonyong Sung
This study proposes a holistic approach for the optimization of QDNNs, which contains QDNN training methods as well as quantization-friendly architecture design.
no code implementations • 2 Feb 2020 • Sungho Shin, Yoonho Boo, Wonyong Sung
Model averaging is a promising approach for achieving the good generalization capability of DNNs, especially when the loss surface for training contains many sharp minima.
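As a rough illustration of that weight-averaging idea (a minimal sketch only; the checkpoint-selection schedule and averaging rule studied in the paper are not reproduced, and the function and variable names are assumptions):

```python
import numpy as np

def average_checkpoints(checkpoints):
    """Element-wise average of model parameters from several checkpoints.

    Each checkpoint is a dict mapping parameter names to NumPy arrays,
    e.g. snapshots saved at different training steps. Averaging them tends
    to land in a flatter region of the loss surface than any single snapshot.
    """
    names = checkpoints[0].keys()
    return {
        name: np.mean([ckpt[name] for ckpt in checkpoints], axis=0)
        for name in names
    }

# Toy usage: three snapshots of a two-parameter "model".
snapshots = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.1])},
    {"w": np.array([1.2, 1.8]), "b": np.array([0.3])},
    {"w": np.array([0.8, 2.2]), "b": np.array([0.2])},
]
print(average_checkpoints(snapshots))  # {'w': array([1., 2.]), 'b': array([0.2])}
```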
no code implementations • 4 Sep 2019 • Sungho Shin, Yoonho Boo, Wonyong Sung
Knowledge distillation (KD) is a very popular method for model size reduction.
no code implementations • NeurIPS 2018 • Jinhwan Park, Yoonho Boo, Iksoo Choi, Sungho Shin, Wonyong Sung
The RNN implementation on embedded devices can suffer from excessive DRAM accesses because the parameter size of a neural network usually exceeds that of the cache memory and the parameters are used only once for each time step.
Automatic Speech Recognition (ASR) +1
no code implementations • 27 Sep 2018 • Wonyong Sung, Lukas Lee, Jinwhan Park
In addition, we explore neural networks that add one-dimensional (1-D) convolution at each layer of these algorithms, which yields a very large performance increase for the QRNNs and Gated ConvNets.
no code implementations • 30 Mar 2018 • Wonyong Sung, Jinhwan Park
As neural network algorithms show high performance in many applications, their efficient inference on mobile and embedded systems is of great interest.
no code implementations • NeurIPS 2017 • Kyuhong Shim, Minjae Lee, Iksoo Choi, Yoonho Boo, Wonyong Sung
The approximate probability of each word can be estimated with only a small part of the weight matrix by using a few large singular values and the corresponding elements for most of the words.
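A hedged NumPy sketch of that low-rank preview idea (the function name, rank, and candidate count are illustrative assumptions, not the paper's exact algorithm): a truncated SVD of the output matrix scores every word cheaply, and exact logits are recomputed only for the most promising candidates.

```python
import numpy as np

def svd_softmax_topk(W, h, r=64, candidates=100, k=10):
    """Approximate the top-k output words without a full |V| x d product.

    W: output projection matrix of shape (vocab, d); h: hidden state of shape (d,).
    A rank-r preview built from the SVD of W scores every word cheaply; the
    exact logit is then recomputed only for the best `candidates` words.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    h_rot = Vt @ h                            # rotate h once into the SVD basis
    preview = (U[:, :r] * S[:r]) @ h_rot[:r]  # cheap rank-r preview logits
    cand = np.argsort(preview)[-candidates:]  # most promising words
    exact = W[cand] @ h                       # full logits for candidates only
    return cand[np.argsort(exact)[-k:]][::-1]

rng = np.random.default_rng(0)
W = rng.standard_normal((5000, 128))
h = rng.standard_normal(128)
print(svd_softmax_topk(W, h))  # indices of the k most probable words (approximate)
```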
no code implementations • 1 Jul 2017 • Yoonho Boo, Wonyong Sung
Deep neural networks (DNNs) usually demand a large amount of operations for real-time inference.
no code implementations • 27 Feb 2017 • Sungho Shin, Yoonho Boo, Wonyong Sung
Fixed-point optimization of deep neural networks plays an important role in hardware-based design and low-power implementations.
no code implementations • 19 Nov 2016 • Sungho Shin, Kyuyeon Hwang, Wonyong Sung
The complexity of deep neural network algorithms for hardware implementation can be lowered either by scaling the number of units or reducing the word-length of weights.
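As a concrete example of word-length reduction (a minimal sketch assuming simple per-tensor symmetric uniform quantization; the fixed-point format and retraining procedure examined in the paper are not reproduced):

```python
import numpy as np

def quantize_weights(w, bits):
    """Symmetric uniform quantization of weights to a given word-length.

    Weights are snapped to integer levels in [-(2**(bits-1) - 1), 2**(bits-1) - 1]
    and rescaled to floating point so the quantization error can be measured,
    which is the basic operation behind fixed-point DNN implementations.
    """
    q_max = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / q_max             # one scale factor per tensor
    q = np.clip(np.round(w / scale), -q_max, q_max)
    return q * scale                              # dequantized weights

rng = np.random.default_rng(0)
w = rng.standard_normal(1000) * 0.1
for bits in (8, 6, 4, 2):
    err = np.mean((w - quantize_weights(w, bits)) ** 2)
    print(f"{bits}-bit weights, mean squared error {err:.2e}")
```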
no code implementations • 30 Oct 2016 • Sajid Anwar, Wonyong Sung
We propose feature map and kernel level pruning for reducing the computational complexity of a deep convolutional neural network.
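A minimal sketch of feature-map-level pruning of a convolutional layer (the L1-norm saliency and function names here are illustrative assumptions; the paper evaluates its own importance criteria):

```python
import numpy as np

def prune_feature_maps(conv_weights, keep_ratio=0.5):
    """Feature-map (output-channel) level pruning of a conv layer.

    conv_weights: array of shape (out_channels, in_channels, kh, kw).
    Filters are ranked by a simple L1-norm saliency and the weakest ones are
    removed entirely, shrinking the layer without irregular sparsity.
    """
    saliency = np.abs(conv_weights).sum(axis=(1, 2, 3))  # one score per feature map
    n_keep = max(1, int(len(saliency) * keep_ratio))
    keep = np.sort(np.argsort(saliency)[-n_keep:])       # surviving feature maps
    return conv_weights[keep], keep

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32, 3, 3))
pruned, kept = prune_feature_maps(w, keep_ratio=0.25)
print(pruned.shape)  # (16, 32, 3, 3) -> 75% of feature maps removed
```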
no code implementations • 30 Sep 2016 • Minjae Lee, Kyuyeon Hwang, Jinhwan Park, Sungwook Choi, Sungho Shin, Wonyong Sung
The weights are quantized to 6 bits to store all of them in the on-chip memory of an FPGA.
no code implementations • 13 Sep 2016 • Kyuyeon Hwang, Wonyong Sung
Recurrent neural network (RNN) based character-level language models (CLMs) are extremely useful for modeling out-of-vocabulary words by nature.
no code implementations • 14 Aug 2016 • Sungho Shin, Wonyong Sung
Gesture recognition is an essential technology for many wearable devices.
no code implementations • 14 Aug 2016 • Sungho Shin, Kyuyeon Hwang, Wonyong Sung
In this paper, we propose a generative knowledge transfer technique that trains an RNN based language model (student network) using text and output probabilities generated from a previously trained RNN (teacher network).
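A minimal sketch of the soft-target loss that such teacher-to-student transfer typically minimizes (illustrative only; the exact objective, sampling procedure, and names below are assumptions rather than the paper's implementation):

```python
import numpy as np

def soft_target_loss(student_logits, teacher_probs):
    """Cross-entropy of the student against the teacher's output distribution.

    student_logits: (seq_len, vocab) unnormalized scores from the student model.
    teacher_probs:  (seq_len, vocab) per-step probabilities saved from the teacher.
    Training on teacher-generated text with this loss transfers knowledge
    without exposing the original training corpus to the student.
    """
    # Numerically stable log-softmax of the student logits.
    shifted = student_logits - student_logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -(teacher_probs * log_probs).sum(axis=-1).mean()

# Toy example: 3 time steps over a 5-character vocabulary.
rng = np.random.default_rng(0)
teacher = rng.dirichlet(np.ones(5), size=3)   # valid probability rows
student = rng.standard_normal((3, 5))
print(soft_target_loss(student, teacher))
```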
no code implementations • 4 Feb 2016 • Jinhwan Park, Wonyong Sung
In this work, we have developed an FPGA-based fixed-point DNN system that uses only on-chip memory and does not access external DRAM.
1 code implementation • 25 Jan 2016 • Kyuyeon Hwang, Wonyong Sung
The output values of the CTC-trained RNN are character-level probabilities, which are processed by beam search decoding.
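To illustrate that pipeline, here is a minimal best-path decoder for CTC character probabilities (a simpler stand-in for the beam search decoding used in the paper; the names and toy alphabet are assumptions):

```python
import numpy as np

def ctc_best_path_decode(char_probs, alphabet, blank=0):
    """Turn per-frame character probabilities into text.

    char_probs: (frames, num_labels) output of a CTC-trained RNN, where
    column `blank` is the CTC blank symbol. Best-path decoding picks the
    most likely label per frame, collapses repeats, and drops blanks.
    Beam search instead keeps several candidate prefixes per frame.
    """
    best = np.argmax(char_probs, axis=1)
    decoded, prev = [], None
    for label in best:
        if label != blank and label != prev:
            decoded.append(alphabet[label])
        prev = label
    return "".join(decoded)

# Toy example: blank = index 0, alphabet "_abc".
probs = np.array([
    [0.1, 0.8, 0.05, 0.05],   # 'a'
    [0.1, 0.8, 0.05, 0.05],   # 'a' (repeat, collapsed)
    [0.9, 0.05, 0.03, 0.02],  # blank
    [0.1, 0.1, 0.1, 0.7],     # 'c'
])
print(ctc_best_path_decode(probs, "_abc"))  # -> "ac"
```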
no code implementations • 30 Dec 2015 • Kyuyeon Hwang, Minjae Lee, Wonyong Sung
In this paper, we propose a context-aware keyword spotting model employing a character-level recurrent neural network (RNN) for spoken term detection in continuous speech.
1 code implementation • 29 Dec 2015 • Sajid Anwar, Kyuyeon Hwang, Wonyong Sung
To decide the importance of network connections and paths, the proposed method uses a particle filtering approach.
no code implementations • 4 Dec 2015 • Sungho Shin, Kyuyeon Hwang, Wonyong Sung
Recurrent neural networks have shown excellent performance in many applications; however, they require increased complexity in hardware- or software-based implementations.
no code implementations • 21 Nov 2015 • Kyuyeon Hwang, Wonyong Sung
Our online model achieves a 20.7% phoneme error rate (PER) on the very long input sequence that is generated by concatenating all 192 utterances in the TIMIT core test set.
no code implementations • 20 Nov 2015 • Wonyong Sung, Sungho Shin, Kyuyeon Hwang
In this work, the effects of retraining are analyzed for a feedforward deep neural network (FFDNN) and a convolutional neural network (CNN).
no code implementations • 10 Mar 2015 • Kyuyeon Hwang, Wonyong Sung
Recurrent neural networks (RNNs) have shown outstanding performance on processing sequence data.