Search Results for author: Jaehyung Seo

Found 18 papers, 0 papers with code

A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation

no code implementations • Findings (NAACL) 2022 • Jaehyung Seo, Seounghoon Lee, Chanjun Park, Yoonna Jang, Hyeonseok Moon, Sugyeong Eo, Seonmin Koo, Heuiseok Lim

However, Korean pretrained language models still struggle to generate a short sentence with a given condition based on compositionality and commonsense reasoning (i.e., generative commonsense reasoning).

Language Modelling, Natural Language Understanding, +2 more

Priming Ancient Korean Neural Machine Translation

no code implementations • LREC 2022 • Chanjun Park, Seolhwa Lee, Jaehyung Seo, Hyeonseok Moon, Sugyeong Eo, Heuiseok Lim

In recent years, there has been an increasing need for the restoration and translation of historical languages.

Machine Translation, NMT, +1 more

Empirical Analysis of Noising Scheme based Synthetic Data Generation for Automatic Post-editing

no code implementations • LREC 2022 • Hyeonseok Moon, Chanjun Park, Seolhwa Lee, Jaehyung Seo, Jungseob Lee, Sugyeong Eo, Heuiseok Lim

This study has several limitations, considering the data acquisition, because there is no official dataset for most language pairs.

Automatic Post-Editing, Synthetic Data Generation, +1 more

BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text

no code implementations • ACL (WAT) 2021 • Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Hyeonseok Moon, Sugyeong Eo, Heuiseok Lim

Automatic speech recognition (ASR) is arguably the most critical component of such systems, as errors in speech recognition propagate to the downstream components and drastically degrade the user experience.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2 more

Dealing with the Paradox of Quality Estimation

no code implementations • MTSummit 2021 • Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim

In quality estimation (QE), the quality of translation can be predicted by referencing the source sentence and the machine translation (MT) output without access to the reference sentence.

Machine Translation, Sentence, +1 more

Focus on FoCus: Is FoCus focused on Context, Knowledge and Persona?

no code implementations • CCGPK (COLING) 2022 • SeungYoon Lee, Jungseob Lee, Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Jaehyung Seo, Jeongbae Park, Heuiseok Lim

Our experiments show that the FoCus model could not correctly blend the knowledge according to the input dialogue and that the dataset design is unsuitable for multi-turn conversation.

Dialogue Generation, Question Answering

Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline

no code implementations • 26 Jan 2024 • Seonmin Koo, Chanjun Park, Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

To effectively address this, it is imperative to consider both the speech-level, crucial for recognition accuracy, and the text-level, critical for user-friendliness.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1 more

Synthetic Alone: Exploring the Dark Side of Synthetic Data for Grammatical Error Correction

no code implementations • 26 Jun 2023 • Chanjun Park, Seonmin Koo, Seolhwa Lee, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

The data-centric AI approach aims to enhance model performance without modifying the model and has been shown to impact model performance positively.

Grammatical Error Correction

Knowledge Graph-Augmented Korean Generative Commonsense Reasoning

no code implementations • 26 Jun 2023 • Dahyun Jung, Jaehyung Seo, Jaewook Lee, Chanjun Park, Heuiseok Lim

Generative commonsense reasoning refers to the task of generating acceptable and logical assumptions about everyday situations based on commonsense understanding.

Text Generation

Self-Improving-Leaderboard(SIL): A Call for Real-World Centric Natural Language Processing Leaderboards

no code implementations • 20 Mar 2023 • Chanjun Park, Hyeonseok Moon, Seolhwa Lee, Jaehyung Seo, Sugyeong Eo, Heuiseok Lim

Leaderboard systems allow researchers to objectively evaluate Natural Language Processing (NLP) models and are typically used to identify models that exhibit superior performance on a given task in a predetermined setting.

QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation

no code implementations • COLING 2022 • Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Gyeongmin Kim, Jungseob Lee, Heuiseok Lim

With the recent advance in neural machine translation demonstrating its importance, research on quality estimation (QE) has been steadily progressing.

Machine Translation, Translation

Language Chameleon: Transformation analysis between languages using Cross-lingual Post-training based on Pre-trained language models

no code implementations • 14 Sep 2022 • Suhyune Son, Chanjun Park, Jungseob Lee, Midan Shim, Chanhee Lee, Yoonna Jang, Jaehyung Seo, Heuiseok Lim

This can be attributed to the fact that the amount of available training data in each language follows the power-law distribution, and most of the languages belong to the long tail of the distribution.

Cross-Lingual Transfer, Transfer Learning

A Self-Supervised Automatic Post-Editing Data Generation Tool

no code implementations • 24 Nov 2021 • Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Jaehyung Seo, Seungjun Lee, Heuiseok Lim

Data building for automatic post-editing (APE) requires extensive and expert-level human effort, as it contains an elaborate process that involves identifying errors in sentences and providing suitable revisions.

Automatic Post-Editing

A New Tool for Efficiently Generating Quality Estimation Datasets

no code implementations • 1 Nov 2021 • Sugyeong Eo, Chanjun Park, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim

Building data for quality estimation (QE) training is expensive and requires significant human labor.

Data Augmentation

Automatic Knowledge Augmentation for Generative Commonsense Reasoning

no code implementations • 30 Oct 2021 • Jaehyung Seo, Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

Generative commonsense reasoning is the capability of a language model to generate a sentence with a given concept-set that is based on commonsense knowledge.

Language Modelling, Sentence

How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus

no code implementations • 30 Oct 2021 • Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim

This paper proposes a tool for efficiently constructing high-quality parallel corpora while minimizing human labor, and makes this tool publicly available.

Machine Translation, NMT, +1 more

Empirical Analysis of Korean Public AI Hub Parallel Corpora and in-depth Analysis using LIWC

no code implementations • 28 Oct 2021 • Chanjun Park, Midan Shim, Sugyeong Eo, Seolhwa Lee, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim

To the best of our knowledge, this study is the first to use LIWC to analyze parallel corpora in the field of NMT.

Machine Translation, NMT, +1 more

PicTalky: Augmentative and Alternative Communication Software for Language Developmental Disabilities

no code implementations • 27 Sep 2021 • Chanjun Park, Yoonna Jang, Seolhwa Lee, Jaehyung Seo, Kisu Yang, Heuiseok Lim

In this study, we propose PicTalky, which is an AI-based AAC system that helps children with language developmental disabilities to improve their communication skills and language comprehension abilities.
