no code implementations • 25 Mar 2024 • Dongjun Jang, Sungjoo Byun, Hyopil Shin
This study examines whether the attention scores between tokens in the BERT model significantly vary based on lexical categories during the fine-tuning process for downstream tasks.
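The "attention scores between tokens" studied here are the row-wise softmax weights BERT computes from query/key projections. A minimal sketch of that computation on hypothetical toy inputs (NumPy, no model weights; the shapes and values are illustrative only):

```python
import numpy as np

def attention_scores(Q, K):
    """Scaled dot-product attention weights, as used inside BERT:
    softmax over Q @ K^T / sqrt(d_k). Toy inputs, not real model states."""
    d_k = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax: each row is one token's attention distribution
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 hypothetical tokens, head dimension 8
K = rng.normal(size=(4, 8))
A = attention_scores(Q, K)
print(A.shape)        # (4, 4): token-to-token attention matrix
print(A.sum(axis=1))  # each row sums to 1
```

Comparing such matrices before and after fine-tuning, grouped by the lexical category of each token, is the kind of analysis the abstract describes.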
no code implementations • 25 Mar 2024 • Dongjun Jang, Sungjoo Byun, Hyemi Jo, Hyopil Shin
Based on its quality and empirical results, this paper proposes that KIT-19 has the potential to make a substantial contribution to future improvements in Korean LLMs' performance.
no code implementations • 24 Mar 2024 • Sungjoo Byun, Jiseung Hong, Sumin Park, Dongjun Jang, Jean Seo, Minseok Kim, Chaeyoung Oh, Hyopil Shin
Named Entity Recognition (NER) plays a pivotal role in medical Natural Language Processing (NLP).
no code implementations • 23 Feb 2024 • Dongjun Jang, Jean Seo, Sungjoo Byun, Taekyoung Kim, Minseok Kim, Hyopil Shin
In order to tackle these challenges, we introduce CARBD-Ko (a Contextually Annotated Review Benchmark Dataset for Aspect-Based Sentiment Classification in Korean), a benchmark dataset that incorporates aspects and dual-tagged polarities to distinguish between aspect-specific and aspect-agnostic sentiment classification.
no code implementations • 30 Nov 2023 • Sungjoo Byun, Dongjun Jang, Hyemi Jo, Hyopil Shin
Caution: this paper may include material that could be offensive or distressing.
no code implementations • 29 Nov 2023 • Jean Seo, Sungjoo Byun, Minha Kang, Sangah Lee
The Manchu language, with its roots in the historical Manchurian region of Northeast China, is now facing a critical threat of extinction, as there are very few speakers left.
no code implementations • 23 Nov 2023 • Dongjun Jang, Sangah Lee, Sungjoo Byun, Jinwoong Kim, Jean Seo, Minseok Kim, Soyeon Kim, Chaeyoung Oh, Jaeyoon Kim, Hyemi Jo, Hyopil Shin
This paper presents the DaG LLM (David and Goliath Large Language Model), a language model specialized for Korean and fine-tuned through Instruction Tuning across 41 tasks within 13 distinct categories.