Search Results for author: Seungone Kim

Found 15 papers, 12 papers with code

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

1 code implementation • 2 May 2024 • Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo

Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs.

Language Modelling

467

Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

1 code implementation • 16 Apr 2024 • Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo

Training on large amounts of rationales (i.e., CoT Fine-tuning) is effective at improving the reasoning capabilities of large language models (LLMs).

GSM8K Math

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models

no code implementations • 3 Apr 2024 • Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, SeongHwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu, Jinyoung Yeo

Also, we show that compared to natural language, pseudocode can better guide the reasoning of LMs, even though they are trained to follow natural language instructions.

Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?

1 code implementation • 18 Feb 2024 • Guijin Son, Sangwon Baek, Sangdae Nam, Ilgyun Jeong, Seungone Kim

Large language models (LLMs) are typically prompted to follow a single instruction per inference call.

KMMLU: Measuring Massive Multitask Language Understanding in Korean

no code implementations • 18 Feb 2024 • Guijin Son, Hanwool Lee, Sungdong Kim, Seungone Kim, Niklas Muennighoff, Taekyoon Choi, Cheonbok Park, Kang Min Yoo, Stella Biderman

We propose KMMLU, a new Korean benchmark with 35,030 expert-level multiple-choice questions across 45 subjects ranging from humanities to STEM.

Language Modelling Multiple-choice

LangBridge: Multilingual Reasoning Without Multilingual Supervision

no code implementations • 19 Jan 2024 • Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo

We introduce LangBridge, a zero-shot approach to adapt language models for multilingual reasoning tasks without multilingual supervision.

Logical Reasoning Mathematical Reasoning

Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation

1 code implementation • 12 Jan 2024 • Seongyun Lee, Seungone Kim, Sue Hyun Park, Geewook Kim, Minjoon Seo

Assessing long-form responses generated by Vision-Language Models (VLMs) is challenging.

Language Modelling

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

1 code implementation • 17 Oct 2023 • Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu

In this work, we study the Reinforcement Learning from Personalized Human Feedback (RLPHF) problem, wherein LLMs are aligned to multiple (sometimes conflicting) preferences by modeling alignment as a Multi-Objective Reinforcement Learning (MORL) problem.

Language Modelling Large Language Model +2

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

2 code implementations • 12 Oct 2023 • Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo

We first construct the Feedback Collection, a new dataset that consists of 1K fine-grained score rubrics, 20K instructions, and 100K responses and language feedback generated by GPT-4.

Language Modelling Large Language Model

4,323

FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets

1 code implementation • 20 Jul 2023 • Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo

Evaluation of Large Language Models (LLMs) is challenging because instruction-following necessitates alignment with human values and the required set of skills varies depending on the instruction.

Instruction Following Language Modelling

191

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

2 code implementations • 23 May 2023 • Seungone Kim, Se June Joo, Doyoung Kim, Joel Jang, Seonghyeon Ye, Jamin Shin, Minjoon Seo

Furthermore, we show that instruction tuning with CoT Collection allows LMs to possess stronger few-shot learning capabilities on 4 domain-specific tasks, resulting in an improvement of +2.24% (Flan-T5 3B) and +2.37% (Flan-T5 11B), even outperforming ChatGPT utilizing demonstrations until the max length by a +13.98% margin.

Ranked #1 on BIG-bench (SNARKS)

Common Sense Reasoning Common Sense Reasoning (Zero-Shot) +7

191

CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification

1 code implementation • 7 Mar 2023 • Seungone Kim, Se June Joo, Yul Jang, Hyungjoo Chae, Jinyoung Yeo

To improve the correctness of the explanations, fine-tuning language models with explanation data is needed.

Exploring the Benefits of Training Expert Language Models over Instruction Tuning

2 code implementations • 7 Feb 2023 • Joel Jang, Seungone Kim, Seonghyeon Ye, Doyoung Kim, Lajanugen Logeswaran, Moontae Lee, Kyungjae Lee, Minjoon Seo

Recently, Language Models (LMs) instruction-tuned on multiple tasks, also known as multitask-prompted fine-tuning (MT), have shown the capability to generalize to unseen tasks.

Ranked #9 on Question Answering on StoryCloze

Common Sense Reasoning Coreference Resolution +4

Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization

1 code implementation • COLING 2022 • Seungone Kim, Se June Joo, Hyungjoo Chae, Chaehyeong Kim, Seung-won Hwang, Jinyoung Yeo

In this paper, we propose to leverage the unique characteristics of dialogues sharing commonsense knowledge across participants, to resolve the difficulties in summarizing them.

Ranked #2 on Text Summarization on DialogSum

Abstractive Dialogue Summarization Multi-Task Learning +1

Can Language Models perform Abductive Commonsense Reasoning?

1 code implementation • 7 Jul 2022 • Seungone Kim

Abductive Reasoning is a task of inferring the most plausible hypothesis given a set of observations.
