Search Results for author: Kyungjae Lee

Found 32 papers, 9 papers with code

Reinforcement Learning from Reflective Feedback (RLRF): Aligning and Improving LLMs via Fine-Grained Self-Reflection

no code implementations21 Mar 2024 Kyungjae Lee, Dasol Hwang, Sunghyun Park, Youngsoo Jang, Moontae Lee

Despite the promise of RLHF in aligning LLMs with human preferences, it often leads to superficial alignment, prioritizing stylistic changes over improving downstream performance of LLMs.

Mathematical Reasoning

Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion

no code implementations NeurIPS 2023 Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee

Distributional reinforcement learning algorithms have attempted to utilize estimated uncertainty for exploration, such as optimism in the face of uncertainty.

Distributional Reinforcement Learning reinforcement-learning

PreWoMe: Exploiting Presuppositions as Working Memory for Long Form Question Answering

no code implementations24 Oct 2023 Wookje Han, Jinsol Park, Kyungjae Lee

Information-seeking questions in long-form question answering (LFQA) often prove misleading due to ambiguity or false presupposition in the question.

Long Form Question Answering

SPOTS: Stable Placement of Objects with Reasoning in Semi-Autonomous Teleoperation Systems

no code implementations25 Sep 2023 Joonhyung Lee, Sangbeom Park, Jeongeun Park, Kyungjae Lee, Sungjoon Choi

In particular, we focus on two aspects of the placement task: stability robustness and contextual reasonableness of object placements.

On Monotonic Aggregation for Open-domain QA

1 code implementation8 Aug 2023 Sang-eun Han, Yeonseok Jeong, Seung-won Hwang, Kyungjae Lee

Our experiments show that our framework not only ensures monotonicity, but also outperforms state-of-the-art multi-source QA methods on Natural Questions.

Language Modelling Natural Questions +4

When to Read Documents or QA History: On Unified and Selective Open-domain QA

no code implementations7 Jun 2023 Kyungjae Lee, Sang-eun Han, Seung-won Hwang, Moontae Lee

This paper studies the problem of open-domain question answering, with the aim of answering a diverse range of questions by leveraging knowledge resources.

Natural Questions Open-Domain Question Answering +2

Evidentiality-aware Retrieval for Overcoming Abstractiveness in Open-Domain Question Answering

no code implementations6 Apr 2023 Yongho Song, Dahyun Lee, Myungha Jang, Seung-won Hwang, Kyungjae Lee, Dongha Lee, Jinyeong Yeo

The long-standing goal of dense retrievers in abstractive open-domain question answering (ODQA) tasks is to learn to capture evidence passages among relevant passages for any given query, such that the reader produces factually correct outputs from evidence passages.

Contrastive Learning counterfactual +4

Exploring the Benefits of Training Expert Language Models over Instruction Tuning

2 code implementations7 Feb 2023 Joel Jang, Seungone Kim, Seonghyeon Ye, Doyoung Kim, Lajanugen Logeswaran, Moontae Lee, Kyungjae Lee, Minjoon Seo

Recently, Language Models (LMs) instruction-tuned on multiple tasks, also known as multitask-prompted fine-tuning (MT), have shown the capability to generalize to unseen tasks.

Common Sense Reasoning Coreference Resolution +4

Trust Region-Based Safe Distributional Reinforcement Learning for Multiple Constraints

1 code implementation NeurIPS 2023 Dohyeong Kim, Kyungjae Lee, Songhwai Oh

In safety-critical robotic tasks, potential failures must be reduced, and multiple constraints must be met, such as avoiding collisions, limiting energy consumption, and maintaining balance.

Distributional Reinforcement Learning reinforcement-learning +2

Look Around for Anomalies: Weakly-Supervised Anomaly Detection via Context-Motion Relational Learning

no code implementations CVPR 2023 MyeongAh Cho, Minjung Kim, Sangwon Hwang, Chaewon Park, Kyungjae Lee, Sangyoun Lee

Furthermore, as the relationship between context and motion is important for identifying anomalies in complex and diverse scenes, we propose a Context-Motion Interrelation Module (CoMo), which models the relationship between the appearance of the surroundings and motion, rather than utilizing only temporal dependencies or motion information.

Relational Reasoning Supervised Anomaly Detection +2

Plug-and-Play Adaptation for Continuously-updated QA

no code implementations Findings (ACL) 2022 Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee

To this end, we first propose a novel task, Continuously-updated QA (CuQA), in which multiple large-scale updates are made to LMs, and the performance is measured with respect to the success in adding and updating knowledge while retaining existing knowledge.

Domain Generalization by Mutual-Information Regularization with Pre-trained Models

1 code implementation21 Mar 2022 Junbum Cha, Kyungjae Lee, Sungrae Park, Sanghyuk Chun

Domain generalization (DG) aims to learn a model that generalizes to an unseen target domain using only limited source domains.

Domain Generalization

Neural Markov Controlled SDE: Stochastic Optimization for Continuous-Time Data

no code implementations ICLR 2022 Sung Woo Park, Kyungjae Lee, Junseok Kwon

We propose a novel probabilistic framework for modeling stochastic dynamics with the rigorous use of stochastic optimal control theory.

Stochastic Optimization Time Series +1

Semi-Autonomous Teleoperation via Learning Non-Prehensile Manipulation Skills

no code implementations27 Sep 2021 Sangbeom Park, Yoonbyung Chai, Sunghyun Park, Jeongeun Park, Kyungjae Lee, Sungjoon Choi

In this paper, we present a semi-autonomous teleoperation framework for a pick-and-place task using an RGB-D sensor.

Query Generation for Multimodal Documents

no code implementations EACL 2021 Kyungho Kim, Kyungjae Lee, Seung-won Hwang, Young-In Song, SeungWook Lee

This paper studies the problem of generating likely queries for multimodal documents with images.

Retrieval

Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

no code implementations NeurIPS 2020 Kyungjae Lee, Hongjun Yang, Sungbin Lim, Songhwai Oh

In simulation, the proposed estimator shows favorable performance compared to existing robust estimators for various $p$ values and, for MAB problems, the proposed perturbation strategy outperforms existing exploration methods.

Multi-Armed Bandits
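
The entry above concerns bandit rewards with only finite p-th moments, where the plain empirical mean becomes unreliable. As a hedged illustration (not the paper's proposed estimator or perturbation strategy), the sketch below contrasts the empirical mean with a classic truncated-mean estimator on heavy-tailed Pareto rewards; the Pareto shape and truncation threshold are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pareto rewards with shape 1.5: the mean exists (= 3 for scale 1) but the
# variance is infinite, a standard stand-in for heavy-tailed bandit rewards.
shape, n, trials = 1.5, 2000, 500
true_mean = shape / (shape - 1.0)

def truncated_mean(x, threshold):
    # Zero out observations above a fixed threshold before averaging;
    # this trades a small bias for a much smaller variance under heavy tails.
    return np.where(np.abs(x) <= threshold, x, 0.0).mean()

emp, trunc = [], []
for _ in range(trials):
    x = rng.pareto(shape, size=n) + 1.0   # classical Pareto with scale 1
    emp.append(x.mean())
    trunc.append(truncated_mean(x, threshold=100.0))

print(f"true mean      : {true_mean:.3f}")
print(f"empirical mean : {np.mean(emp):.3f} +/- {np.std(emp):.3f}")
print(f"truncated mean : {np.mean(trunc):.3f} +/- {np.std(trunc):.3f}")
```

Across trials, the truncated mean concentrates much more tightly around the true mean, which is the basic reason robust estimators are preferred in the heavy-tailed regime.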

Relational Deep Feature Learning for Heterogeneous Face Recognition

no code implementations2 Mar 2020 MyeongAh Cho, Taeoh Kim, Ig-Jae Kim, Kyungjae Lee, Sangyoun Lee

Due to the lack of databases, HFR methods usually exploit features pre-trained on a large-scale visual database that contains general facial information.

Face Recognition Heterogeneous Face Recognition

Categorical Metadata Representation for Customized Text Classification

2 code implementations TACL 2019 Jihyeok Kim, Reinald Kim Amplayo, Kyungjae Lee, Sua Sung, Minji Seo, Seung-won Hwang

The performance of text classification has improved tremendously using intelligently engineered neural-based models, especially those injecting categorical metadata as additional information, e.g., using user/product information for sentiment classification.

Ranked #4 on Sentiment Analysis on User and product information (Yelp 2013 (Acc) metric)

General Classification Sentence +5

Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning

no code implementations31 Jan 2019 Kyungjae Lee, Sungyub Kim, Sungbin Lim, Sungjoon Choi, Songhwai Oh

By controlling the entropic index, we can generate various types of entropy, including the Shannon-Gibbs (SG) entropy, and a different entropy results in a different class of optimal policies in Tsallis MDPs.

reinforcement-learning Reinforcement Learning (RL)
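
For readers unfamiliar with the entropic index mentioned above: the Tsallis entropy of a distribution p is S_q(p) = (1 - sum_i p_i^q) / (q - 1), which recovers the Shannon-Gibbs entropy as q approaches 1 and, at q = 2, the sparse Tsallis entropy associated with sparsemax policies. The snippet below is a minimal numerical check of that limit, not code from the paper.

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1), for q != 1."""
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def shannon_entropy(p):
    """Shannon-Gibbs entropy in nats (the q -> 1 limit of S_q)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

policy = np.array([0.5, 0.3, 0.15, 0.05])
for q in (2.0, 1.5, 1.1, 1.01, 1.001):
    print(f"q = {q:<6} S_q = {tsallis_entropy(policy, q):.4f}")
print(f"Shannon (q -> 1 limit): {shannon_entropy(policy):.4f}")
```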

ChoiceNet: Robust Learning by Revealing Output Correlations

no code implementations27 Sep 2018 Sungjoon Choi, Sanghoon Hong, Kyungjae Lee, Sungbin Lim

To this end, we present a novel framework referred to here as ChoiceNet that can robustly infer the target distribution in the presence of inconsistent data.

regression

Maximum Causal Tsallis Entropy Imitation Learning

no code implementations NeurIPS 2018 Kyungjae Lee, Sungjoon Choi, Songhwai Oh

Third, we propose a maximum causal Tsallis entropy imitation learning (MCTEIL) algorithm with a sparse mixture density network (sparse MDN) by modeling mixture weights using a sparsemax distribution.

Imitation Learning
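
The sparsemax distribution referenced above (Martins and Astudillo, 2016) is the Euclidean projection of a score vector onto the probability simplex; unlike softmax, it can assign exactly zero weight to some mixture components, which is what makes the sparse MDN mixture weights sparse. Below is a hedged numpy sketch of that projection only, not the authors' MCTEIL implementation.

```python
import numpy as np

def sparsemax(z):
    """Project scores z onto the probability simplex (Martins & Astudillo, 2016).

    Returns p minimizing ||p - z||^2 subject to p >= 0 and sum(p) = 1;
    some entries of p can be exactly zero, unlike softmax.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]               # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1.0 + k * z_sorted > cumsum     # components kept in the support
    k_z = k[support][-1]                      # support size
    tau = (cumsum[support][-1] - 1.0) / k_z   # threshold subtracted from scores
    return np.maximum(z - tau, 0.0)

scores = np.array([2.0, 1.2, 0.1, -1.0])
p = sparsemax(scores)
print("sparsemax :", p)                                   # [0.9, 0.1, 0.0, 0.0]
print("softmax   :", np.exp(scores) / np.exp(scores).sum())
print("sum       :", p.sum())                             # 1.0
```

In a mixture density network, passing the mixture-weight logits through such a projection lets the policy drop unneeded components entirely rather than keeping them with tiny softmax weights.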

Task Agnostic Robust Learning on Corrupt Outputs by Correlation-Guided Mixture Density Networks

1 code implementation CVPR 2020 Sungjoon Choi, Sanghoon Hong, Kyungjae Lee, Sungbin Lim

In this paper, we focus on weakly supervised learning with noisy training data for both classification and regression problems. We assume that the training outputs are collected from a mixture of a target and correlated noise distributions. Our proposed method simultaneously estimates the target distribution and the quality of each data point, which is defined as the correlation between the target and data-generating distributions. The cornerstone of the proposed method is a Cholesky Block that enables modeling dependencies among mixture distributions in a differentiable manner while maintaining the distribution over the network weights. We first provide illustrative examples in both regression and classification tasks to show the effectiveness of the proposed method. Then, the proposed method is extensively evaluated in a number of experiments, where it consistently shows comparable or superior performance to existing baseline methods in handling noisy data.

Autonomous Driving General Classification +2

Uncertainty-Aware Learning from Demonstration using Mixture Density Networks with Sampling-Free Variance Modeling

1 code implementation3 Sep 2017 Sungjoon Choi, Kyungjae Lee, Sungbin Lim, Songhwai Oh

The proposed uncertainty-aware learning from demonstration method outperforms other compared methods in terms of safety using a complex real-world driving dataset.

Autonomous Driving

Density Matching Reward Learning

no code implementations12 Aug 2016 Sungjoon Choi, Kyungjae Lee, Andy Park, Songhwai Oh

The performance of KDMRL is extensively evaluated in two sets of experiments: grid world and track driving experiments.

Autonomous Navigation
