Search Results for author: Seungyeon Kim

Found 17 papers, 0 papers with code

Teacher Guided Training: An Efficient Framework for Knowledge Transfer

no code implementations14 Aug 2022 Manzil Zaheer, Ankit Singh Rawat, Seungyeon Kim, Chong You, Himanshu Jain, Andreas Veit, Rob Fergus, Sanjiv Kumar

In this paper, we propose the teacher-guided training (TGT) framework for training a high-quality compact model that leverages the knowledge acquired by pretrained generative models, while obviating the need to go through a large volume of data.

Generalization Bounds Image Classification +4
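
As a rough illustration of the idea in the snippet above, the sketch below shows a generic teacher-to-student transfer loop in which a fixed pretrained teacher provides soft labels for sampled inputs, so the compact student never needs ground-truth labels. The models and data are toy placeholders, not the paper's actual TGT procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a large pretrained teacher: a fixed model that produces soft labels.
W_teacher = rng.standard_normal((8, 3))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def teacher_predict(x):
    return softmax(x @ W_teacher)

# Compact student trained only on teacher-labeled samples, with no ground-truth labels.
W_student = np.zeros((8, 3))
lr = 0.5
for step in range(500):
    x = rng.standard_normal((64, 8))                 # inputs sampled near the data of interest
    p_teacher = teacher_predict(x)                   # soft labels from the teacher
    p_student = softmax(x @ W_student)
    grad = x.T @ (p_student - p_teacher) / len(x)    # cross-entropy gradient w.r.t. W_student
    W_student -= lr * grad
```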

In defense of dual-encoders for neural ranking

no code implementations29 Sep 2021 Aditya Krishna Menon, Sadeep Jayasumana, Seungyeon Kim, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

Transformer-based models such as BERT have proven successful in information retrieval problems, which seek to identify relevant documents for a given query.

Information Retrieval Natural Questions +1
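
For context, the sketch below shows the standard dual-encoder setup referred to above: query and document are embedded independently and scored by an inner product. The hashed bag-of-words encoder is a toy stand-in for BERT, not the paper's models.

```python
import numpy as np

DIM = 128

def encode(text, dim=DIM):
    # Toy stand-in for a learned encoder: hashed bag-of-words, L2-normalised.
    v = np.zeros(dim)
    for token in text.lower().split():
        v[hash(token) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def score(query, document):
    # Dual-encoder scoring: inner product of independently encoded texts.
    return float(encode(query) @ encode(document))

docs = ["the cat sat on the mat",
        "neural ranking with dual encoders",
        "weather forecast for tomorrow"]
query = "dual encoder models for ranking"
ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
print(ranked)   # documents ordered from most to least relevant under this toy scorer
```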

A Statistical Manifold Framework for Point Cloud Data

no code implementations29 Sep 2021 Yonghyeon LEE, Seungyeon Kim, Jinwon Choi, Frank C. Park

The only requirement on the part of the user is the choice of a meaningful underlying probability distribution, which is more intuitive and natural to make than what is required in existing ad hoc formulations.

Balancing Robustness and Sensitivity using Feature Contrastive Learning

no code implementations19 May 2021 Seungyeon Kim, Daniel Glasner, Srikumar Ramalingam, Cho-Jui Hsieh, Kishore Papineni, Sanjiv Kumar

It is generally believed that robust training of extremely large networks is critical to their success in real-world applications.

Contrastive Learning

RankDistil: Knowledge Distillation for Ranking

no code implementations AISTATS 2021 Sashank J. Reddi, Rama Kumar Pasumarthi, Aditya Krishna Menon, Ankit Singh Rawat, Felix Yu, Seungyeon Kim, Andreas Veit, Sanjiv Kumar

Knowledge distillation is an approach to improve the performance of a student model by using the knowledge of a complex teacher. Despite its success in several deep learning applications, the study of distillation is mostly confined to classification settings.

Document Ranking Knowledge Distillation
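
A generic listwise distillation loss for ranking is sketched below: the student's softmax over candidate scores is matched to the teacher's with a KL divergence. This is only a common baseline formulation, not necessarily the top-k objectives proposed in the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def listwise_distillation_loss(student_scores, teacher_scores):
    # KL(teacher || student) over the softmax of per-document scores for one query.
    p_t = softmax(np.asarray(teacher_scores, dtype=float))
    p_s = softmax(np.asarray(student_scores, dtype=float))
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s))))

# Example: teacher and student scores for five candidate documents of one query.
print(listwise_distillation_loss([2.0, 1.0, 0.5, -1.0, 0.0],
                                 [3.0, 1.5, 0.2, -2.0, 0.1]))
```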

On the Reproducibility of Neural Network Predictions

no code implementations5 Feb 2021 Srinadh Bhojanapalli, Kimberly Wilber, Andreas Veit, Ankit Singh Rawat, Seungyeon Kim, Aditya Menon, Sanjiv Kumar

By analyzing the relationship between churn and prediction confidences, we pursue an approach with two components for churn reduction.

Data Augmentation Image Classification
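
Prediction churn in this context is the disagreement between independently trained copies of a model on the same test set. A minimal way to measure it is sketched below; the paper's two-component approach for reducing churn is not shown.

```python
import numpy as np

def churn(preds_a, preds_b):
    # Fraction of examples where two models (e.g. two training runs of the
    # same architecture) predict different labels.
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

# Example: predicted labels from two runs on six test examples.
print(churn([0, 1, 1, 2, 0, 1], [0, 1, 2, 2, 0, 0]))   # -> 0.333...
```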

Semantic Label Smoothing for Sequence to Sequence Problems

no code implementations EMNLP 2020 Michal Lukasik, Himanshu Jain, Aditya Krishna Menon, Seungyeon Kim, Srinadh Bhojanapalli, Felix Yu, Sanjiv Kumar

Label smoothing has been shown to be an effective regularization strategy in classification that prevents overfitting and helps with label de-noising.

Machine Translation Translation
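
For reference, standard uniform label smoothing looks like the sketch below; the paper's semantic variant instead places the smoothing mass on semantically related targets, which is not shown here.

```python
import numpy as np

def smoothed_targets(label, num_classes, epsilon=0.1):
    # Uniform label smoothing: (1 - eps) on the true class, eps spread over all classes.
    t = np.full(num_classes, epsilon / num_classes)
    t[label] += 1.0 - epsilon
    return t

def cross_entropy(probs, targets):
    return float(-np.sum(targets * np.log(probs)))

probs = np.array([0.7, 0.2, 0.05, 0.05])      # model's predicted distribution
print(cross_entropy(probs, smoothed_targets(0, 4)))
```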

Why distillation helps: a statistical perspective

no code implementations21 May 2020 Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Seungyeon Kim, Sanjiv Kumar

In this paper, we present a statistical perspective on distillation which addresses this question, and provides a novel connection to extreme multiclass retrieval techniques.

Knowledge Distillation Retrieval
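
One common reading of this perspective, in rough notation rather than the paper's exact formulation: training on teacher probabilities replaces the one-hot empirical risk with a distilled risk, which becomes a lower-variance estimate of the population risk when the teacher approximates the Bayes class-probabilities.

```latex
% One-hot empirical risk vs. distilled risk (illustrative notation)
\[
\hat{R}_{\text{one-hot}}(f) = \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x_i),\, e_{y_i}\bigr)
\qquad
\hat{R}_{\text{distill}}(f) = \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x_i),\, p^{t}(\cdot \mid x_i)\bigr)
\]
% The latter tends to have lower variance when the teacher distribution p^t is
% close to the Bayes class-probability function p^*(y \mid x).
```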

Why are Adaptive Methods Good for Attention Models?

no code implementations NeurIPS 2020 Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, Suvrit Sra

While stochastic gradient descent (SGD) is still the \emph{de facto} algorithm in deep learning, adaptive methods like Clipped SGD/Adam have been observed to outperform SGD across important tasks, such as attention models.
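
A minimal sketch of the Clipped SGD update mentioned above: the gradient norm is clipped to a threshold before a plain SGD step. The quadratic objective and hyperparameters are purely illustrative.

```python
import numpy as np

def clipped_sgd_step(w, grad, lr=0.1, clip=1.0):
    # Globally clip the gradient norm to `clip`, then take an SGD step.
    norm = np.linalg.norm(grad)
    if norm > clip:
        grad = grad * (clip / norm)
    return w - lr * grad

# Toy example: minimise 0.5 * ||w||^2, where the raw gradient equals w and is
# initially large enough to be clipped.
w = np.array([10.0, -5.0])
for _ in range(500):
    w = clipped_sgd_step(w, grad=w)
print(w)   # close to the optimum at the origin
```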

Why ADAM Beats SGD for Attention Models

no code implementations25 Sep 2019 Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J Reddi, Sanjiv Kumar, Suvrit Sra

While stochastic gradient descent (SGD) is still the de facto algorithm in deep learning, adaptive methods like Adam have been observed to outperform SGD across important tasks, such as attention models.

Automatic Feature Induction for Stagewise Collaborative Filtering

no code implementations NeurIPS 2012 Joonseok Lee, Mingxuan Sun, Seungyeon Kim, Guy Lebanon

Recent approaches to collaborative filtering have concentrated on estimating an algebraic or statistical model, and using the model for predicting missing ratings.

Collaborative Filtering
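
For background, a standard matrix-factorization baseline for predicting missing ratings is sketched below; this is the kind of algebraic/statistical model the sentence above refers to, not the stagewise feature-induction method proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed (user, item, rating) triples; all other entries are missing.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0), (2, 2, 4.0)]
n_users, n_items, k = 3, 3, 2

U = 0.1 * rng.standard_normal((n_users, k))   # user factors
V = 0.1 * rng.standard_normal((n_items, k))   # item factors
lr, reg = 0.05, 0.01

for epoch in range(200):
    for u, i, r in ratings:
        err = r - U[u] @ V[i]
        u_old = U[u].copy()
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * u_old - reg * V[i])

print(U[1] @ V[2])   # predicted rating for a missing (user, item) pair
```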

Beyond Sentiment: The Manifold of Human Emotions

no code implementations8 Feb 2012 Seungyeon Kim, Fuxin Li, Guy Lebanon, Irfan Essa

Sentiment analysis predicts the presence of positive or negative emotions in a text document.

Sentiment Analysis

Local Space-Time Smoothing for Version Controlled Documents

no code implementations6 Mar 2010 Seungyeon Kim, Guy Lebanon

Unlike static documents, version controlled documents are continuously edited by one or more authors.
