Search Results for author: Kihyuk Sohn

Found 61 papers, 26 papers with code

DreamFlow: High-Quality Text-to-3D Generation by Approximating Probability Flow

no code implementations • 22 Mar 2024 • Kyungmin Lee, Kihyuk Sohn, Jinwoo Shin

Recent progress in text-to-3D generation has been achieved through the utilization of score distillation methods: they make use of the pre-trained text-to-image (T2I) diffusion models by distilling via the diffusion model training objective.

3D Generation Image-to-Image Translation +1

Paper
Add Code

Direct Consistency Optimization for Compositional Text-to-Image Personalization

no code implementations • 19 Feb 2024 • Kyungmin Lee, Sangkyung Kwak, Kihyuk Sohn, Jinwoo Shin

In particular, our method results in a superior Pareto frontier to the baselines.

Paper
Add Code

Unsupervised LLM Adaptation for Question Answering

no code implementations • 16 Feb 2024 • Kuniaki Saito, Kihyuk Sohn, Chen-Yu Lee, Yoshitaka Ushiku

In this task, we leverage a pre-trained LLM, a publicly available QA dataset (source data), and unlabeled documents from the target domain.

Question Answering

Paper
Add Code

Instruct-Imagen: Image Generation with Multi-modal Instruction

no code implementations • 3 Jan 2024 • Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia

We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision.

Image Generation Retrieval

Paper
Add Code

VideoPoet: A Large Language Model for Zero-Shot Video Generation

no code implementations • 21 Dec 2023 • Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, Ming-Hsuan Yang, Irfan Essa, Huisheng Wang, David A. Ross, Bryan Seybold, Lu Jiang

We present VideoPoet, a language model capable of synthesizing high-quality video, with matching audio, from a large variety of conditioning signals.

Ranked #3 on Text-to-Video Generation on MSR-VTT

Language Modelling Large Language Model +2

Paper
Add Code

Photorealistic Video Generation with Diffusion Models

no code implementations • 11 Dec 2023 • Agrim Gupta, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, Li Fei-Fei, Irfan Essa, Lu Jiang, José Lezama

We present W. A. L. T, a transformer-based approach for photorealistic video generation via diffusion modeling.

Ranked #1 on Video Prediction on Kinetics-600 12 frames, 64x64

Text-to-Video Generation Video Generation +1

Paper
Add Code

Improve Supervised Representation Learning with Masked Image Modeling

no code implementations • 1 Dec 2023 • KaiFeng Chen, Daniel Salz, Huiwen Chang, Kihyuk Sohn, Dilip Krishnan, Mojtaba Seyedhosseini

On K-Nearest-Neighbor image retrieval evaluation with ImageNet-1k, the same model outperforms the baseline by 1. 32%.

Image Retrieval Representation Learning +2

Paper
Add Code

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

no code implementations • 9 Oct 2023 • Lijun Yu, José Lezama, Nitesh B. Gundavarapu, Luca Versari, Kihyuk Sohn, David Minnen, Yong Cheng, Vighnesh Birodkar, Agrim Gupta, Xiuye Gu, Alexander G. Hauptmann, Boqing Gong, Ming-Hsuan Yang, Irfan Essa, David A. Ross, Lu Jiang

While Large Language Models (LLMs) are the dominant models for generative tasks in language, they do not perform as well as diffusion models on image and video generation.

Ranked #2 on Video Prediction on Kinetics-600 12 frames, 64x64

Action Recognition Image Generation +4

Paper
Add Code

Label Budget Allocation in Multi-Task Learning

no code implementations • 24 Aug 2023 • Ximeng Sun, Kihyuk Sohn, Kate Saenko, Clayton Mellina, Xiao Bian

How should the label budget (i. e. the amount of money spent on labeling) be allocated among different tasks to achieve optimal multi-task performance?

Multi-Task Learning

Paper
Add Code

Collaborative Score Distillation for Consistent Visual Synthesis

no code implementations • 4 Jul 2023 • Subin Kim, Kyungmin Lee, June Suk Choi, Jongheon Jeong, Kihyuk Sohn, Jinwoo Shin

Generative priors of large-scale text-to-image diffusion models enable a wide range of new generation and editing applications on diverse visual modalities.

Paper
Add Code

Learning Disentangled Prompts for Compositional Image Synthesis

no code implementations • 1 Jun 2023 • Kihyuk Sohn, Albert Shaw, Yuan Hao, Han Zhang, Luisa Polania, Huiwen Chang, Lu Jiang, Irfan Essa

We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image to synthesize novel images, to better understand the compositional image synthesis.

Domain Adaptation Image Generation +1

Paper
Add Code

StyleDrop: Text-to-Image Generation in Any Style

3 code implementations • 1 Jun 2023 • Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan

Pre-trained large text-to-image models synthesize impressive images with an appropriate use of text prompts.

Text-to-Image Generation

548

Paper
Code

FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction

no code implementations • 4 May 2023 • Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolai Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister

In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss.

Contrastive Learning document understanding +1

Paper
Add Code

Video Probabilistic Diffusion Models in Projected Latent Space

1 code implementation • CVPR 2023 • Sihyun Yu, Kihyuk Sohn, Subin Kim, Jinwoo Shin

Specifically, PVDM is composed of two components: (a) an autoencoder that projects a given video as 2D-shaped latent vectors that factorize the complex cubic structure of video pixels and (b) a diffusion model architecture specialized for our new factorized latent space and the training/sampling procedure to synthesize videos of arbitrary length with a single model.

Video Generation

271

Paper
Code

MaskSketch: Unpaired Structure-guided Masked Image Generation

2 code implementations • CVPR 2023 • Dina Bashkirova, Jose Lezama, Kihyuk Sohn, Kate Saenko, Irfan Essa

We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image, such as scene layout and object shape, and we propose a novel sampling method based on this observation to enable structure-guided generation.

Conditional Image Generation Image-to-Image Translation +2

27,836

Paper
Code

Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval

1 code implementation • CVPR 2023 • Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister

Existing methods rely on supervised learning of CIR models using labeled triplets consisting of the query image, text specification, and the target image.

Ranked #1 on Zero-shot Image Retrieval on ImageNet-R

Attribute Retrieval +2

138

Paper
Code

MAGVIT: Masked Generative Video Transformer

1 code implementation • CVPR 2023 • Lijun Yu, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann, Ming-Hsuan Yang, Yuan Hao, Irfan Essa, Lu Jiang

We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various video synthesis tasks with a single model.

Ranked #1 on Video Prediction on Something-Something V2

Multi-Task Learning Text-to-Video Generation +2

846

Paper
Code

SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch

no code implementations • 30 Nov 2022 • Jinsung Yoon, Kihyuk Sohn, Chun-Liang Li, Sercan O. Arik, Tomas Pfister

Semi-supervised anomaly detection is a common problem, as often the datasets containing anomalies are partially labeled.

Semi-supervised Anomaly Detection Supervised Anomaly Detection

Paper
Add Code

Visual Prompt Tuning for Generative Transfer Learning

1 code implementation • CVPR 2023 • Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang

We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens to the autoregressive or non-autoregressive transformers.

Image Generation Transfer Learning +1

Paper
Code

Prefix Conditioning Unifies Language and Label Supervision

no code implementations • CVPR 2023 • Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister

In experiments, we show that this simple technique improves the performance in zero-shot image recognition accuracy and robustness to the image-level distribution shift.

Classification Contrastive Learning +2

Paper
Add Code

Federated Semi-Supervised Learning with Prototypical Networks

1 code implementation • 27 May 2022 • Woojung Kim, Keondo Park, Kihyuk Sohn, Raphael Shu, Hyung-Sin Kim

Compared to a FSSL approach based on weight sharing, the prototype-based inter-client knowledge sharing significantly reduces both communication and computation costs, enabling more frequent knowledge sharing between more clients for better accuracy.

Federated Learning

Paper
Code

Towards Group Robustness in the presence of Partial Group Labels

no code implementations • 10 Jan 2022 • Vishnu Suresh Lokhande, Kihyuk Sohn, Jinsung Yoon, Madeleine Udell, Chen-Yu Lee, Tomas Pfister

Such a requirement is impractical in situations where the data labeling efforts for minority or rare groups are significantly laborious or where the individuals comprising the dataset choose to conceal sensitive information.

Paper
Add Code

Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types

2 code implementations • 21 Dec 2021 • Kihyuk Sohn, Jinsung Yoon, Chun-Liang Li, Chen-Yu Lee, Tomas Pfister

We define a distance function between images, each of which is represented as a bag of embeddings, by the Euclidean distance between weighted averaged embeddings.

Anomaly Detection Clustering +2

Paper
Code

Invariant Learning with Partial Group Labels

no code implementations • 29 Sep 2021 • Vishnu Suresh Lokhande, Kihyuk Sohn, Jinsung Yoon, Madeleine Udell, Chen-Yu Lee, Tomas Pfister

Such a requirement is impractical in situations where the data labelling efforts for minority or rare groups is significantly laborious or where the individuals comprising the dataset choose to conceal sensitive information.

Paper
Add Code

Unifying Distribution Alignment as a Loss for Imbalanced Semi-supervised Learning

no code implementations • 29 Sep 2021 • Justin Lazarow, Kihyuk Sohn, Chun-Liang Li, Zizhao Zhang, Chen-Yu Lee, Tomas Pfister

While remarkable progress in imbalanced supervised learning has been made recently, less attention has been given to the setting of imbalanced semi-supervised learning (SSL) where not only is a few labeled data provided, but the underlying data distribution can be severely imbalanced.

Pseudo Label

Paper
Add Code

Object-aware Contrastive Learning for Debiased Scene Representation

1 code implementation • NeurIPS 2021 • Sangwoo Mo, Hyunwoo Kang, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin

Contrastive self-supervised learning has shown impressive results in learning visual representations from unlabeled images by enforcing invariance against different data augmentations.

Contrastive Learning Object +2

Paper
Code

Controlling Neural Networks with Rule Representations

1 code implementation • NeurIPS 2021 • Sungyong Seo, Sercan O. Arik, Jinsung Yoon, Xiang Zhang, Kihyuk Sohn, Tomas Pfister

The key aspect of DeepCTRL is that it does not require retraining to adapt the rule strength -- at inference, the user can adjust it based on the desired operation point on accuracy vs. rule verification ratio.

Decision Making

Paper
Code

Self-supervise, Refine, Repeat: Improving Unsupervised Anomaly Detection

no code implementations • 11 Jun 2021 • Jinsung Yoon, Kihyuk Sohn, Chun-Liang Li, Sercan O. Arik, Chen-Yu Lee, Tomas Pfister

We demonstrate our method on various unsupervised AD tasks with image and tabular data.

Classification One-Class Classification +3

Paper
Add Code

AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation

5 code implementations • ICLR 2022 • David Berthelot, Rebecca Roelofs, Kihyuk Sohn, Nicholas Carlini, Alex Kurakin

We extend semi-supervised learning to the problem of domain adaptation to learn significantly higher-accuracy models that train on one data distribution and test on a different one.

Semi-supervised Domain Adaptation Unsupervised Domain Adaptation

Paper
Code

CutPaste: Self-Supervised Learning for Anomaly Detection and Localization

2 code implementations • CVPR 2021 • Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, Tomas Pfister

We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data.

Ranked #55 on Anomaly Detection on MVTec AD

Data Augmentation Defect Detection +4

221

Paper
Code

CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

1 code implementation • CVPR 2021 • Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang

Semi-supervised learning on class-imbalanced data, although a realistic problem, has been under studied.

Paper
Code

Assessing Post-Disaster Damage from Satellite Imagery using Semi-Supervised Learning Techniques

no code implementations • 24 Nov 2020 • Jihyeon Lee, Joseph Z. Xu, Kihyuk Sohn, Wenhan Lu, David Berthelot, Izzeddin Gur, Pranav Khaitan, Ke-Wei, Huang, Kyriacos Koupparis, Bernhard Kowatsch

To respond to disasters such as earthquakes, wildfires, and armed conflicts, humanitarian organizations require accurate and timely data in the form of damage assessments, which indicate what buildings and population centers have been most affected.

BIG-bench Machine Learning Disaster Response +1

Paper
Add Code

Learning and Evaluating Representations for Deep One-class Classification

1 code implementation • ICLR 2021 • Kihyuk Sohn, Chun-Liang Li, Jinsung Yoon, Minho Jin, Tomas Pfister

We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.

Ranked #7 on Anomaly Detection on One-class CIFAR-100

Classification Contrastive Learning +6

150

Paper
Code

i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

3 code implementations • ICLR 2021 • Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, Honglak Lee

Contrastive representation learning has shown to be effective to learn representations from unlabeled data.

Contrastive Learning Representation Learning

Paper
Code

Improving Face Recognition by Clustering Unlabeled Faces in the Wild

no code implementations • ECCV 2020 • Aruni RoyChowdhury, Xiang Yu, Kihyuk Sohn, Erik Learned-Miller, Manmohan Chandraker

While deep face recognition has benefited significantly from large-scale labeled data, current research is focused on leveraging unlabeled data to further boost performance, reducing the cost of human annotation.

Clustering Face Clustering +3

Paper
Add Code

A Simple Semi-Supervised Learning Framework for Object Detection

7 code implementations • 10 May 2020 • Kihyuk Sohn, Zizhao Zhang, Chun-Liang Li, Han Zhang, Chen-Yu Lee, Tomas Pfister

Semi-supervised learning (SSL) has a potential to improve the predictive performance of machine learning models using unlabeled data.

Ranked #13 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Data Augmentation Image Classification +4

398

Paper
Code

ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring

1 code implementation • ICLR 2020 • David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel

We improve the recently-proposed ``MixMatch semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring.

129

Paper
Code

Towards Universal Representation Learning for Deep Face Recognition

no code implementations • CVPR 2020 • Yichun Shi, Xiang Yu, Kihyuk Sohn, Manmohan Chandraker, Anil K. Jain

Recognizing wild faces is extremely hard as they appear with all kinds of variations.

Face Recognition Representation Learning

Paper
Add Code

FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

27 code implementations • NeurIPS 2020 • Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel

Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance.

Ranked #3 on Semi-Supervised Image Classification on SVHN, 1000 labels

Pseudo Label Semi-Supervised Image Classification

1,057

Paper
Code

Adversarial Learning of Privacy-Preserving and Task-Oriented Representations

no code implementations • 22 Nov 2019 • Taihong Xiao, Yi-Hsuan Tsai, Kihyuk Sohn, Manmohan Chandraker, Ming-Hsuan Yang

For instance, there could be a potential privacy risk of machine learning systems via the model inversion attack, whose goal is to reconstruct the input data from the latent representation of deep networks.

Attribute BIG-bench Machine Learning +2

Paper
Add Code

ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring

3 code implementations • 21 Nov 2019 • David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel

Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels.

Ranked #1 on Semi-Supervised Image Classification on cifar10, 250 Labels

Semi-Supervised Image Classification

1,128

Paper
Code

Adaptation Across Extreme Variations using Unlabeled Domain Bridges

no code implementations • 5 Jun 2019 • Shuyang Dai, Kihyuk Sohn, Yi-Hsuan Tsai, Lawrence Carin, Manmohan Chandraker

We tackle an unsupervised domain adaptation problem for which the domain discrepancy between labeled source and unlabeled target domains is large, due to many factors of inter and intra-domain variation.

Object Recognition Semantic Segmentation +1

Paper
Add Code

Domain Adaptation for Structured Output via Disentangled Patch Representations

no code implementations • ICLR 2019 • Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker

To this end, we propose to learn discriminative feature representations of patches based on label histograms in the source domain, through the construction of a disentangled space.

Domain Adaptation Semantic Segmentation

Paper
Add Code

Unsupervised Domain Adaptation for Distance Metric Learning

no code implementations • ICLR 2019 • Kihyuk Sohn, Wenling Shang, Xiang Yu, Manmohan Chandraker

Unsupervised domain adaptation is a promising avenue to enhance the performance of deep neural networks on a target domain, using labels only from a source domain.

Face Recognition Metric Learning +1

Paper
Add Code

Active Adversarial Domain Adaptation

no code implementations • 16 Apr 2019 • Jong-Chyi Su, Yi-Hsuan Tsai, Kihyuk Sohn, Buyu Liu, Subhransu Maji, Manmohan Chandraker

Our approach, active adversarial domain adaptation (AADA), explores a duality between two related problems: adversarial domain alignment and importance sampling for adapting models across domains.

Active Learning Domain Adaptation +3

Paper
Add Code

Domain Adaptation for Structured Output via Discriminative Patch Representations

8 code implementations • ICCV 2019 • Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker

Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn supervised models like convolutional neural networks.

Ranked #22 on Image-to-Image Translation on SYNTHIA-to-Cityscapes

Domain Adaptation Segmentation +2

839

Paper
Code

Learning Gibbs-regularized GANs with variational discriminator reparameterization

no code implementations • 27 Sep 2018 • Nicholas Rhinehart, Anqi Liu, Kihyuk Sohn, Paul Vernaza

We propose a novel approach to regularizing generative adversarial networks (GANs) leveraging learned {\em structured Gibbs distributions}.

Trajectory Forecasting

Paper
Add Code

Feature Transfer Learning for Deep Face Recognition with Under-Represented Data

no code implementations • 23 Mar 2018 • Xi Yin, Xiang Yu, Kihyuk Sohn, Xiaoming Liu, Manmohan Chandraker

In this paper, we propose a center-based feature transfer framework to augment the feature space of under-represented subjects from the regular subjects that have sufficiently diverse samples.

Disentanglement Face Recognition +1

Paper
Add Code

Learning to Adapt Structured Output Space for Semantic Segmentation

12 code implementations • CVPR 2018 • Yi-Hsuan Tsai, Wei-Chih Hung, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang, Manmohan Chandraker

In this paper, we propose an adversarial learning method for domain adaptation in the context of semantic segmentation.

Ranked #3 on Domain Adaptation on Synscapes-to-Cityscapes

Domain Adaptation Segmentation +2

839

Paper
Code

Gotta Adapt 'Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild

1 code implementation • CVPR 2019 • Luan Tran, Kihyuk Sohn, Xiang Yu, Xiaoming Liu, Manmohan Chandraker

Recent developments in deep domain adaptation have allowed knowledge transfer from a labeled source domain to an unlabeled target domain at the level of intermediate features or input pixels.

Attribute Domain Adaptation +2

Paper
Code

Unsupervised Domain Adaptation for Face Recognition in Unlabeled Videos

no code implementations • ICCV 2017 • Kihyuk Sohn, Sifei Liu, Guangyu Zhong, Xiang Yu, Ming-Hsuan Yang, Manmohan Chandraker

Despite rapid advances in face recognition, there remains a clear gap between the performance of still image-based face recognition and video-based face recognition, due to the vast difference in visual quality between the domains and the difficulty of curating diverse large-scale video datasets.

Data Augmentation Face Recognition +1

Paper
Add Code

Channel-Recurrent Autoencoding for Image Modeling

no code implementations • 12 Jun 2017 • Wenling Shang, Kihyuk Sohn, Yuandong Tian

Despite recent successes in synthesizing faces and bedrooms, existing generative models struggle to capture more complex image types, potentially due to the oversimplification of their latent space constructions.

Paper
Add Code

Towards Large-Pose Face Frontalization in the Wild

no code implementations • ICCV 2017 • Xi Yin, Xiang Yu, Kihyuk Sohn, Xiaoming Liu, Manmohan Chandraker

Despite recent advances in face recognition using deep learning, severe accuracy drops are observed for large pose variations in unconstrained environments.

3D Reconstruction Face Recognition +1

Paper
Add Code

Reconstruction-Based Disentanglement for Pose-invariant Face Recognition

no code implementations • ICCV 2017 • Xi Peng, Xiang Yu, Kihyuk Sohn, Dimitris Metaxas, Manmohan Chandraker

Finally, we propose a new feature reconstruction metric learning to explicitly disentangle identity and pose, by demanding alignment between the feature reconstructions through various combinations of identity and pose features, which is obtained from two images of the same subject.

Disentanglement Face Recognition +2

Paper
Add Code

Improved Deep Metric Learning with Multi-class N-pair Loss Objective

1 code implementation • NeurIPS 2016 • Kihyuk Sohn

Deep metric learning has gained much popularity in recent years, following the success of deep learning.

Clustering Face Verification +3

5,254

Paper
Code

Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units

2 code implementations • 16 Mar 2016 • Wenling Shang, Kihyuk Sohn, Diogo Almeida, Honglak Lee

Recently, convolutional neural networks (CNNs) have been used as a powerful tool to solve many problems of machine learning and computer vision.

Paper
Code

Attribute2Image: Conditional Image Generation from Visual Attributes

1 code implementation • 2 Dec 2015 • Xinchen Yan, Jimei Yang, Kihyuk Sohn, Honglak Lee

This paper investigates a novel problem of generating images from visual attributes.

Attribute Conditional Image Generation +1

Paper
Code

Learning Structured Output Representation using Deep Conditional Generative Models

1 code implementation • NeurIPS 2015 • Kihyuk Sohn, Honglak Lee, Xinchen Yan

The model is trained efficiently in the framework of stochastic gradient variational Bayes, and allows a fast prediction using stochastic feed-forward inference.

Ranked #1 on Structured Prediction on MNIST

Semantic Segmentation Structured Prediction

Paper
Code

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

no code implementations • CVPR 2015 • Yuting Zhang, Kihyuk Sohn, Ruben Villegas, Gang Pan, Honglak Lee

Object detection systems based on the deep convolutional neural network (CNN) have recently made ground- breaking advances on several object detection benchmarks.

Bayesian Optimization Object +3

Paper
Add Code

Improved Multimodal Deep Learning with Variation of Information

no code implementations • NeurIPS 2014 • Kihyuk Sohn, Wenling Shang, Honglak Lee

Deep learning has been successfully applied to multimodal representation learning problems, with a common strategy to learning joint representations that are shared across multiple modalities on top of layers of modality-specific networks.

Multimodal Deep Learning Representation Learning

Paper
Add Code

Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling

no code implementations • CVPR 2013 • Andrew Kae, Kihyuk Sohn, Honglak Lee, Erik Learned-Miller

Although the CRF is a good baseline labeler, we show how an RBM can be added to the architecture to provide a global shape bias that complements the local modeling provided by the CRF.

Attribute Superpixels

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.