Search Results for author: Minsu Cho

Found 101 papers, 48 papers with code

Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform

no code implementations • 17 Apr 2024 • Chunghyun Park, SeungWook Kim, Jaesik Park, Minsu Cho

Establishing accurate 3D correspondences between shapes stands as a pivotal challenge with profound implications for computer vision and robotics.

Semantic correspondence


Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

no code implementations • 16 Apr 2024 • SeungWook Kim, Kejie Li, Xueqing Deng, Yichun Shi, Minsu Cho, Peng Wang

Leveraging multi-view diffusion models as priors for 3D optimization has alleviated the problem of 3D consistency, e.g., the Janus face problem or the content drift problem, in zero-shot text-to-3D models.

Common Sense Reasoning Text to 3D


Contrastive Mean-Shift Learning for Generalized Category Discovery

no code implementations • 15 Apr 2024 • Sua Choi, Dahyun Kang, Minsu Cho

We address the problem of generalized category discovery (GCD) that aims to partition a partially labeled collection of images; only a small part of the collection is labeled and the total number of target classes is unknown.

Clustering Contrastive Learning +1


MoReVQA: Exploring Modular Reasoning Models for Video Question Answering

no code implementations • 9 Apr 2024 • Juhong Min, Shyamal Buch, Arsha Nagrani, Minsu Cho, Cordelia Schmid

This paper addresses the task of video question answering (videoQA) via a decomposed multi-stage, modular reasoning framework.

Question Answering Video Question Answering


Learning Correlation Structures for Vision Transformers

no code implementations • 5 Apr 2024 • Manjin Kim, Paul Hongsuck Seo, Cordelia Schmid, Minsu Cho

We introduce a new attention mechanism, dubbed structural self-attention (StructSA), that leverages rich correlation patterns naturally emerging in key-query interactions of attention.

Ranked #4 on Action Recognition on Diving-48

Action Classification Action Recognition +2


NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image

no code implementations • 12 Dec 2023 • Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim, Minsu Cho, Doyup Lee

Transfer learning of large-scale Text-to-Image (T2I) models has recently shown impressive potential for Novel View Synthesis (NVS) of diverse objects from a single image.

Novel View Synthesis Transfer Learning


Activity Grammars for Temporal Action Segmentation

1 code implementation • NeurIPS 2023 • Dayoung Gong, Joonseok Lee, Deunsol Jung, Suha Kwak, Minsu Cho

Sequence prediction on temporal data requires the ability to understand compositional structures of multi-level semantics beyond individual and contextual properties.

Action Segmentation Segmentation


Towards More Practical Group Activity Detection: A New Benchmark and Model

no code implementations • 5 Dec 2023 • Dongkeun Kim, Youngkil Song, Minsu Cho, Suha Kwak

Group activity detection (GAD) is the task of identifying members of each group and classifying the activity of the group at the same time in a video.

Action Detection Activity Detection


Efficient Semantic Matching with Hypercolumn Correlation

no code implementations • 7 Nov 2023 • SeungWook Kim, Juhong Min, Minsu Cho

Recent studies show that leveraging the match-wise relationships within the 4D correlation map yields significant improvements in establishing semantic correspondences - but at the cost of increased computation and latency.


Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions

no code implementations • 11 Oct 2023 • Jungtaek Kim, Jeongbeen Yoon, Minsu Cho

Sorting is a fundamental operation of all computer systems, having been a long-standing significant research topic.


PriViT: Vision Transformers for Fast Private Inference

1 code implementation • 6 Oct 2023 • Naren Dhyani, Jianqiao Mo, Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications.

Image Classification


Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation

no code implementations • CVPR 2023 • Dahyun Kang, Piotr Koniusz, Minsu Cho, Naila Murray

For this mixed setup, we propose to improve the pseudo-labels using a pseudo-label enhancer that was trained using the available ground-truth pixel-level labels.

Few-Shot Image Classification Pseudo Label +1


Stable and Consistent Prediction of 3D Characteristic Orientation via Invariant Residual Learning

no code implementations • 20 Jun 2023 • SeungWook Kim, Chunghyun Park, Yoonwoo Jeong, Jaesik Park, Minsu Cho

Learning to predict reliable characteristic orientations of 3D point clouds is an important yet challenging problem, as different point clouds of the same class may have largely varying appearances.


Relational Context Learning for Human-Object Interaction Detection

1 code implementation • CVPR 2023 • Sanghyun Kim, Deunsol Jung, Minsu Cho

Recent state-of-the-art methods for HOI detection typically build on transformer architectures with two decoder branches, one for human-object pair detection and the other for interaction classification.

Ranked #2 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection Object +2


Devil's on the Edges: Selective Quad Attention for Scene Graph Generation

no code implementations • CVPR 2023 • Deunsol Jung, Sanghyun Kim, Won Hwa Kim, Minsu Cho

The edge selection module selects relevant object pairs, i.e., edges in the scene graph, which helps contextual reasoning, and the quad attention module then updates the edge features using both edge-to-node and edge-to-edge cross-attentions to capture contextual information between objects and object pairs.

Graph Generation Object +1


Learning Rotation-Equivariant Features for Visual Correspondence

no code implementations • CVPR 2023 • Jongmin Lee, Byungjin Kim, SeungWook Kim, Minsu Cho

The resultant features and their orientations are further processed by group aligning, a novel invariant mapping technique that shifts the group-equivariant features by their orientations along the group dimension.
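
The shifting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the list-of-lists feature layout (rotation bins by channels), and the orientation estimate (the bin with the largest summed response) are all assumptions made for illustration.

```python
from typing import List

Feature = List[List[float]]  # assumed layout: features[g][c], g = rotation bin


def group_align(features: Feature) -> Feature:
    """Cyclically shift the group dimension so the dominant bin lands at index 0."""
    # Hypothetical orientation estimate: bin with the largest summed response.
    scores = [sum(row) for row in features]
    orientation = max(range(len(scores)), key=scores.__getitem__)
    # Shift along the group dimension by the estimated orientation.
    return features[orientation:] + features[:orientation]


def rotate_input(features: Feature, steps: int) -> Feature:
    """Model an input rotation as a cyclic shift of the group dimension."""
    steps %= len(features)
    return features[-steps:] + features[:-steps]


feat = [[0.1, 0.2], [0.9, 0.7], [0.3, 0.1], [0.0, 0.4]]
aligned = group_align(feat)
aligned_rotated = group_align(rotate_input(feat, 3))
```

Under these assumptions, `aligned` and `aligned_rotated` hold the same values: because the orientation estimate shifts together with the features, undoing that shift yields a descriptor that does not depend on the input rotation.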

Pose Estimation Self-Supervised Learning


Generalizable Implicit Neural Representations via Instance Pattern Composers

1 code implementation • CVPR 2023 • Chiheon Kim, Doyup Lee, Saehoon Kim, Minsu Cho, Wook-Shin Han

Despite recent advances in implicit neural representations (INRs), it remains challenging for a coordinate-based multi-layer perceptron (MLP) of INRs to learn a common representation across data instances and generalize it for unseen instances.

Meta-Learning


Few-shot Metric Learning: Online Adaptation of Embedding for Retrieval

no code implementations • 14 Nov 2022 • Deunsol Jung, Dahyun Kang, Suha Kwak, Minsu Cho

Metric learning aims to build a distance metric typically by learning an effective embedding function that maps similar objects into nearby points in its embedding space.

Image Retrieval Meta-Learning +2


Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks

1 code implementation • CVPR 2023 • Hyolim Kang, Hanjung Kim, Joungbin An, Minsu Cho, Seon Joo Kim

Temporal Action Localization (TAL) methods typically operate on top of feature sequences from a frozen snippet encoder that is pretrained with the Trimmed Action Classification (TAC) tasks, resulting in a task discrepancy problem.

Action Classification Computational Efficiency +1


Sequential Brick Assembly with Efficient Constraint Satisfaction

no code implementations • 3 Oct 2022 • Seokjun Ahn, Jungtaek Kim, Minsu Cho, Jaesik Park

The assembly problem is challenging since the number of possible structures increases exponentially with the number of available bricks, complicating the physical constraints to satisfy across bricks.

Bayesian Optimization Position


PeRFception: Perception using Radiance Fields

1 code implementation • 24 Aug 2022 • Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner.

3D Reconstruction Segmentation

327


Towards Sequence-Level Training for Visual Tracking

2 code implementations • 11 Aug 2022 • Minji Kim, Seungkwan Lee, Jungseul Ok, Bohyung Han, Minsu Cho

Despite the extensive adoption of machine learning on the task of visual object tracking, recent learning-based approaches have largely overlooked the fact that visual tracking is a sequence-level task in its nature; they rely heavily on frame-level training, which inevitably induces inconsistency between training and testing in terms of both data distributions and task objectives.

Ranked #16 on Visual Object Tracking on TrackingNet

Data Augmentation Reinforcement Learning (RL) +1


Revisiting Self-Distillation

no code implementations • 17 Jun 2022 • Minh Pham, Minsu Cho, Ameya Joshi, Chinmay Hegde

We first show that even with a highly accurate teacher, self-distillation allows a student to surpass the teacher in all cases.

Knowledge Distillation Model Compression


Self-Supervised Learning of Image Scale and Orientation

1 code implementation • 15 Jun 2022 • Jongmin Lee, Yoonwoo Jeong, Minsu Cho

We study the problem of learning to assign a characteristic pose, i.e., scale and orientation, for an image region of interest.

Pose Estimation Self-Supervised Learning


Peripheral Vision Transformer

1 code implementation • 14 Jun 2022 • Juhong Min, Yucheng Zhao, Chong Luo, Minsu Cho

We propose to incorporate peripheral position encoding to the multi-head self-attention layers to let the network learn to partition the visual field into diverse peripheral regions given training data.

Image Classification


Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer

no code implementations • 9 Jun 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han

After code stacks in the sequence are randomly masked, Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image.

Ranked #1 on Text-to-Image Generation on Conceptual Captions

Conditional Image Generation Text-to-Image Generation


Future Transformer for Long-term Action Anticipation

no code implementations • CVPR 2022 • Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho

The task of predicting future actions from a video is crucial for a real-world agent interacting with others.

Action Anticipation Long Term Action Anticipation +1


Learning to Assemble Geometric Shapes

1 code implementation • 24 May 2022 • Jinhwi Lee, Jungtaek Kim, Hyunsoo Chung, Jaesik Park, Minsu Cho

Assembling parts into an object is a combinatorial problem that arises in a variety of contexts in the real world and involves numerous applications in science and engineering.


TransforMatcher: Match-to-Match Attention for Semantic Correspondence

1 code implementation • CVPR 2022 • SeungWook Kim, Juhong Min, Minsu Cho

Establishing correspondences between images remains a challenging task, especially under large appearance changes due to different viewpoints or intra-class variations.

Ranked #10 on Semantic correspondence on SPair-71k

Semantic correspondence


Smooth-Reduce: Leveraging Patches for Improved Certified Robustness

no code implementations • 12 May 2022 • Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde

Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.


Self-Taught Metric Learning without Labels

no code implementations • CVPR 2022 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

At the heart of our framework lies an algorithm that investigates contexts of data on the embedding space to predict their class-equivalence relations as pseudo labels.

Metric Learning


Self-Supervised Equivariant Learning for Oriented Keypoint Detection

1 code implementation • CVPR 2022 • Jongmin Lee, Byungjin Kim, Minsu Cho

Detecting robust keypoints from an image is an integral part of many computer vision problems, and the characteristic orientation and scale of keypoints play an important role for keypoint description and matching.

Keypoint Detection Self-Supervised Learning +1


Detector-Free Weakly Supervised Group Activity Recognition

no code implementations • CVPR 2022 • Dongkeun Kim, Jinsung Lee, Minsu Cho, Suha Kwak

Group activity recognition is the task of understanding the activity conducted by a group of people as a whole in a multi-person video.

Group Activity Recognition


Reflection and Rotation Symmetry Detection via Equivariant Learning

1 code implementation • CVPR 2022 • Ahyun Seo, Byungjin Kim, Suha Kwak, Minsu Cho

The inherent challenge of detecting symmetries stems from arbitrary orientations of symmetry patterns; a reflection symmetry mirrors itself against an axis with a specific orientation while a rotation symmetry matches its rotated copy with a specific orientation.

Symmetry Detection


Integrative Few-Shot Learning for Classification and Segmentation

1 code implementation • CVPR 2022 • Dahyun Kang, Minsu Cho

We introduce the integrative task of few-shot classification and segmentation (FS-CS) that aims to both classify and segment target objects in a query image when the target classes are given with a few examples.

Ranked #1 on Few-Shot Classification and Segmentation on PASCAL-5i (1-way 1-shot)

Classification Few-Shot Classification and Segmentation +3

119


Autoregressive Image Generation using Residual Quantization

3 code implementations • CVPR 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han

However, we postulate that previous VQ cannot shorten the code sequence and generate high-fidelity images together in terms of the rate-distortion trade-off.

Ranked #2 on Text-to-Image Generation on Conceptual Captions

Conditional Image Generation Quantization +1

685


Selective Network Linearization for Efficient Private Inference

1 code implementation • 4 Feb 2022 • Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

To reduce PI latency we propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy.


Contrastive Regularization for Semi-Supervised Learning

no code implementations • 17 Jan 2022 • Doyup Lee, Sungwoong Kim, Ildoo Kim, Yeongjae Cheon, Minsu Cho, Wook-Shin Han

Consistency regularization on label predictions becomes a fundamental technique in semi-supervised learning, but it still requires a large number of training iterations for high performance.

Ranked #4 on Semi-Supervised Image Classification on cifar-100, 10000 Labels

Semi-Supervised Image Classification


Fast Point Transformer

1 code implementation • CVPR 2022 • Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park

The recent success of neural networks enables a better interpretation of 3D point clouds, but processing a large-scale 3D scene remains a challenging problem.

Ranked #24 on Semantic Segmentation on S3DIS

3D Semantic Segmentation Computational Efficiency +1

254


Semi-supervised Domain Adaptation via Sample-to-Sample Self-Distillation

1 code implementation • 29 Nov 2021 • Jeongbeen Yoon, Dahyun Kang, Minsu Cho

Semi-supervised domain adaptation (SSDA) is to adapt a learner to a new domain with only a small set of labeled samples when a large labeled dataset is given on a source domain.

Domain Adaptation Semi-supervised Domain Adaptation


Relational Self-Attention: What's Missing in Attention for Video Understanding

1 code implementation • NeurIPS 2021 • Manjin Kim, Heeseung Kwon, Chunyu Wang, Suha Kwak, Minsu Cho

Convolution has been arguably the most important feature transform for modern neural networks, leading to the advance of deep learning.

Ranked #11 on Action Recognition on Diving-48

Action Recognition Temporal Action Localization +1


Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training

1 code implementation • NeurIPS 2021 • Minguk Kang, Woohyeon Shim, Minsu Cho, Jaesik Park

On this foundation, we propose the Rebooted Auxiliary Classifier Generative Adversarial Network (ReACGAN).

Ranked #1 on Image Generation on CIFAR-10 (NFE metric)

Conditional Image Generation Generative Adversarial Network

3,362


Brick-by-Brick: Combinatorial Construction with Deep Reinforcement Learning

no code implementations • NeurIPS 2021 • Hyunsoo Chung, Jungtaek Kim, Boris Knyazev, Jinhwi Lee, Graham W. Taylor, Jaesik Park, Minsu Cho

Discovering a solution in a combinatorial space is prevalent in many real-world problems but it is also challenging due to diverse complex constraints and the vast number of possible combinations.

Object reinforcement-learning +1


Differentiable Spline Approximations

no code implementations • NeurIPS 2021 • Minsu Cho, Aditya Balu, Ameya Joshi, Anjana Deva Prasad, Biswajit Khara, Soumik Sarkar, Baskar Ganapathysubramanian, Adarsh Krishnamurthy, Chinmay Hegde

Overall, we show that leveraging this redesigned Jacobian in the form of a differentiable "layer" in predictive models leads to improved performance in diverse applications such as image segmentation, 3D point cloud reconstruction, and finite element analysis.

3D Point Cloud Reconstruction BIG-bench Machine Learning +3


Efficient Point Transformer for Large-scale 3D Scene Understanding

no code implementations • 29 Sep 2021 • Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park

Although sparse convolution is efficient and scalable for large 3D scenes, the quantization artifacts impair geometric details and degrade prediction accuracy.

3D Semantic Segmentation Quantization +1


Visual TransforMatcher: Efficient Match-to-Match Attention for Visual Correspondence

no code implementations • 29 Sep 2021 • Seung Wook Kim, Juhong Min, Minsu Cho

Establishing correspondences between images remains a challenging task, especially under large appearance changes due to different viewpoints and intra-class variations.


Rotation-Equivariant Keypoint Detection

no code implementations • 29 Sep 2021 • Jongmin Lee, Byungjin Kim, Minsu Cho

Therefore, we propose a rotation-invariant keypoint detection method using rotation-equivariant CNNs.

Keypoint Detection Translation


Convolutional Hough Matching Networks for Robust and Efficient Visual Correspondence

1 code implementation • 11 Sep 2021 • Juhong Min, SeungWook Kim, Minsu Cho

To validate the proposed techniques, we develop the neural network with CHM layers that perform convolutional matching in the space of translation and scaling.

Geometric Matching Translation


Deep Hough Voting for Robust Global Registration

no code implementations • ICCV 2021 • Junha Lee, SeungWook Kim, Minsu Cho, Jaesik Park

We then construct a set of triplets of correspondences to cast votes on the 6D Hough space, representing the transformation parameters in sparse tensors.

Point Cloud Registration


Self-Calibrating Neural Radiance Fields

1 code implementation • ICCV 2021 • Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

We also propose a new geometric loss function, viz., projected ray distance loss, to incorporate geometric consistency for complex non-linear camera models.

457


Learning to Discover Reflection Symmetry via Polar Matching Convolution

no code implementations • ICCV 2021 • Ahyun Seo, Woohyeon Shim, Minsu Cho

The task of reflection symmetry detection remains challenging due to significant variations and ambiguities of symmetry patterns in the wild.

Self-Supervised Learning Symmetry Detection


Relational Embedding for Few-Shot Classification

1 code implementation • ICCV 2021 • Dahyun Kang, Heeseung Kwon, Juhong Min, Minsu Cho

We propose to address the problem of few-shot classification by meta-learning "what to observe" and "where to attend" in a relational perspective.

Ranked #15 on Few-Shot Image Classification on CUB 200 5-way 5-shot

Classification Few-Shot Image Classification +1

105


Sphynx: ReLU-Efficient Network Design for Private Inference

no code implementations • 17 Jun 2021 • Minsu Cho, Zahra Ghodsi, Brandon Reagen, Siddharth Garg, Chinmay Hegde

The emergence of deep learning has been accompanied by privacy concerns surrounding users' data and service providers' models.


Hypercorrelation Squeeze for Few-Shot Segmentation

1 code implementation • 4 Apr 2021 • Juhong Min, Dahyun Kang, Minsu Cho

Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class.

Ranked #13 on Few-Shot Semantic Segmentation on FSS-1000 (5-shot)

Feature Correlation Few-Shot Semantic Segmentation +1

224


Convolutional Hough Matching Networks

1 code implementation • CVPR 2021 • Juhong Min, Minsu Cho

Despite advances in feature representation, leveraging geometric relations is crucial for establishing reliable visual correspondences under large variations of images.

Ranked #4 on Semantic correspondence on PF-WILLOW

Geometric Matching Semantic correspondence +1


Embedding Transfer with Label Relaxation for Improved Metric Learning

2 code implementations • CVPR 2021 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

Our method exploits pairwise similarities between samples in the source embedding space as the knowledge, and transfers them through a loss used for learning target embedding models.

Knowledge Distillation Metric Learning

306


Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition

1 code implementation • ICCV 2021 • Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho

With a sufficient volume of the neighborhood in space and time, it effectively captures long-term interaction and fast motion in the video, leading to robust action recognition.

Ranked #18 on Action Recognition on Something-Something V1 (using extra training data)

Action Recognition Temporal Action Localization +1


Learning Self-Similarity in Space and Time as a Generalized Motion for Action Recognition

1 code implementation • 1 Jan 2021 • Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho

We leverage the whole volume of STSS and let our model learn to extract an effective motion representation from it.

Action Recognition Video Understanding


Differentiable Programming for Piecewise Polynomial Functions

no code implementations • NeurIPS Workshop LMCA 2020 • Minsu Cho, Ameya Joshi, Xian Yeow Lee, Aditya Balu, Adarsh Krishnamurthy, Baskar Ganapathysubramanian, Soumik Sarkar, Chinmay Hegde

The paradigm of differentiable programming has considerably enhanced the scope of machine learning via the judicious use of gradient-based optimization.

Denoising Image Segmentation +3


Hypercorrelation Squeeze for Few-Shot Segmentation

no code implementations • ICCV 2021 • Juhong Min, Dahyun Kang, Minsu Cho

Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class.

Feature Correlation Few-Shot Semantic Segmentation +2


Embedding Transfer via Smooth Contrastive Loss

no code implementations • 1 Jan 2021 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

To this end, we design a new loss called smooth contrastive loss, which pulls together or pushes apart a pair of samples in a target embedding space with strength determined by their semantic similarity in the source embedding space; an analysis of the loss reveals that this property enables more important pairs to contribute more to learning the target embedding space.
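
The pull/push behaviour described in this abstract can be sketched as follows. This is not the paper's exact formula: it is a standard margin-based contrastive loss in which the usual binary label is replaced by a soft weight, and the function name, the `margin` parameter, and the assumption that the source similarity lies in [0, 1] are all illustrative.

```python
import math
from typing import Sequence


def soft_contrastive_loss(
    target_a: Sequence[float],
    target_b: Sequence[float],
    source_similarity: float,  # assumed to lie in [0, 1]
    margin: float = 1.0,       # hypothetical margin for the push term
) -> float:
    """Contrastive loss on one pair, weighted by similarity in the source space."""
    # Euclidean distance between the two samples in the target embedding space.
    d = math.dist(target_a, target_b)
    # Pairs that are similar in the source space are pulled together.
    pull = source_similarity * d ** 2
    # Pairs that are dissimilar are pushed apart until they clear the margin.
    push = (1.0 - source_similarity) * max(0.0, margin - d) ** 2
    return pull + push


loss_similar_far = soft_contrastive_loss([0.0, 0.0], [2.0, 0.0], 1.0)
loss_dissimilar_close = soft_contrastive_loss([0.0, 0.0], [0.0, 0.0], 0.0)
```

In this sketch a pair with similarity near 1 contributes almost only to the pull term and a pair near 0 almost only to the push term, so the strength of each force follows the source-space similarity.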

Metric Learning Semantic Similarity +1


Pair-based Self-Distillation for Semi-supervised Domain Adaptation

no code implementations • 1 Jan 2021 • Jeongbeen Yoon, Dahyun Kang, Minsu Cho

Semi-supervised domain adaptation (SSDA) is to adapt a learner to a new domain with only a small set of labeled samples when a large labeled dataset is given on a source domain.

Domain Adaptation Semi-supervised Domain Adaptation


Combinatorial Bayesian Optimization with Random Mapping Functions to Convex Polytopes

no code implementations • 26 Nov 2020 • Jungtaek Kim, Seungjin Choi, Minsu Cho

The main idea is to use a random mapping which embeds the combinatorial space into a convex polytope in a continuous space, on which all essential processing is performed to determine a solution to the black-box optimization in the combinatorial space.

Bayesian Optimization


CircleGAN: Generative Adversarial Learning across Spherical Circles

1 code implementation • NeurIPS 2020 • Woohyeon Shim, Minsu Cho

We present a novel discriminator for GANs that improves realness and diversity of generated samples by learning a structured hypersphere embedding space using spherical circles.

Representation Learning


Fragment Relation Networks for Geometric Shape Assembly

no code implementations • NeurIPS Workshop LMCA 2020 • Jinhwi Lee, Jungtaek Kim, Hyunsoo Chung, Jaesik Park, Minsu Cho

Our model processes the candidate fragments in a permutation-equivariant manner and can generalize to cases with an arbitrary number of fragments and even with a different target object.

Object Relation


Diversified Mutual Learning for Deep Metric Learning

no code implementations • 9 Sep 2020 • Wonpyo Park, Wonjae Kim, Kihyun You, Minsu Cho

Mutual learning is an ensemble training strategy to improve generalization by transferring individual knowledge to each other while simultaneously training multiple models.

Metric Learning Transfer Learning


Learning to Compose Hypercolumns for Visual Correspondence

1 code implementation • ECCV 2020 • Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho

Feature representation plays a crucial role in visual correspondence, and recent methods for image matching resort to deeply stacked convolutional layers.

Ranked #2 on Semantic correspondence on Caltech-101

object-detection Semantic correspondence


MotionSqueeze: Neural Motion Feature Learning for Video Understanding

2 code implementations • ECCV 2020 • Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho

As the frame-by-frame optical flows require heavy computation, incorporating motion information has remained a major computational bottleneck for video understanding.

Ranked #1 on Video Classification on Something-Something V2

Action Classification Action Recognition +2

132


IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos

2 code implementations • 13 Jul 2020 • Gyeongsik Moon, Heeseung Kwon, Kyoung Mu Lee, Minsu Cho

Most current action recognition methods heavily rely on appearance information by taking an RGB sequence of entire image regions as input.

Action Recognition In Videos Pose Estimation +1


Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery

no code implementations • 7 Jul 2020 • Minsu Cho, Mohammadreza Soltani, Chinmay Hegde

In this paper, we study two important problems in the automated design of neural networks -- Hyper-parameter Optimization (HPO), and Neural Architecture Search (NAS) -- through the lens of sparse recovery methods.

Hyperparameter Optimization Neural Architecture Search


ESPN: Extremely Sparse Pruned Networks

1 code implementation • 28 Jun 2020 • Minsu Cho, Ameya Joshi, Chinmay Hegde

Deep neural networks are often highly overparameterized, prohibiting their use in compute-limited systems.

Network Pruning


Local-Global Video-Text Interactions for Temporal Grounding

1 code implementation • CVPR 2020 • Jonghwan Mun, Minsu Cho, Bohyung Han

This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video semantically relevant to a text query.

126


Combinatorial 3D Shape Generation via Sequential Assembly

3 code implementations • 16 Apr 2020 • Jungtaek Kim, Hyunsoo Chung, Jinhwi Lee, Minsu Cho, Jaesik Park

To alleviate this consequence induced by a huge number of feasible combinations, we propose a combinatorial 3D shape generation framework.

3D Shape Generation Bayesian Optimization


Proxy Anchor Loss for Deep Metric Learning

3 code implementations • CVPR 2020 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

The former class can leverage fine-grained semantic relations between data points, but slows convergence in general due to its high training complexity.

Ranked #10 on Metric Learning on CUB-200-2011 (using extra training data)

Fine-Grained Image Classification Fine-Grained Vehicle Classification +1

306


Freeze the Discriminator: a Simple Baseline for Fine-Tuning GANs

4 code implementations • 25 Feb 2020 • Sangwoo Mo, Minsu Cho, Jinwoo Shin

Generative adversarial networks (GANs) have shown outstanding performance on a wide range of problems in computer vision, graphics, and machine learning, but often require numerous training data and heavy computational resources.

Ranked #5 on 10-shot image generation on Babies

10-shot image generation Image Generation +1

284


Real-Time Object Tracking via Meta-Learning: Efficient Model Adaptation and One-Shot Channel Pruning

no code implementations • 25 Nov 2019 • Ilchae Jung, Kihyun You, Hyeonwoo Noh, Minsu Cho, Bohyung Han

We propose a novel meta-learning framework for real-time object tracking with efficient model adaptation and channel pruning.

Meta-Learning Object +1


Mining GOLD Samples for Conditional GANs

1 code implementation • NeurIPS 2019 • Sangwoo Mo, Chiheon Kim, Sungwoong Kim, Minsu Cho, Jinwoo Shin

Conditional generative adversarial networks (cGANs) have gained considerable attention in recent years due to their class-wise controllability and superior quality for complex generation tasks.

Active Learning


Regularizing Neural Networks via Stochastic Branch Layers

no code implementations • 3 Oct 2019 • Wonpyo Park, Paul Hongsuck Seo, Bohyung Han, Minsu Cho

We introduce a novel stochastic regularization technique for deep neural networks, which decomposes a layer into multiple branches with different parameters and merges stochastically sampled combinations of the outputs from the branches during training.


SPair-71k: A Large-scale Benchmark for Semantic Correspondence

no code implementations • 28 Aug 2019 • Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho

In this paper, we present a new large-scale benchmark dataset of semantically paired images, SPair-71k, which contains 70,958 image pairs with diverse variations in viewpoint and scale.

Semantic correspondence


Hyperpixel Flow: Semantic Correspondence with Multi-layer Neural Features

1 code implementation • ICCV 2019 • Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho

Establishing visual correspondences under large intra-class variations requires analyzing images at different levels, from features linked to semantics and context to local patterns, while being invariant to instance-specific details.

Ranked #1 on Semantic correspondence on Caltech-101

Semantic correspondence

Paper
Code

One-Shot Neural Architecture Search via Compressive Sensing

1 code implementation • 7 Jun 2019 • Minsu Cho, Mohammadreza Soltani, Chinmay Hegde

Neural Architecture Search remains a very challenging meta-learning problem.

Compressive Sensing Meta-Learning +1

Paper
Code

Instance-aware Image-to-Image Translation

1 code implementation • ICLR 2019 • Sangwoo Mo, Minsu Cho, Jinwoo Shin

Unsupervised image-to-image translation has gained considerable attention due to the recent impressive progress based on generative adversarial networks (GANs).

Semantic Segmentation Translation +1

840

Paper
Code

Reducing The Search Space For Hyperparameter Optimization Using Group Sparsity

no code implementations • 24 Apr 2019 • Minsu Cho, Chinmay Hegde

We propose a new algorithm for hyperparameter selection in machine learning algorithms.

BIG-bench Machine Learning Hyperparameter Optimization

Paper
Add Code

Deep Metric Learning Beyond Binary Supervision

1 code implementation • CVPR 2019 • Sungyeon Kim, Minkyo Seo, Ivan Laptev, Minsu Cho, Suha Kwak

Metric learning for visual similarity has mostly adopted binary supervision indicating whether a pair of images is of the same class or not.

Image Captioning Image Retrieval +4

Paper
Code

Universal Bounding Box Regression and Its Applications

no code implementations • 15 Apr 2019 • Seungkwan Lee, Suha Kwak, Minsu Cho

Bounding-box regression is a popular technique to refine or predict localization boxes in recent object detection approaches.

Object object-detection +3

Paper
Add Code

Relational Knowledge Distillation

3 code implementations • CVPR 2019 • Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho

Knowledge distillation aims at transferring knowledge acquired in one model (a teacher) to another model (a student) that is typically smaller.

Knowledge Distillation Metric Learning

1,265

Paper
Code

Unsupervised Image Matching and Object Discovery as Optimization

1 code implementation • CVPR 2019 • Huy V. Vo, Francis Bach, Minsu Cho, Kai Han, Yann Lecun, Patrick Perez, Jean Ponce

Learning with complete or partial supervision is powerful but relies on ever-growing human annotation efforts.

Ranked #2 on Single-object colocalization on Object Discovery

Object Object Discovery +2

Paper
Code

InstaGAN: Instance-aware Image-to-Image Translation

1 code implementation • 28 Dec 2018 • Sangwoo Mo, Minsu Cho, Jinwoo Shin

Our comparative evaluation demonstrates the effectiveness of the proposed method on different image datasets, in particular, in the aforementioned challenging cases.

Ranked #1 on Image-to-Image Translation on Object Transfiguration (sheep-to-giraffe)

Semantic Segmentation Translation +1

840

Paper
Code

Attentive Semantic Alignment with Offset-Aware Correlation Kernels

no code implementations • ECCV 2018 • Paul Hongsuck Seo, Jongmin Lee, Deunsol Jung, Bohyung Han, Minsu Cho

Semantic correspondence is the problem of establishing correspondences across images depicting different instances of the same object or scene class.

Semantic correspondence Translation

Paper
Add Code

Multi-Object Tracking With Quadruplet Convolutional Neural Networks

no code implementations • CVPR 2017 • Jeany Son, Mooyeol Baek, Minsu Cho, Bohyung Han

We propose Quadruplet Convolutional Neural Networks (Quad-CNN) for multi-object tracking, which learn to associate object detections across frames using quadruplet losses.

Multi-Object Tracking Object +1

Paper
Add Code

SCNet: Learning Semantic Correspondence

1 code implementation • ICCV 2017 • Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce

This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category.

Semantic correspondence

Paper
Code

Proposal Flow: Semantic Correspondences from Object Proposals

no code implementations • 21 Mar 2017 • Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce

Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout.

Object

Paper
Add Code

Text-guided Attention Model for Image Captioning

1 code implementation • 12 Dec 2016 • Jonghwan Mun, Minsu Cho, Bohyung Han

Visual attention plays an important role in understanding images and has proven effective in generating natural language descriptions of images.

Image Captioning

Paper
Code

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

1 code implementation • 14 Sep 2016 • Vadim Kantorov, Maxime Oquab, Minsu Cho, Ivan Laptev

The additive model encourages the predicted object region to be supported by its surrounding context region.

Ranked #4 on Weakly Supervised Object Detection on Charades

Object Object Recognition +2

Paper
Code

Thin-Slicing for Pose: Learning to Understand Pose Without Explicit Pose Estimation

no code implementations • CVPR 2016 • Suha Kwak, Minsu Cho, Ivan Laptev

We address the problem of learning a pose-aware, compact embedding that places images with similar human poses close together in the embedding space.

Action Recognition Image Retrieval +3

Paper
Add Code

Proposal Flow

no code implementations • CVPR 2016 • Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce

Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category.

Object

Paper
Add Code

Robust Image Filtering Using Joint Static and Dynamic Guidance

no code implementations • CVPR 2015 • Bumsub Ham, Minsu Cho, Jean Ponce

Regularizing images under a guidance signal has been used in various tasks in computer vision and computational photography, particularly for noise reduction and joint upsampling.

Denoising Super-Resolution

Paper
Add Code

Unsupervised Object Discovery and Tracking in Video Collections

no code implementations • ICCV 2015 • Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid

This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision.

Object Object Discovery +1

Paper
Add Code

A General Multi-Graph Matching Approach via Graduated Consistency-regularized Boosting

no code implementations • 20 Feb 2015 • Junchi Yan, Minsu Cho, Hongyuan Zha, Xiaokang Yang, Stephen Chu

We propose multi-graph matching methods to incorporate the two aspects by boosting the affinity score while gradually infusing consistency as a regularizer.

Graph Matching

Paper
Add Code

Unsupervised Object Discovery and Localization in the Wild: Part-based Matching with Bottom-up Region Proposals

no code implementations • CVPR 2015 • Minsu Cho, Suha Kwak, Cordelia Schmid, Jean Ponce

This paper addresses unsupervised discovery and localization of dominant objects from a noisy image collection with multiple object classes.

Object Object Discovery

Paper
Add Code

Finding Matches in a Haystack: A Max-Pooling Strategy for Graph Matching in the Presence of Outliers

no code implementations • CVPR 2014 • Minsu Cho, Jian Sun, Olivier Duchenne, Jean Ponce

A major challenge in real-world feature matching problems is to tolerate the numerous outliers arising in typical visual tasks.

Graph Matching

Paper
Add Code
