Search Results for author: Tae-Hyun Oh

Found 64 papers, 21 papers with code

Object-Centric Domain Randomization for 3D Shape Reconstruction in the Wild

no code implementations 21 Mar 2024 Junhyeong Cho, Kim Youwang, Hunmin Yang, Tae-Hyun Oh

One of the biggest challenges in single-view 3D shape reconstruction in the wild is the scarcity of <3D shape, 2D image>-paired data from real-world environments.

3D Shape Reconstruction Object

Noise Map Guidance: Inversion with Spatial Context for Real Image Editing

1 code implementation 7 Feb 2024 Hansam Cho, Jonghyun Lee, Seoung Bum Kim, Tae-Hyun Oh, Yonghyun Jeong

Text-guided diffusion models have become a popular tool in image synthesis, known for producing high-quality and diverse images.

Image Generation

FPRF: Feed-Forward Photorealistic Style Transfer of Large-Scale 3D Neural Radiance Fields

no code implementations 10 Jan 2024 GeonU Kim, Kim Youwang, Tae-Hyun Oh

FPRF efficiently stylizes large-scale 3D scenes by introducing a style-decomposed 3D neural radiance field, which inherits AdaIN's feed-forward stylization machinery, supporting arbitrary style reference images.

Semantic correspondence Style Transfer

Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering

no code implementations 18 Dec 2023 Kim Youwang, Tae-Hyun Oh, Gerard Pons-Moll

We present Paint-it, a text-driven high-fidelity texture map synthesis method for 3D meshes via neural re-parameterized texture optimization.

Texture Synthesis

Learning-based Axial Video Motion Magnification

no code implementations 15 Dec 2023 Kwon Byung-Ki, Oh Hyun-Bin, Kim Jun-Seong, Hyunwoo Ha, Tae-Hyun Oh

In this work, we focus on improving legibility by proposing a new concept, axial motion magnification, which magnifies decomposed motions along the user-specified direction.

Motion Magnification

SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models

2 code implementations 15 Dec 2023 Lee Hyun, Kim Sung-Bin, Seungju Han, Youngjae Yu, Tae-Hyun Oh

We introduce this new task to explain why people laugh in a particular video and a dataset for this task.

Video Understanding

A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization

no code implementations 4 Oct 2023 Kim Youwang, Lee Hyun, Kim Sung-Bin, Suekyeong Nam, Janghoon Ju, Tae-Hyun Oh

We propose NeuFace, a 3D face mesh pseudo annotation method on videos via neural re-parameterized optimization.

3D Face Reconstruction

Sound Source Localization is All about Cross-Modal Alignment

no code implementations ICCV 2023 Arda Senocak, Hyeonggon Ryu, Junsik Kim, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung

However, prior art and existing benchmarks do not account for a more important aspect of the problem, cross-modal semantic understanding, which is essential for genuine sound source localization.

Cross-Modal Retrieval Retrieval

An Iterative Method for Unsupervised Robust Anomaly Detection Under Data Contamination

no code implementations 18 Sep 2023 Minkyung Kim, Jongmin Yu, Junsik Kim, Tae-Hyun Oh, Jun Kyun Choi

Therefore, it has been common practice to learn normality under the assumption that anomalous data are absent from the training dataset, which we call the normality assumption.

One-Class Classification

The Devil in the Details: Simple and Effective Optical Flow Synthetic Data Generation

no code implementations 14 Aug 2023 Kwon Byung-Ki, Kim Sung-Bin, Tae-Hyun Oh

Recent work on dense optical flow has shown significant progress, primarily in a supervised learning manner requiring a large amount of labeled data.

Optical Flow Estimation Synthetic Data Generation

SYNAuG: Exploiting Synthetic Data for Data Imbalance Problems

no code implementations 2 Aug 2023 Moon Ye-Bin, Nam Hyeon-Woo, Wonseok Choi, Nayeong Kim, Suha Kwak, Tae-Hyun Oh

We live in an era of data floods, and deep neural networks play a pivotal role in this moment.

Fairness

TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation

no code implementations ICCV 2023 Moon Ye-Bin, Jisoo Kim, Hongyeob Kim, Kilho Son, Tae-Hyun Oh

Given the hypothesis, TextManiA transfers pre-trained text representation obtained from a well-established large language encoder to a target visual feature space being learned.

Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis

1 code implementation 26 May 2023 Seongyeon Park, Bohyung Kim, Tae-Hyun Oh

With our framework, we show superior performance compared to baselines in zero-shot TTS and VC, achieving state-of-the-art performance.

Speech Synthesis

Prefix tuning for automated audio captioning

1 code implementation 30 Mar 2023 Minkyu Kim, Kim Sung-Bin, Tae-Hyun Oh

Audio captioning aims to generate text descriptions from environmental sounds.

AudioCaps Audio captioning +2

Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages

1 code implementation 28 Mar 2023 Seongyeon Park, Myungseo Song, Bohyung Kim, Tae-Hyun Oh

We empirically demonstrate the effectiveness of our proposed method in low-resource language scenarios, achieving outstanding performance compared to competing methods.

Data Augmentation Unsupervised Pre-training

ENInst: Enhancing Weakly-supervised Low-shot Instance Segmentation

no code implementations 20 Feb 2023 Moon Ye-Bin, Dongmin Choi, Yongjin Kwon, Junsik Kim, Tae-Hyun Oh

We address weakly-supervised low-shot instance segmentation, an annotation-efficient training method to deal with novel classes effectively.

Instance Segmentation Semantic Segmentation

DFlow: Learning to Synthesize Better Optical Flow Datasets via a Differentiable Pipeline

1 code implementation ICLR 2023 Kwon Byung-Ki, Nam Hyeon-Woo, Ji-Yun Kim, Tae-Hyun Oh

Comprehensive studies of synthetic optical flow datasets have attempted to reveal what properties lead to accuracy improvement in learning-based optical flow estimation.

Optical Flow Estimation

Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data

no code implementations 26 Jan 2023 Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, In So Kweon

We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.

Relational Captioning Sentence

Scratching Visual Transformer's Back with Uniform Attention

no code implementations ICCV 2023 Nam Hyeon-Woo, Kim Yu-Ji, Byeongho Heo, Dongyoon Han, Seong Joon Oh, Tae-Hyun Oh

We observe that the inclusion of CB reduces the degree of density in the original attention maps and increases both the capacity and generalizability of the ViT models.

HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields

1 code implementation 14 Aug 2022 Kim Jun-Seong, Kim Yu-Ji, Moon Ye-Bin, Tae-Hyun Oh

Our voxel-based volume rendering pipeline reconstructs HDR radiance fields with only multi-view LDR images taken from varying camera settings in an end-to-end manner and has a fast convergence speed.

Tone Mapping Vocal Bursts Intensity Prediction

Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers

1 code implementation 27 Jul 2022 Junhyeong Cho, Kim Youwang, Tae-Hyun Oh

Transformer encoder architectures have recently achieved state-of-the-art results on monocular 3D human mesh reconstruction, but they require a substantial number of parameters and expensive computations.

3D Hand Pose Estimation 3D Reconstruction

CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes

1 code implementation 9 Jun 2022 Kim Youwang, Kim Ji-Yeon, Tae-Hyun Oh

Then, our novel zero-shot neural style optimization detailizes and texturizes the recommended mesh sequence to conform to the prompt in a temporally-consistent and pose-agnostic manner.

Audio-Visual Fusion Layers for Event Type Aware Video Recognition

no code implementations 12 Feb 2022 Arda Senocak, Junsik Kim, Tae-Hyun Oh, Hyeonggon Ryu, Dingzeyu Li, In So Kweon

The human brain is continuously inundated with multisensory information, and the complex interactions within it, coming from the outside world at any given moment.

Multi-Task Learning Video Recognition +1

Unified 3D Mesh Recovery of Humans and Animals by Learning Animal Exercise

no code implementations 3 Nov 2021 Kim Youwang, Kim Ji-Yeon, Kyungdon Joo, Tae-Hyun Oh

To make the unstable disjoint multi-task learning jointly trainable, we propose to exploit the morphological similarity between humans and animals, motivated by animal exercise where humans imitate animal poses.

Multi-Task Learning

FICGAN: Facial Identity Controllable GAN for De-identification

no code implementations 2 Oct 2021 Yonghyun Jeong, Jooyoung Choi, Sungwon Kim, Youngmin Ro, Tae-Hyun Oh, Doyeon Kim, Heonseok Ha, Sungroh Yoon

In this work, we present Facial Identity Controllable GAN (FICGAN), which not only generates high-quality de-identified face images with ensured privacy protection, but also offers detailed controllability over attribute preservation for enhanced data utility.

Attribute De-identification

FoxInst: A Frustratingly Simple Baseline for Weakly Few-shot Instance Segmentation

no code implementations 29 Sep 2021 Dongmin Choi, Moon Ye-Bin, Junsik Kim, Tae-Hyun Oh

We propose the first weakly-supervised few-shot instance segmentation task and a frustratingly simple but strong baseline model, FoxInst.

Instance Segmentation Semantic Segmentation

FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning

1 code implementation ICLR 2022 Nam Hyeon-Woo, Moon Ye-Bin, Tae-Hyun Oh

We show that pFedPara outperforms competing personalized FL methods with more than three times fewer parameters.

Federated Learning

CDS: Cross-Domain Self-Supervised Pre-Training

no code implementations ICCV 2021 Donghyun Kim, Kuniaki Saito, Tae-Hyun Oh, Bryan A. Plummer, Stan Sclaroff, Kate Saenko

We present a two-stage pre-training approach that improves the generalization ability of standard single-domain pre-training.

Domain Adaptation Transfer Learning

Distilling Global and Local Logits With Densely Connected Relations

1 code implementation ICCV 2021 Youmin Kim, Jinbae Park, YounHo Jang, Muhammad Ali, Tae-Hyun Oh, Sung-Ho Bae

In prevalent knowledge distillation, logits in most image recognition models are computed by global average pooling, then used to learn to encode the high-level and task-relevant knowledge.

Image Classification Knowledge Distillation +3

Dense Relational Image Captioning via Multi-task Triple-Stream Networks

1 code implementation 8 Oct 2020 Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, In So Kweon

To this end, we propose the multi-task triple-stream network (MTTSNet), which consists of three recurrent units, each responsible for one POS, and which is trained by jointly predicting the correct captions and the POS for each word.

Graph Generation Object +4

Monocular Reconstruction of Neural Face Reflectance Fields

no code implementations CVPR 2021 Mallikarjun B R., Ayush Tewari, Tae-Hyun Oh, Tim Weyrich, Bernd Bickel, Hans-Peter Seidel, Hanspeter Pfister, Wojciech Matusik, Mohamed Elgharib, Christian Theobalt

The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self shadowing.

Monocular Reconstruction

Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels

no code implementations 18 Mar 2020 Donghyun Kim, Kuniaki Saito, Tae-Hyun Oh, Bryan A. Plummer, Stan Sclaroff, Kate Saenko

We show that when labeled source examples are limited, existing methods often fail to learn discriminative features applicable for both source and target domains.

Self-Supervised Learning Unsupervised Domain Adaptation

Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach

no code implementations IJCNLP 2019 Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, In So Kweon

To this end, our proposed semi-supervised learning method assigns pseudo-labels to unpaired samples via Generative Adversarial Networks to learn the joint distribution of image and caption.

Image Captioning

Neural Inverse Knitting: From Images to Manufacturing Instructions

1 code implementation 7 Feb 2019 Alexandre Kaspar, Tae-Hyun Oh, Liane Makatura, Petr Kellnhofer, Jacqueline Aslarus, Wojciech Matusik

Motivated by the recent potential of mass customization brought by whole-garment knitting machines, we introduce the new problem of automatic machine instruction generation using a single image of the desired physical product, which we apply to machine knitting.

Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion

no code implementations 27 Nov 2018 Suwon Shon, Tae-Hyun Oh, James Glass

In this paper, we present a multi-modal online person verification system using both speech and visual signals.

Globally Optimal Inlier Set Maximization for Atlanta Frame Estimation

no code implementations CVPR 2018 Kyungdon Joo, Tae-Hyun Oh, In So Kweon, Jean-Charles Bazin

In this work, we describe man-made structures via an appropriate structure assumption, called Atlanta world, which contains a vertical direction (typically the gravity direction) and a set of horizontal directions orthogonal to the vertical direction.

On Learning Associations of Faces and Voices

1 code implementation 15 May 2018 Changil Kim, Hijung Valentina Shin, Tae-Hyun Oh, Alexandre Kaspar, Mohamed Elgharib, Wojciech Matusik

We computationally model the overlapping information between faces and voices and show that the learned cross-modal representation contains enough information to identify matching faces and voices with performance similar to that of humans.

Speaker Identification

Learning-based Video Motion Magnification

2 code implementations ECCV 2018 Tae-Hyun Oh, Ronnachai Jaroensri, Changil Kim, Mohamed Elgharib, Frédo Durand, William T. Freeman, Wojciech Matusik

We show that the learned filters achieve high-quality results on real videos, with fewer ringing artifacts and better noise characteristics than previous methods.

Motion Magnification

Learning to Localize Sound Source in Visual Scenes

no code implementations CVPR 2018 Arda Senocak, Tae-Hyun Oh, Junsik Kim, Ming-Hsuan Yang, In So Kweon

We show that even with a small amount of supervision, false conclusions can be corrected and the source of sound in a visual scene can be localized effectively.

Gradient-based Camera Exposure Control for Outdoor Mobile Platforms

no code implementations 24 Aug 2017 Inwook Shim, Tae-Hyun Oh, Joon-Young Lee, Jinwook Choi, Dong-Geol Choi, In So Kweon

We introduce a novel method to automatically adjust camera exposure for image processing and computer vision applications on mobile robot platforms.

Pedestrian Detection Stereo Matching +2

Contextually Customized Video Summaries via Natural Language

no code implementations 6 Feb 2017 Jinsoo Choi, Tae-Hyun Oh, In So Kweon

Despite the challenging baselines, our method still achieves comparable or even superior performance.

A Pseudo-Bayesian Algorithm for Robust PCA

no code implementations NeurIPS 2016 Tae-Hyun Oh, Yasuyuki Matsushita, In So Kweon, David Wipf

Commonly used in many applications, robust PCA represents an algorithmic attempt to reduce the sensitivity of classical PCA to outliers.

Globally Optimal Manhattan Frame Estimation in Real-Time

no code implementations CVPR 2016 Kyungdon Joo, Tae-Hyun Oh, Junsik Kim, In So Kweon

Given a set of surface normals, we pose a Manhattan Frame (MF) estimation problem as a consensus set maximization that maximizes the number of inliers over the rotation search space.

Video Stabilization

Video-Story Composition via Plot Analysis

no code implementations CVPR 2016 Jinsoo Choi, Tae-Hyun Oh, In So Kweon

Inspired by plot analysis of written stories, our method generates a sequence of video clips ordered in such a way that it reflects plot dynamics and content coherency.

Optical Flow Estimation Patch Matching

Robust and Globally Optimal Manhattan Frame Estimation in Near Real Time

no code implementations 12 May 2016 Kyungdon Joo, Tae-Hyun Oh, Junsik Kim, In So Kweon

Most man-made environments, such as urban and indoor scenes, consist of a set of parallel and orthogonal planar structures.

Clustering Video Stabilization

Human Attention Estimation for Natural Images: An Automatic Gaze Refinement Approach

no code implementations 12 Jan 2016 Jinsoo Choi, Tae-Hyun Oh, In So Kweon

Photo collections and their applications today attempt to reflect user interactions in various forms.

Gaze Estimation

Pseudo-Bayesian Robust PCA: Algorithms and Analyses

no code implementations 7 Dec 2015 Tae-Hyun Oh, Yasuyuki Matsushita, In So Kweon, David Wipf

Commonly used in computer vision and other applications, robust PCA represents an algorithmic attempt to reduce the sensitivity of classical PCA to outliers.

Matrix Completion

Fast Randomized Singular Value Thresholding for Low-rank Optimization

no code implementations 1 Sep 2015 Tae-Hyun Oh, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon

The problems related to NNM, or WNNM, can be solved iteratively by applying a closed-form proximal operator, called Singular Value Thresholding (SVT), or Weighted SVT, but they suffer from the high computational cost of Singular Value Decomposition (SVD) at each iteration.

Clustering

Fast Randomized Singular Value Thresholding for Nuclear Norm Minimization

no code implementations CVPR 2015 Tae-Hyun Oh, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon

The problems related to NNM (or WNNM) can be solved iteratively by applying a closed-form proximal operator, called Singular Value Thresholding (SVT) (or Weighted SVT), but they suffer from high computational cost to compute a Singular Value Decomposition (SVD) at each iteration.

Clustering

Partial Sum Minimization of Singular Values in Robust PCA: Algorithm and Applications

no code implementations 4 Mar 2015 Tae-Hyun Oh, Yu-Wing Tai, Jean-Charles Bazin, Hyeongwoo Kim, In So Kweon

Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering underlying low-rank structure of clean data corrupted with sparse noise/outliers.

Edge Detection
