no code implementations • 21 Sep 2023 • Elena Shushkevich, Long Mai, Manuel V. Loureiro, Steven Derby, Tri Kurniawan Wijaya
With the proliferation of news media outlets, intelligent systems that detect redundant information in news articles have become increasingly prevalent as a means of enhancing the user experience.
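As an illustrative sketch (not the paper's method), redundancy detection can be framed as flagging an incoming article whose similarity to any already-seen article exceeds a threshold; here a simple bag-of-words cosine similarity stands in for a learned model, and all names and the threshold are hypothetical.

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def is_redundant(article: str, seen: list[str], threshold: float = 0.8) -> bool:
    """Flag an article as redundant if it is too similar to any seen article."""
    return any(cosine_sim(article, s) >= threshold for s in seen)
```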
no code implementations • 2 Sep 2023 • Hanshu Yan, Jun Hao Liew, Long Mai, Shanchuan Lin, Jiashi Feng
The flexibility of these techniques enables the editing of arbitrary regions within the frame.
no code implementations • 19 Jul 2023 • Long Mai, Julie Carson-Berndsen
The integration of natural language processing (NLP) technologies into educational applications has shown promising results, particularly in the language learning domain.
no code implementations • 24 Sep 2022 • Long Mai, Julie Carson-Berndsen
The transcription quality of automatic speech recognition (ASR) systems degrades significantly when transcribing audio from unseen domains.
Automatic Speech Recognition (ASR) +3
no code implementations • CVPR 2022 • Long Mai, Feng Liu
The model is trained end-to-end on a video to jointly determine the phase-shift values at each time with the mapping from the phase-shifted sinusoidal functions to the corresponding frame, enabling an implicit video representation.
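A tiny stand-in for the idea: fixed frequencies, learnable phase shifts, and a learnable linear map from the phase-shifted sinusoids to pixel values. The dimensions and random weights below are illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: K sinusoidal features, H x W grayscale frames.
K, H, W = 8, 4, 4
freqs = 2.0 ** np.arange(K)            # fixed frequency per feature
phases = rng.uniform(0, 2 * np.pi, K)  # "learned" phase shifts (random stand-ins)
weights = rng.normal(size=(K, H * W))  # "learned" map from features to pixels

def decode_frame(t: float) -> np.ndarray:
    """Map a continuous time t to a frame via phase-shifted sinusoidal features."""
    feats = np.sin(2 * np.pi * freqs * t + phases)  # (K,)
    return (feats @ weights).reshape(H, W)
```

Because t is continuous, the representation can be queried between the original frame times, which is what makes it implicit.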
1 code implementation • 22 Oct 2021 • Thang M. Pham, Trung Bui, Long Mai, Anh Nguyen
We find two reasons why IM is not better than LOO: (1) deleting a single word from the input only marginally reduces a classifier's accuracy; and (2) a highly predictable word is always given near-zero attribution, regardless of its true importance to the classifier.
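The leave-one-out (LOO) baseline referenced here can be sketched in a few lines: delete each word in turn and record the drop in the classifier's score. The toy "classifier" below is a stand-in for a real model, purely for illustration.

```python
def score(tokens: list[str]) -> float:
    """Toy sentiment classifier: fraction of positive words (stand-in for a real model)."""
    positive = {"good", "great", "excellent"}
    return sum(t in positive for t in tokens) / max(len(tokens), 1)

def leave_one_out(tokens: list[str]) -> list[float]:
    """LOO attribution: score drop when each word is deleted from the input."""
    base = score(tokens)
    return [base - score(tokens[:i] + tokens[i + 1:]) for i in range(len(tokens))]
```

Note how deleting a single uninformative word changes the score only marginally, which is exactly the weakness the snippet describes.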
1 code implementation • 15 Jun 2021 • Alexander Black, Tu Bui, Long Mai, Hailin Jin, John Collomosse
We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects.
1 code implementation • 3 Jun 2021 • Juan Leon Alcazar, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem, Fabian Caba Heilbron
To showcase the potential of our new dataset, we propose an audiovisual baseline and benchmark for person retrieval.
1 code implementation • CVPR 2021 • S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağız Aksoy
Neural networks have shown great abilities in estimating depth from a single image.
Ranked #1 on Monocular Depth Estimation on IBims-1
no code implementations • Findings (ACL) 2021 • Thang M. Pham, Trung Bui, Long Mai, Anh Nguyen
Encouraging classifiers to capture word order information improves the performance on most GLUE tasks, SQuAD 2.0, and out-of-sample data.
Natural Language Inference • Natural Language Understanding +2
1 code implementation • CVPR 2021 • Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen
Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot recover accurate 3D scene shape: the shift-invariant reconstruction losses used in mixed-data depth prediction training induce an unknown depth shift, and the camera focal length may also be unknown.
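The depth-shift ambiguity mentioned here is commonly resolved by aligning a prediction to reference depth with a least-squares scale and shift; this is a standard alignment sketch, not the paper's specific recovery module.

```python
import numpy as np

def align_scale_shift(pred: np.ndarray, gt: np.ndarray) -> np.ndarray:
    """Recover the unknown scale/shift of a depth prediction by
    least squares against reference depth values."""
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt.ravel(), rcond=None)
    return s * pred + t
```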
Ranked #1 on Indoor Monocular Depth Estimation on DIODE (using extra training data)
no code implementations • 2 Nov 2020 • Simon Niklaus, Long Mai, Oliver Wang
Video frame interpolation, the synthesis of novel views in time, is an increasingly popular research direction with many new papers further advancing the state of the art.
1 code implementation • CVPR 2020 • Juan Leon Alcazar, Fabian Caba Heilbron, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem
Current methods for active speaker detection focus on modeling short-term audiovisual information from a single speaker.
no code implementations • CVPR 2020 • Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille
In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.
1 code implementation • NeurIPS 2020 • Thu Nguyen-Phuoc, Christian Richardt, Long Mai, Yong-Liang Yang, Niloy Mitra
Our experiments show that using explicit 3D features to represent objects allows BlockGAN to learn disentangled representations both in terms of objects (foreground and background) and their properties (pose and identity).
no code implementations • 10 Oct 2019 • Qi Li, Long Mai, Michael A. Alcorn, Anh Nguyen
Large, pre-trained generative models have been increasingly popular and useful to both the research and wider communities.
1 code implementation • ICCV 2019 • Haotian Zhang, Long Mai, Ning Xu, Zhaowen Wang, John Collomosse, Hailin Jin
We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images.
4 code implementations • 12 Sep 2019 • Simon Niklaus, Long Mai, Jimei Yang, Feng Liu
According to this depth estimate, our framework then maps the input image to a point cloud and synthesizes the resulting video frames by rendering the point cloud from the corresponding camera positions.
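The image-to-point-cloud step can be sketched with a standard pinhole unprojection; the intrinsics (focal length f, principal point cx, cy) are assumed inputs here, not values from the paper.

```python
import numpy as np

def unproject(depth: np.ndarray, f: float, cx: float, cy: float) -> np.ndarray:
    """Lift a depth map to a 3D point cloud under a pinhole camera model."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]          # pixel coordinates
    x = (u - cx) * depth / f
    y = (v - cy) * depth / f
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)
```

Rendering those points from displaced camera positions then yields the synthesized video frames.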
Ranked #4 on Depth Estimation on NYU-Depth V2
no code implementations • 3 Apr 2019 • Peng Zhou, Long Mai, Jianming Zhang, Ning Xu, Zuxuan Wu, Larry S. Davis
Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots.
1 code implementation • CVPR 2019 • Michael A. Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, Anh Nguyen
Using our framework and a self-assembled dataset of 3D objects, we investigate the vulnerability of DNNs to OoD poses of well-known objects in ImageNet.
no code implementations • ECCV 2018 • Hoang Le, Long Mai, Brian Price, Scott Cohen, Hailin Jin, Feng Liu
Instead of relying on pre-defined low-level image features, our method adaptively predicts object boundaries according to image content and user interactions.
6 code implementations • ICCV 2017 • Simon Niklaus, Long Mai, Feng Liu
Our method develops a deep fully convolutional neural network that takes two input frames and estimates pairs of 1D kernels for all pixels simultaneously.
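A minimal sketch of applying per-pixel separable kernels once they have been estimated (the estimation network itself is omitted; shapes, tap count, and padding mode are illustrative assumptions):

```python
import numpy as np

def sepconv(frame: np.ndarray, kv: np.ndarray, kh: np.ndarray, n: int) -> np.ndarray:
    """Per-pixel separable convolution: kv[i, j] and kh[i, j] hold the
    n-tap vertical/horizontal kernel pair for output pixel (i, j)."""
    h, w = frame.shape
    pad = n // 2
    padded = np.pad(frame, pad, mode="edge")
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + n, j:j + n]
            # Outer product of the two 1D kernels applied as kv^T @ patch @ kh.
            out[i, j] = kv[i, j] @ patch @ kh[i, j]
    return out
```

Estimating a pair of 1D kernels instead of a full 2D kernel reduces the per-pixel parameter count from n² to 2n.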
Ranked #8 on Video Frame Interpolation on Middlebury
no code implementations • CVPR 2017 • Long Mai, Hailin Jin, Zhe Lin, Chen Fang, Jonathan Brandt, Feng Liu
We train a convolutional neural network to synthesize appropriate visual features that capture the spatial-semantic constraints from the user canvas query.
1 code implementation • CVPR 2017 • Simon Niklaus, Long Mai, Feng Liu
Video frame interpolation typically involves two steps: motion estimation and pixel synthesis.
no code implementations • CVPR 2016 • Long Mai, Hailin Jin, Feng Liu
Deep convolutional neural network (ConvNet) methods have recently shown promising results for aesthetics assessment.
Ranked #6 on Aesthetics Quality Assessment on AVA
no code implementations • CVPR 2015 • Long Mai, Feng Liu
This Gaussian Conditional Random Fields-based kernel fusion method models not only how individual kernels are fused at each kernel element but also how kernel fusion interacts across multiple kernel elements.
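A heavily simplified stand-in for the element-wise part of the idea: each entry of the fused kernel gets its own mixing weights over the input kernels. The full Gaussian CRF formulation additionally models interactions between elements, which this sketch does not.

```python
import numpy as np

def fuse_kernels(kernels: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Element-wise kernel fusion: kernels is (m, n, n) similarity kernels,
    weights is (m, n, n) unnormalized mixing scores, softmax-normalized over m."""
    w = np.exp(weights)
    w /= w.sum(axis=0, keepdims=True)
    return (w * kernels).sum(axis=0)
```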
no code implementations • CVPR 2013 • Long Mai, Yuzhen Niu, Feng Liu
Our idea is to use data-driven approaches to saliency aggregation that appropriately consider the performance gaps among individual methods and the performance dependence of each method on individual images.
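In skeleton form, aggregation is a pixel-wise weighted combination of the individual saliency maps; in a data-driven scheme the weights would be predicted per image by a learned model, whereas here they are plain inputs for illustration.

```python
import numpy as np

def aggregate_saliency(maps: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Weighted aggregation of m saliency maps (shape (m, h, w)) with
    per-method weights (shape (m,)), normalized to sum to one."""
    w = weights / weights.sum()
    return np.tensordot(w, maps, axes=1)
```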