no code implementations • 18 Jan 2024 • Li Sun, Liuan Wang, Jun Sun, Takayuki Okatani
This study introduces an innovative method to address event-level hallucinations in MLLMs, focusing on specific temporal understanding in video content.
1 code implementation • 7 Nov 2023 • Xiangyong Lu, Masanori Suganuma, Takayuki Okatani
For the first time, it achieves an ImageNet-1K top-1 accuracy of around 80% at a speed of 1.0 frame/sec on the SBC.
1 code implementation • 7 Oct 2023 • Korawat Charoenpitaks, Van-Quang Nguyen, Masanori Suganuma, Masahiro Takahashi, Ryoma Niihara, Takayuki Okatani
To enable research in this understudied area, a new dataset named the DHPR (Driving Hazard Prediction and Reasoning) dataset is created.
no code implementations • 6 Jul 2023 • Han Zou, Masanori Suganuma, Takayuki Okatani
We can utilize an alternative shot of the same scene, as in video deblurring, or even employ a distinct image from another scene.
no code implementations • 6 Jul 2023 • Han Zou, Masanori Suganuma, Takayuki Okatani
Then, we propose an improved method, RefVSR++, which aggregates two feature streams in parallel along the temporal direction: one aggregating the fused LR and Ref inputs, and the other aggregating the Ref inputs over time.
Reference-based Video Super-Resolution • Video Super-Resolution
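The parallel two-stream temporal aggregation described above can be pictured with a minimal sketch; the module below is purely illustrative (layer names, shapes, and update rules are assumptions), not the RefVSR++ architecture.

```python
# Minimal sketch (not the paper's architecture): two recurrent feature streams
# propagated in parallel over time, one carrying fused LR+Ref features and one
# carrying Ref features alone, merged at every step.
import torch
import torch.nn as nn

class ParallelTemporalAggregator(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.fuse_lr_ref = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.update_main = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.update_ref = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.merge = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, lr_feats, ref_feats):
        # lr_feats, ref_feats: lists of (B, C, H, W) feature maps over time
        b, c, h, w = lr_feats[0].shape
        h_main = lr_feats[0].new_zeros(b, c, h, w)   # stream 1: fused LR + Ref
        h_ref = lr_feats[0].new_zeros(b, c, h, w)    # stream 2: Ref only
        outputs = []
        for lr_t, ref_t in zip(lr_feats, ref_feats):
            fused_t = self.fuse_lr_ref(torch.cat([lr_t, ref_t], dim=1))
            h_main = torch.relu(self.update_main(torch.cat([h_main, fused_t], dim=1)))
            h_ref = torch.relu(self.update_ref(torch.cat([h_ref, ref_t], dim=1)))
            outputs.append(self.merge(torch.cat([h_main, h_ref], dim=1)))
        return outputs
```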
no code implementations • 6 Jul 2023 • Jie Zhang, Masanori Suganuma, Takayuki Okatani
The local student, which is used in previous studies, mainly focuses on structural anomaly detection, while the global student attends to logical anomalies.
Ranked #13 on Anomaly Detection on MVTec LOCO AD
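As a rough illustration of the local/global student idea, the sketch below scores anomalies by the feature discrepancy between a frozen teacher and two students; all networks and names here are placeholders, not the paper's models.

```python
# Generic student-teacher anomaly scoring sketch (illustrative only): a frozen
# teacher is regressed by a "local" student and a "global" student; the
# per-pixel discrepancy serves as the anomaly map.
import torch
import torch.nn as nn

def make_encoder(out_ch: int = 64) -> nn.Module:
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, out_ch, 3, padding=1),
    )

teacher = make_encoder().eval()      # frozen; pretrained in practice
local_student = make_encoder()       # small receptive field -> structural anomalies
global_student = make_encoder()      # global context -> logical anomalies

def anomaly_map(image: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        t = teacher(image)
    d_local = ((local_student(image) - t) ** 2).mean(dim=1)
    d_global = ((global_student(image) - t) ** 2).mean(dim=1)
    return d_local + d_global        # (B, H, W) anomaly score map

score = anomaly_map(torch.rand(1, 3, 256, 256))
```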
no code implementations • 6 Jul 2023 • Jie Zhang, Masanori Suganuma, Takayuki Okatani
They consider an unsupervised setting, specifically the one-class setting, in which we assume the availability of a set of normal (i.e., anomaly-free) images for training.
no code implementations • 18 Feb 2023 • Tatsuro Yamane, Pang-jo Chun, Ji Dang, Takayuki Okatani
To this end, a VQA model was developed that uses bridge images to create a dataset and, given an image and a question, outputs the name of the damage or member and whether it is present.
no code implementations • 23 Dec 2022 • Wenzheng Song, Ran Yan, Boshu Lei, Takayuki Okatani
In this study, we present a novel method called SuperGF, which effectively unifies local and global features for visual localization, achieving a better trade-off between localization accuracy and computational efficiency.
no code implementations • 20 Jul 2022 • Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani
In this paper, we first point out that the recent studies' formalization of OSOD, which generalizes open-set recognition (OSR) and thus considers an unlimited variety of unknown objects, has a fundamental issue.
2 code implementations • 20 Jul 2022 • Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
Current state-of-the-art methods for image captioning employ region-based features, as they provide object-level information that is essential to describe the content of images; they are usually extracted by an object detector such as Faster R-CNN.
Ranked #8 on Image Captioning on nocaps in-domain
no code implementations • 7 Jul 2022 • Qian Ye, Masanori Suganuma, Takayuki Okatani
Considering the spatially variant nature of defocus blur and the blur level indicated in the defocus map, we employ the defocus map as conditional guidance to adjust the features from the input blurred images, instead of simply concatenating them.
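A minimal sketch of what map-conditioned feature modulation (as opposed to concatenation) can look like, assuming a FiLM-style per-pixel scale and shift; this is illustrative and not the paper's exact design.

```python
# Illustrative map-conditioned feature modulation: the defocus map predicts a
# per-pixel scale and shift applied to the image features.
import torch
import torch.nn as nn

class DefocusGuidedModulation(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.to_scale = nn.Conv2d(1, channels, 3, padding=1)
        self.to_shift = nn.Conv2d(1, channels, 3, padding=1)

    def forward(self, feats: torch.Tensor, defocus_map: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) features of the blurred input
        # defocus_map: (B, 1, H, W) estimated per-pixel blur level
        scale = self.to_scale(defocus_map)
        shift = self.to_shift(defocus_map)
        return feats * (1.0 + scale) + shift

mod = DefocusGuidedModulation(64)
out = mod(torch.rand(2, 64, 128, 128), torch.rand(2, 1, 128, 128))
```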
no code implementations • 6 Jul 2022 • Qian Ye, Masanori Suganuma, Jun Xiao, Takayuki Okatani
Reconstructing ghosting-free high dynamic range (HDR) images of dynamic scenes from a set of multi-exposure images is a challenging task, especially when there is large object motion or occlusion, which leads to visible artifacts with existing methods.
2 code implementations • 30 Jun 2022 • Zhijie Wang, Masanori Suganuma, Takayuki Okatani
Due to the high annotation cost of semantic segmentation, researchers have developed many UDA methods for this task, which assume no labeled samples are available in the target domain.
no code implementations • CVPR 2022 • Shuang Liu, Takayuki Okatani
We then propose a network design for the actor and the critic to inherently attain these symmetries.
no code implementations • 17 Dec 2021 • Shuang Liu, Takayuki Okatani
We then propose a network design for the actor and the critic to inherently attain these symmetries.
no code implementations • 14 Sep 2021 • Zhijie Wang, Masanori Suganuma, Takayuki Okatani
This study is concerned with few-shot segmentation, i.e., segmenting the region of an unseen object class in a query image, given support image(s) of its instances.
no code implementations • 14 Sep 2021 • Zhijie Wang, Xing Liu, Masanori Suganuma, Takayuki Okatani
To cope with this, we propose a method that applies adversarial training to align two feature distributions in the target domain.
Ranked #1 on Domain Adaptation on Synscapes-to-Cityscapes
no code implementations • ICCV 2021 • Wenzheng Song, Masanori Suganuma, Xing Liu, Noriyuki Shimobayashi, Daisuke Maruta, Takayuki Okatani
To examine whether and how well we can utilize such information stored in RAW-format images for image matching, we have created a new dataset named MID (matching in the dark).
1 code implementation • 19 Aug 2021 • Qian Ye, Jun Xiao, Kin-Man Lam, Takayuki Okatani
We propose a novel method that can better fuse the features based on two ideas.
1 code implementation • 1 Jun 2021 • Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
It then integrates the prediction with the visual and other information, yielding the final prediction of an action and an object.
no code implementations • 9 Jan 2021 • Liang Xu, Taro Hatsutani, Xing Liu, Engkarat Techapanurak, Han Zou, Takayuki Okatani
We experimentally show that this makes it possible to detect cracks from images at one-third the resolution of those used for annotation, with about the same accuracy.
no code implementations • 7 Jan 2021 • Engkarat Techapanurak, Anh-Chuong Dang, Takayuki Okatani
We estimate where the samples generated by a single image transformation lie between ID and OOD, using a network trained on clean ID samples.
no code implementations • 7 Jan 2021 • Engkarat Techapanurak, Takayuki Okatani
We reconsider the evaluation of OOD detection methods for image recognition.
no code implementations • ICCV 2021 • Tetsuya Tanaka, Yukihiro Sasagawa, Takayuki Okatani
Bundle adjustment (BA) occupies a large portion of SfM and visual SLAM's total execution time.
no code implementations • 7 May 2020 • Rito Murase, Masanori Suganuma, Takayuki Okatani
We draw a mixed conclusion from the experimental results: positional encoding certainly works in some cases, but the absolute image position may not be as important for segmentation tasks as is commonly thought.
1 code implementation • ECCV 2020 • Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
It has been a primary concern in recent studies of vision and language tasks to design an effective attention mechanism dealing with interactions between the two modalities.
Ranked #7 on Visual Dialog on Visual Dialog v1.0 test-std
no code implementations • 20 Nov 2019 • Junjie Hu, Takayuki Okatani
However, the prediction of saliency maps is itself vulnerable to the attacks, even though it is not the direct target of the attacks.
no code implementations • 21 Oct 2019 • Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani
The employment of convolutional neural networks has led to significant performance improvement on the task of object detection.
1 code implementation • 10 Jul 2019 • Xing Liu, Masanori Suganuma, Xiyang Luo, Takayuki Okatani
The employment of convolutional neural networks has achieved unprecedented performance in the task of image restoration for a variety of degradation factors.
no code implementations • 30 May 2019 • Xing Liu, Takayuki Okatani
There is another type of task in which the target of prediction is human perception itself, for which there are often individual differences.
1 code implementation • 25 May 2019 • Engkarat Techapanurak, Masanori Suganuma, Takayuki Okatani
The ability to detect out-of-distribution (OOD) samples is vital to secure the reliability of deep neural networks in real-world applications.
1 code implementation • 14 May 2019 • Mingzhen Shao, Zhun Sun, Mete Ozay, Takayuki Okatani
We address the problem of estimating the pose of a person's head from an RGB image.
1 code implementation • ICCV 2019 • Junjie Hu, Yan Zhang, Takayuki Okatani
We formulate it as an optimization problem of identifying the smallest number of image pixels from which the CNN can estimate a depth map with the minimum difference from the estimate from the entire image.
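A simplified sketch of this mask-optimization idea, assuming a placeholder depth network and a soft per-pixel mask with a sparsity penalty; the paper's formulation may differ.

```python
# Simplified sketch: learn a sparse per-pixel mask so that depth predicted
# from the masked image stays close to depth predicted from the full image.
# The tiny "depth_net" is a placeholder for a trained monocular depth CNN.
import torch
import torch.nn as nn

depth_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 1, 3, padding=1)).eval()

image = torch.rand(1, 3, 64, 64)
with torch.no_grad():
    full_depth = depth_net(image)

mask_logits = torch.zeros(1, 1, 64, 64, requires_grad=True)
opt = torch.optim.Adam([mask_logits], lr=0.1)
sparsity_weight = 0.05   # fidelity vs. sparsity trade-off (illustrative value)

for _ in range(200):
    mask = torch.sigmoid(mask_logits)
    masked_depth = depth_net(image * mask)
    loss = (masked_depth - full_depth).abs().mean() + sparsity_weight * mask.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```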
1 code implementation • CVPR 2019 • Xing Liu, Masanori Suganuma, Zhun Sun, Takayuki Okatani
In this paper, we study the design of deep neural networks for image restoration tasks.
no code implementations • 15 Jan 2019 • Pongsate Tangseng, Takayuki Okatani
For this purpose, we propose a method for quantifying how influential each feature of each item is to the score.
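As one possible illustration of quantifying per-feature influence on a scalar score, the snippet below uses gradient-times-input attribution with a toy scorer; the paper's actual quantification method may differ.

```python
# Gradient x input attribution: how much each feature of each item moves the score.
import torch
import torch.nn as nn

scorer = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

item_features = torch.rand(5, 8, requires_grad=True)   # 5 items, 8 features each
score = scorer(item_features).sum()
score.backward()
influence = (item_features * item_features.grad).detach()  # (items, features)
```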
no code implementations • CVPR 2019 • Duy-Kien Nguyen, Takayuki Okatani
The representation is hierarchical, and prediction for each task is computed from the representation at its corresponding level of the hierarchy.
1 code implementation • CVPR 2019 • Masanori Suganuma, Xing Liu, Takayuki Okatani
There are many different types of distortion which affect image quality.
no code implementations • CVPR 2018 • Zhun Sun, Mete Ozay, Yan Zhang, Xing Liu, Takayuki Okatani
In this work, we address the problem of improving robustness of convolutional neural networks (CNNs) to image distortion.
no code implementations • 26 Apr 2018 • Pongsate Tangseng, Kota Yamaguchi, Takayuki Okatani
We consider grading a fashion outfit for recommendation, where we assume that users have a closet of items and we aim at producing a score for an arbitrary combination of items in the closet.
1 code implementation • CVPR 2018 • Duy-Kien Nguyen, Takayuki Okatani
A key solution to visual question answering (VQA) exists in how to fuse visual and language features extracted from an input image and question.
4 code implementations • 23 Mar 2018 • Junjie Hu, Mete Ozay, Yan Zhang, Takayuki Okatani
Experimental results show that these two improvements enable us to attain higher accuracy than the current state of the art, owing to finer-resolution reconstruction, for example of small objects and object boundaries.
Ranked #60 on Monocular Depth Estimation on NYU-Depth V2
1 code implementation • ICML 2018 • Masanori Suganuma, Mete Ozay, Takayuki Okatani
Researchers have applied deep neural networks to image restoration tasks, in which they proposed various network architectures, loss functions, and training methods.
1 code implementation • 24 Jan 2018 • Fazil Altinel, Mete Ozay, Takayuki Okatani
In this paper, we propose a structured image inpainting method employing an energy-based model.
no code implementations • 12 Dec 2017 • Shuang Liu, Mete Ozay, Takayuki Okatani, Hongli Xu, Kai Sun, Yang Lin
In the experiments, we first evaluate the performance of the proposed detection module on UDID and its deformed variations.
no code implementations • 6 Nov 2017 • Zhun Sun, Mete Ozay, Takayuki Okatani
This problem has been addressed by employing several defense methods that detect and reject particular types of attacks.
no code implementations • 6 Aug 2017 • Kota Yamaguchi, Takayuki Okatani, Takayuki Umeda, Kazuhiko Murasaki, Kyoko Sudo
We present a structured inference approach in deep neural networks for multiple attribute prediction.
no code implementations • 25 Jul 2017 • Zhun Sun, Mete Ozay, Takayuki Okatani
In this work, we address the problem of improving the robustness, to image deformation, of feature representations learned using convolutional neural networks (CNNs).
no code implementations • 25 Jul 2017 • Zhun Sun, Mete Ozay, Takayuki Okatani
We develop a novel method for training GANs for unsupervised and class-conditional image generation, called Linear Discriminant GAN (LD-GAN).
no code implementations • 14 Jun 2017 • Yan Zhang, Mete Ozay, Zhun Sun, Takayuki Okatani
In order to estimate the entropy of the encoding variables and the mutual information, we propose a non-parametric method.
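For reference, a standard non-parametric entropy estimator (Kozachenko-Leonenko, k-nearest neighbors) looks like the sketch below; the estimator proposed in the paper may well differ.

```python
# Kozachenko-Leonenko k-NN entropy estimator for continuous variables,
# shown for illustration of non-parametric entropy estimation.
import numpy as np
from scipy.special import digamma, gammaln

def knn_entropy(x: np.ndarray, k: int = 3) -> float:
    # x: (N, d) samples of a continuous random variable
    n, d = x.shape
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    eps = np.sort(dists, axis=1)[:, k - 1]          # distance to k-th neighbor
    log_ball_volume = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return (digamma(n) - digamma(k) + log_ball_volume
            + d * np.mean(np.log(eps + 1e-12)))

h = knn_entropy(np.random.randn(500, 2))   # entropy estimate for a 2-D Gaussian
```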
1 code implementation • ICCV 2017 • Yan Zhang, Mete Ozay, Shuo-Hao Li, Takayuki Okatani
By employing the proposed architecture on a baseline wide network, we can construct and train a new network with the same depth but a considerably smaller number of parameters.
no code implementations • 22 Jan 2017 • Mete Ozay, Takayuki Okatani
The results show that geometric adaptive step size computation methods of G-SGD can improve training loss and convergence properties of CNNs.
no code implementations • CVPR 2017 • Eisuke Ito, Takayuki Okatani
In this paper we consider critical motion sequences (CMSs) of rolling-shutter (RS) SfM.
no code implementations • 22 Oct 2016 • Mete Ozay, Takayuki Okatani
Following our theoretical results, we propose an SGD algorithm that is assured to converge almost surely to a solution at a single minimum of the classification loss of CNNs.
1 code implementation • 25 Jul 2016 • Sirion Vittayakorn, Takayuki Umeda, Kazuhiko Murasaki, Kyoko Sudo, Takayuki Okatani, Kota Yamaguchi
This paper proposes an automatic approach to discover and analyze visual attributes from a noisy collection of image-text data on the Web.
1 code implementation • 30 Nov 2015 • Zhun Sun, Mete Ozay, Takayuki Okatani
Despite the effectiveness of Convolutional Neural Networks (CNNs) for image classification, our understanding of the relationship between shape of convolution kernels and learned representations is limited.
no code implementations • 20 Nov 2015 • Yan Zhang, Mete Ozay, Xing Liu, Takayuki Okatani
We propose a method for material recognition that integrates features extracted from deep representations of Convolutional Neural Networks (CNNs), each of which is learned on a different image dataset of objects and materials.
no code implementations • CVPR 2015 • Masaki Saito, Takayuki Okatani
Although downsizing MRFs should directly reduce the computational cost, there is no systematic way of doing this, since it is unclear how to obtain the MRF energy for the downsized MRFs and also how to translate the estimates of their marginal distributions to those of the original MRFs.
no code implementations • CVPR 2013 • Ken Sakurada, Takayuki Okatani, Koichiro Deguchi
The proposed method is compared with the methods that use multi-view stereo (MVS) to reconstruct the scene structures of the two time points and then differentiate them to detect changes.
no code implementations • CVPR 2013 • Masaki Saito, Takayuki Okatani, Koichiro Deguchi
In this paper, we show a novel formulation for this continuous-discrete conversion.