Search Results for author: Hirokatsu Kataoka

Found 48 papers, 19 papers with code

Primitive Geometry Segment Pre-training for 3D Medical Image Segmentation

1 code implementation 8 Jan 2024 Ryu Tadokoro, Ryosuke Yamada, Kodai Nakashima, Ryo Nakamura, Hirokatsu Kataoka

From the experimental results, we conclude that effective pre-training can be achieved by looking at primitive geometric objects only.

Image Segmentation Medical Image Segmentation +3

Traffic Incident Database with Multiple Labels Including Various Perspective Environmental Information

1 code implementation 17 Dec 2023 Shota Nishiyama, Takuma Saito, Ryo Nakamura, Go Ohtani, Hirokatsu Kataoka, Kensho Hara

Our proposed dataset aims to improve the performance of traffic accident recognition by annotating ten types of environmental information as teacher labels in addition to the presence or absence of traffic accidents.
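
As a rough illustration of the label design described above, the hedged sketch below treats accident presence plus ten environmental attributes as one multi-label target trained with a per-label binary cross-entropy loss; the backbone, label names, and tensor shapes are assumptions, not the released code.

```python
# Hypothetical sketch: accident presence plus ten environmental attribute labels
# trained as a single multi-label target with per-label BCE.
import torch
import torch.nn as nn

NUM_ENV_LABELS = 10                    # e.g. weather, road type, time of day (assumed)
NUM_OUTPUTS = 1 + NUM_ENV_LABELS       # accident presence + environmental attributes

backbone = nn.Sequential(              # stand-in for a video feature extractor
    nn.Conv3d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool3d(1),
    nn.Flatten(),
)
head = nn.Linear(32, NUM_OUTPUTS)
criterion = nn.BCEWithLogitsLoss()

clip = torch.randn(4, 3, 16, 112, 112)                        # batch of short video clips
targets = torch.randint(0, 2, (4, NUM_OUTPUTS)).float()       # 0/1 teacher labels

logits = head(backbone(clip))
loss = criterion(logits, targets)
loss.backward()
```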

Leveraging Image-Text Similarity and Caption Modification for the DataComp Challenge: Filtering Track and BYOD Track

no code implementations 23 Oct 2023 Shuhei Yokoo, Peifei Zhu, Yuchi Ishikawa, Mikihiro Tanaka, Masayoshi Kondo, Hirokatsu Kataoka

Our solution adopts the large multimodal models CLIP and BLIP-2 to filter and modify web-crawled data, and utilizes external datasets along with a bag of tricks to improve data quality.

Text Similarity
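
Below is a minimal sketch of the image-text similarity filtering that underpins the solution above, assuming the Hugging Face `transformers` CLIP implementation; the model choice and threshold are illustrative, and the BLIP-2 caption-modification step is not reproduced.

```python
# Keep an (image, caption) pair only if its CLIP cosine similarity is high enough.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def keep_pair(image: Image.Image, caption: str, threshold: float = 0.25) -> bool:
    inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
        txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                          attention_mask=inputs["attention_mask"])
    sim = torch.nn.functional.cosine_similarity(img_emb, txt_emb).item()
    return sim >= threshold
```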

Constructing Image-Text Pair Dataset from Books

no code implementations 3 Oct 2023 Yamato Okamoto, Haruto Toyonaga, Yoshihisa Ijiri, Hirokatsu Kataoka

Digital archiving is becoming widespread owing to its effectiveness in protecting valuable books and providing knowledge to many people electronically.

Optical Character Recognition (OCR) Retrieval +1

SegRCDB: Semantic Segmentation via Formula-Driven Supervised Learning

1 code implementation ICCV 2023 Risa Shinoda, Ryo Hayamizu, Kodai Nakashima, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka

SegRCDB has a high potential to contribute to semantic segmentation pre-training and investigation by enabling the creation of large datasets without manual annotation.

Segmentation Semantic Segmentation

Diffusion-based Holistic Texture Rectification and Synthesis

no code implementations 26 Sep 2023 Guoqing Hao, Satoshi Iizuka, Kensho Hara, Edgar Simo-Serra, Hirokatsu Kataoka, Kazuhiro Fukui

We present a novel framework for rectifying occlusions and distortions in degraded texture samples from natural images.

Texture Synthesis

Exploring Limits of Diffusion-Synthetic Training with Weakly Supervised Semantic Segmentation

no code implementations 4 Sep 2023 Ryota Yoshihashi, Yuya Otsuka, Kenji Doi, Tomohiro Tanaka, Hirokatsu Kataoka

The advance of generative models for images has inspired various training techniques for image recognition utilizing synthetic images.

Data Augmentation Image Generation +5

Pre-training Vision Transformers with Very Limited Synthesized Images

1 code implementation ICCV 2023 Ryo Nakamura, Hirokatsu Kataoka, Sora Takashima, Edgar Josafat Martinez Noriega, Rio Yokota, Nakamasa Inoue

Prior work on FDSL has shown that pre-training vision transformers on such synthetic datasets can yield competitive accuracy on a wide range of downstream tasks.

Data Augmentation

Pre-Training Auto-Generated Volumetric Shapes for 3D Medical Image Segmentation

1 code implementation CVPR Workshop 2023 Ryu Tadokoro, Ryosuke Yamada, Hirokatsu Kataoka

Inspired by this approach, we propose the Auto-generated Volumetric Shapes Database (AVS-DB) for data-scarce 3D medical image segmentation tasks.

Image Segmentation Medical Image Segmentation +3

Scapegoat Generation for Privacy Protection from Deepfake

no code implementations 6 Mar 2023 Gido Kato, Yoshihiro Fukuhara, Mariko Isogawa, Hideki Tsunashima, Hirokatsu Kataoka, Shigeo Morishima

To protect privacy and prevent malicious use of deepfakes, current studies propose methods that interfere with the generation process, such as detection and destruction approaches.

Face Swapping

Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves

no code implementations CVPR 2023 Sora Takashima, Ryo Hayamizu, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota

Unlike JFT-300M which is a static dataset, the quality of synthetic datasets will continue to improve, and the current work is a testament to this possibility.

Frequency-aware GAN for Adversarial Manipulation Generation

no code implementations ICCV 2023 Peifei Zhu, Genki Osada, Hirokatsu Kataoka, Tsubasa Takahashi

We observe that existing spatial attacks cause large degradation in image quality, and find that the loss of high-frequency detail components might be the major reason.

Adversarial Attack Image Manipulation
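
The toy snippet below illustrates, under assumed settings, the high-frequency observation above: it measures how much spectral energy beyond an arbitrary cutoff a perturbation changes, using a simple FFT radial mask. It is a diagnostic sketch, not the paper's frequency-aware GAN.

```python
# Compare high-frequency spectral energy of a clean image and a perturbed one.
import torch

def high_freq_energy(img: torch.Tensor, cutoff: float = 0.25) -> torch.Tensor:
    """Energy of spectral components farther than `cutoff` (relative) from DC."""
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    h, w = img.shape[-2:]
    yy, xx = torch.meshgrid(torch.linspace(-0.5, 0.5, h),
                            torch.linspace(-0.5, 0.5, w), indexing="ij")
    mask = (yy ** 2 + xx ** 2).sqrt() > cutoff       # keep only high frequencies
    return (spec.abs() ** 2 * mask).sum()

clean = torch.rand(1, 3, 64, 64)
attacked = clean + 0.03 * torch.randn_like(clean)     # stand-in for a spatial attack
print(high_freq_energy(clean), high_freq_energy(attacked))
```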

Graph Representation for Order-Aware Visual Transformation

no code implementations CVPR 2023 Yue Qiu, Yanjun Sun, Fumiya Matsuzawa, Kenji Iwata, Hirokatsu Kataoka

This paper proposes a new visual reasoning formulation that aims at discovering changes between image pairs and their temporal orders.

Visual Reasoning

Neural Density-Distance Fields

1 code implementation 29 Jul 2022 Itsuki Ueda, Yoshihiro Fukuhara, Hirokatsu Kataoka, Hiroaki Aizawa, Hidehiko Shishido, Itaru Kitahara

However, it is difficult to achieve high localization performance with density-field-based methods such as Neural Radiance Fields (NeRF) alone, since they do not provide density gradients in most empty regions.

Novel View Synthesis Visual Localization
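
The following 1-D toy sketch illustrates the limitation noted above: a density field is flat (zero spatial gradient) almost everywhere in empty space, whereas a distance field keeps an informative gradient everywhere. The field definitions are illustrative assumptions, not the paper's network.

```python
# Spatial gradients of a toy density field vs. a toy distance field along one axis.
import torch

x = torch.linspace(-1.0, 1.0, 200, requires_grad=True)
surface = 0.3                                                # location of a "wall"

density = 50.0 * torch.exp(-((x - surface) / 0.01) ** 2)     # sharp bump at the surface
distance = (x - surface).abs()                               # distance-to-surface field

g_density, = torch.autograd.grad(density.sum(), x)
g_distance, = torch.autograd.grad(distance.sum(), x)

print(g_density[:80].abs().max())    # ~0: no gradient signal far from the surface
print(g_distance[:80].abs().max())   # 1: distance field guides localization everywhere
```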

Replacing Labeled Real-image Datasets with Auto-generated Contours

no code implementations CVPR 2022 Hirokatsu Kataoka, Ryo Hayamizu, Ryosuke Yamada, Kodai Nakashima, Sora Takashima, Xinyu Zhang, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Rio Yokota

In the present work, we show that the performance of formula-driven supervised learning (FDSL) can match or even exceed that of ImageNet-21k without the use of real images, human-, and self-supervision during the pre-training of Vision Transformers (ViTs).

Community-Driven Comprehensive Scientific Paper Summarization: Insight from cvpaper.challenge

no code implementations 17 Mar 2022 Shintaro Yamamoto, Hirokatsu Kataoka, Ryota Suzuki, Seitaro Shinagawa, Shigeo Morishima

To alleviate this problem, we organized a group of non-native English speakers to write summaries of papers presented at a computer vision conference and share the knowledge of the papers the group had read.

Describing and Localizing Multiple Changes with Transformers

2 code implementations ICCV 2021 Yue Qiu, Shintaro Yamamoto, Kodai Nakashima, Ryota Suzuki, Kenji Iwata, Hirokatsu Kataoka, Yutaka Satoh

Change captioning tasks aim to detect changes in image pairs observed before and after a scene change and generate a natural language description of the changes.

Can Vision Transformers Learn without Natural Images?

1 code implementation 24 Mar 2021 Kodai Nakashima, Hirokatsu Kataoka, Asato Matsumoto, Kenji Iwata, Nakamasa Inoue

Moreover, although the ViT pre-trained without natural images produces visualizations that differ somewhat from those of an ImageNet pre-trained ViT, it can interpret natural image datasets to a large extent.

Fairness Self-Supervised Learning

Pre-training without Natural Images

2 code implementations 21 Jan 2021 Hirokatsu Kataoka, Kazushige Okayasu, Asato Matsumoto, Eisuke Yamagata, Ryosuke Yamada, Nakamasa Inoue, Akio Nakamura, Yutaka Satoh

Is it possible to use convolutional neural networks pre-trained without any natural images to assist natural image understanding?

Initialization Using Perlin Noise for Training Networks with a Limited Amount of Data

no code implementations 19 Jan 2021 Nakamasa Inoue, Eisuke Yamagata, Hirokatsu Kataoka

Our main idea is to initialize the network parameters by solving an artificial noise classification problem, where the aim is to classify Perlin noise samples into their noise categories.

Classification General Classification +1
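
A hedged sketch of the initialization recipe above: pre-train a small network to classify artificial noise samples into their generation categories, then reuse the weights. A blocky value-noise generator stands in for true Perlin noise here, and the architecture and category definition (base grid frequency) are assumptions.

```python
# Pre-train on an artificial noise-classification task, then reuse the weights.
import numpy as np
import torch
import torch.nn as nn

def value_noise(freq: int, size: int = 64) -> np.ndarray:
    """Cheap stand-in for Perlin noise: a random grid upsampled into blocks."""
    grid = np.random.rand(freq, freq)
    return np.kron(grid, np.ones((size // freq, size // freq)))

freqs = [2, 4, 8, 16]                      # noise categories (base grid frequency)
xs, ys = [], []
for label, f in enumerate(freqs):
    for _ in range(32):
        xs.append(value_noise(f))
        ys.append(label)
x = torch.tensor(np.stack(xs), dtype=torch.float32).unsqueeze(1)
y = torch.tensor(ys)

net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, len(freqs)))
opt = torch.optim.SGD(net.parameters(), lr=0.1)
for _ in range(5):                         # short pre-training loop
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(x), y)
    loss.backward()
    opt.step()
# net's weights can now initialize a network trained on a limited downstream dataset.
```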

Alleviating Over-segmentation Errors by Detecting Action Boundaries

2 code implementations 14 Jul 2020 Yuchi Ishikawa, Seito Kasai, Yoshimitsu Aoki, Hirokatsu Kataoka

Our model architecture consists of a long-term feature extractor and two branches: the Action Segmentation Branch (ASB) and the Boundary Regression Branch (BRB).

Action Classification Action Segmentation +2
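
A hedged PyTorch sketch of the two-branch layout described above: a shared dilated temporal-convolution feature extractor feeding (i) a frame-wise Action Segmentation Branch and (ii) a per-frame Boundary Regression Branch. Layer sizes and the dilation pattern are assumptions, not the released ASRF code.

```python
# Shared long-term feature extractor with segmentation and boundary heads.
import torch
import torch.nn as nn

class TwoBranchSegmenter(nn.Module):
    def __init__(self, in_dim=2048, hidden=64, num_classes=19):
        super().__init__()
        self.extractor = nn.Sequential(                     # long-term feature extractor
            nn.Conv1d(in_dim, hidden, 1),
            nn.Conv1d(hidden, hidden, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, 3, padding=4, dilation=4), nn.ReLU(),
        )
        self.asb = nn.Conv1d(hidden, num_classes, 1)        # Action Segmentation Branch
        self.brb = nn.Conv1d(hidden, 1, 1)                  # Boundary Regression Branch

    def forward(self, feats):                               # feats: (B, in_dim, T)
        h = self.extractor(feats)
        return self.asb(h), torch.sigmoid(self.brb(h))      # class logits, boundary prob

model = TwoBranchSegmenter()
logits, boundaries = model(torch.randn(1, 2048, 300))
```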

Retrieving and Highlighting Action with Spatiotemporal Reference

1 code implementation 19 May 2020 Seito Kasai, Yuchi Ishikawa, Masaki Hayashi, Yoshimitsu Aoki, Kensho Hara, Hirokatsu Kataoka

In this paper, we present a framework that jointly retrieves and spatiotemporally highlights actions in videos by enhancing current deep cross-modal retrieval methods.

Action Recognition Cross-Modal Retrieval +5
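
A minimal sketch of the cross-modal retrieval step assumed above: rank video clip embeddings by cosine similarity to a text-query embedding in a shared space. The encoders are placeholders, and the paper's spatiotemporal highlighting is not reproduced.

```python
# Rank clips in a gallery by cosine similarity to a text query embedding.
import torch
import torch.nn.functional as F

text_emb = F.normalize(torch.randn(1, 256), dim=-1)      # query embedding (placeholder)
video_embs = F.normalize(torch.randn(100, 256), dim=-1)  # gallery of clip embeddings

scores = video_embs @ text_emb.t()                        # cosine similarities
top5 = scores.squeeze(1).topk(5).indices                  # best-matching clips
print(top5.tolist())
```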

Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?

10 code implementations 10 Apr 2020 Hirokatsu Kataoka, Tenga Wakamiya, Kensho Hara, Yutaka Satoh

Therefore, in the present paper, we conduct an exploration study in order to improve spatiotemporal 3D CNNs as follows: (i) Recently proposed large-scale video datasets help improve spatiotemporal 3D CNNs in terms of video classification accuracy.

General Classification Open-Ended Question Answering +2

Weakly Supervised Dataset Collection for Robust Person Detection

1 code implementation 27 Mar 2020 Munetaka Minoguchi, Ken Okayama, Yutaka Satoh, Hirokatsu Kataoka

To construct an algorithm that can provide robust person detection, we present a dataset with over 8 million images that was produced in a weakly supervised manner.

Human Detection

Augmented Cyclic Consistency Regularization for Unpaired Image-to-Image Translation

no code implementations 29 Feb 2020 Takehiko Ohkawa, Naoto Inoue, Hirokatsu Kataoka, Nakamasa Inoue

Herein, we propose Augmented Cyclic Consistency Regularization (ACCR), a novel regularization method for unpaired I2I translation.

Data Augmentation Image-to-Image Translation +1

Learning with Protection: Rejection of Suspicious Samples under Adversarial Environment

no code implementations 25 Sep 2019 Masahiro Kato, Yoshihiro Fukuhara, Hirokatsu Kataoka, Shigeo Morishima

Our main idea is to apply a framework of learning with rejection and adversarial examples to assist in the decision making for such suspicious samples.

BIG-bench Machine Learning Binary Classification +3
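
A toy sketch of the rejection rule this framework builds on: abstain on inputs whose predictive confidence falls below a threshold so that suspicious samples are handed off rather than classified. The threshold and logits are illustrative assumptions.

```python
# Reject low-confidence (possibly adversarial) samples instead of classifying them.
import torch
import torch.nn.functional as F

def predict_with_rejection(logits: torch.Tensor, threshold: float = 0.9):
    probs = F.softmax(logits, dim=-1)
    conf, pred = probs.max(dim=-1)
    pred = pred.clone()
    pred[conf < threshold] = -1          # -1 marks "rejected / suspicious"
    return pred

logits = torch.tensor([[4.0, 0.1, 0.2],   # confident sample
                       [1.0, 0.9, 1.1]])  # ambiguous sample -> rejected
print(predict_with_rejection(logits))     # tensor([ 0, -1])
```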

Automatic Paper Summary Generation from Visual and Textual Information

no code implementations 16 Nov 2018 Shintaro Yamamoto, Yoshihiro Fukuhara, Ryota Suzuki, Shigeo Morishima, Hirokatsu Kataoka

Due to the recent boom in artificial intelligence (AI) research, including computer vision (CV), it has become impossible for researchers in these fields to keep up with the exponentially increasing number of manuscripts.

Sentence

Understanding Fake Faces

no code implementations 22 Sep 2018 Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka

Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now filling the gap between human-level and computer-driven performance levels in face verification algorithms.

Face Recognition Face Verification

Neural Joking Machine: Humorous image captioning

no code implementations 30 May 2018 Kota Yoshida, Munetaka Minoguchi, Kenichiro Wani, Akio Nakamura, Hirokatsu Kataoka

In the present paper, in order to consider this question from an academic standpoint, we use a computer to generate image captions that draw a "laugh".

Image Captioning

Anticipating Traffic Accidents with Adaptive Loss and Large-scale Incident DB

no code implementations CVPR 2018 Tomoyuki Suzuki, Hirokatsu Kataoka, Yoshimitsu Aoki, Yutaka Satoh

In this paper, we propose a novel approach for traffic accident anticipation through (i) Adaptive Loss for Early Anticipation (AdaLEA) and (ii) a large-scale self-annotated incident database for anticipation.

Accident Anticipation
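
As a very rough illustration of an early-anticipation loss (not the adaptive AdaLEA schedule itself), the sketch below weights per-frame cross-entropy by an exponential factor in the time remaining until the accident, so frames nearer the accident contribute more; all names and values are assumptions.

```python
# Exponentially time-weighted anticipation loss for a positive (accident) video.
import torch
import torch.nn.functional as F

def early_anticipation_loss(logits, labels, time_to_accident, penalty=20.0):
    """logits/labels: (T,), time_to_accident: frames remaining until the accident (T,)."""
    ce = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    weight = torch.where(labels > 0,
                         torch.exp(-time_to_accident / penalty),  # grows as accident nears
                         torch.ones_like(ce))
    return (weight * ce).mean()

T, accident_frame = 100, 60
logits = torch.randn(T)                                   # per-frame accident scores
labels = torch.ones(T)                                    # every frame of a positive video
tta = torch.clamp(torch.tensor(float(accident_frame)) - torch.arange(T).float(), min=0.0)
print(early_anticipation_loss(logits, labels, tta))
```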

Drive Video Analysis for the Detection of Traffic Near-Miss Incidents

no code implementations 7 Apr 2018 Hirokatsu Kataoka, Teppei Suzuki, Shoko Oikawa, Yasuhiro Matsui, Yutaka Satoh

Because of their recent introduction, self-driving cars and vehicles equipped with advanced driver assistance systems (ADAS) have had little opportunity to learn the dangerous traffic scenarios (including near-miss incidents) that provide normal drivers with strong motivation to drive safely.

Self-Driving Cars

Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

26 code implementations CVPR 2018 Kensho Hara, Hirokatsu Kataoka, Yutaka Satoh

The purpose of this study is to determine whether current video datasets have sufficient data for training very deep convolutional neural networks (CNNs) with spatio-temporal three-dimensional (3D) kernels.

Action Recognition
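
The hedged block below shows the basic building unit the study scales up: a residual block whose 3x3x3 kernels convolve jointly over time and space. Channel sizes are illustrative and this is not the authors' released 3D ResNet code.

```python
# Basic residual block with spatio-temporal 3D kernels.
import torch
import torch.nn as nn

class BasicBlock3D(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm3d(channels)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm3d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                       # x: (B, C, T, H, W)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)               # residual connection

clip = torch.randn(2, 64, 16, 56, 56)           # batch, channels, frames, height, width
print(BasicBlock3D()(clip).shape)
```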

Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition

1 code implementation 25 Aug 2017 Kensho Hara, Hirokatsu Kataoka, Yutaka Satoh

The 3D ResNets trained on the Kinetics dataset did not suffer from overfitting despite their large number of parameters, and achieved better performance than relatively shallow networks such as C3D.

Action Recognition Hand-Gesture Recognition +1

Collaborative Descriptors: Convolutional Maps for Preprocessing

no code implementations 10 May 2017 Hirokatsu Kataoka, Kaori Abe, Akio Nakamura, Yutaka Satoh

The paper presents a novel concept for collaborative descriptors between deeply learned and hand-crafted features.

Object Recognition

Motion Representation with Acceleration Images

no code implementations 30 Aug 2016 Hirokatsu Kataoka, Yun He, Soma Shirakabe, Yutaka Satoh

Information from temporal differentiation is an extremely important cue for motion representation.

Optical Flow Estimation
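
A minimal sketch of the acceleration-image idea: take the temporal derivative one step beyond optical flow by differencing consecutive flow fields; flow estimation itself is mocked with random arrays here.

```python
# Acceleration image as the difference of consecutive optical-flow fields.
import numpy as np

flow_t = np.random.randn(240, 320, 2).astype(np.float32)    # flow between frames t, t+1
flow_t1 = np.random.randn(240, 320, 2).astype(np.float32)   # flow between frames t+1, t+2

acceleration = flow_t1 - flow_t                  # second-order temporal cue per pixel (dx, dy)
magnitude = np.linalg.norm(acceleration, axis=-1)            # acceleration "image"
print(magnitude.shape)                           # (240, 320)
```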

Dominant Codewords Selection with Topic Model for Action Recognition

no code implementations 1 May 2016 Hirokatsu Kataoka, Masaki Hayashi, Kenji Iwata, Yutaka Satoh, Yoshimitsu Aoki, Slobodan Ilic

Latent Dirichlet allocation (LDA) is used to develop approximations of human motion primitives; these are mid-level representations, and they adaptively integrate dominant vectors when classifying human activities.

Action Recognition Temporal Action Localization
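
A hedged sketch of the LDA step described above: fit topics over bag-of-codewords histograms of video segments, then read off each topic's dominant codewords as mid-level motion primitives. Data sizes are synthetic placeholders.

```python
# Fit LDA topics over codeword histograms and extract dominant codewords per topic.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

num_videos, vocab_size = 200, 500
counts = np.random.randint(0, 5, size=(num_videos, vocab_size))   # codeword histograms

lda = LatentDirichletAllocation(n_components=10, random_state=0)
doc_topics = lda.fit_transform(counts)                # per-video topic mixtures

top_codewords = lda.components_.argsort(axis=1)[:, -10:]          # dominant codewords
print(doc_topics.shape, top_codewords.shape)          # (200, 10) (10, 10)
```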

Feature Evaluation of Deep Convolutional Neural Networks for Object Recognition and Detection

no code implementations 25 Sep 2015 Hirokatsu Kataoka, Kenji Iwata, Yutaka Satoh

In this paper, we evaluate convolutional neural network (CNN) features using the AlexNet architecture and very deep convolutional network (VGGNet) architecture.

General Classification Object Recognition
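
The sketch below mirrors the evaluation recipe implied above: extract fixed convolutional features from pre-trained AlexNet and VGG backbones for comparison on downstream tasks. Passing weights by name requires a recent torchvision, and the input batch is a placeholder.

```python
# Extract fixed conv features from pre-trained AlexNet and VGG-16 backbones.
import torch
from torchvision import models

alexnet = models.alexnet(weights="DEFAULT").eval()
vgg16 = models.vgg16(weights="DEFAULT").eval()

images = torch.randn(8, 3, 224, 224)                  # placeholder batch
with torch.no_grad():
    feat_alex = alexnet.features(images).flatten(1)   # conv-feature vectors
    feat_vgg = vgg16.features(images).flatten(1)

print(feat_alex.shape, feat_vgg.shape)                # (8, 9216) and (8, 25088)
```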
