A stack of 2D grayscale images from a 3D X-ray computed tomography (XCT) scan of a glass fiber-reinforced polyamide 66 (GF-PA66) specimen.
Details about the creation of the dataset can be found at https://arxiv.org/abs/2110.06139.
Gait3D-Parsing is a dataset for gait recognition in the wild. It extends the large-scale and challenging Gait3D dataset, which was collected in an in-the-wild environment. The train set has 3,000 IDs and the test set has 1,000 IDs; 1,000 sequences in the test set serve as the query set, and the rest of the test set serves as the gallery set.
Habitat-Matterport 3D Semantics Dataset (HM3D-Semantics v0.1) is the largest-ever dataset of semantically-annotated 3D indoor spaces. It contains dense semantic annotations for 120 high-resolution 3D scenes from the Habitat-Matterport 3D dataset. The HM3D scenes are annotated with 1,700+ raw object names, which are mapped to the 40 Matterport categories. On average, each scene in HM3D-Semantics v0.1 consists of 646 objects from 114 categories.
HPointLoc is a dataset designed for exploring the capabilities of visual place recognition in indoor environments and loop detection in simultaneous localization and mapping (SLAM). It is based on the popular Habitat simulator, uses 49 photorealistic indoor scenes from the Matterport3D dataset, and contains 76,000 frames.
The dataset comprises 2,886 patches in total (2 m GSD), of which 1,732 are for training and 1,154 for testing. The patch size varies (depending on the agricultural parcels) and is on average around 60x60 pixels. Each patch contains 150 contiguous hyperspectral bands (462-942 nm, with a spectral resolution of 3.2 nm), which reflects the spectral range of the hyperspectral imaging sensor deployed on board Intuition-1.
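The stated band count and spectral range imply a simple band-to-wavelength mapping. The sketch below reconstructs approximate band-center wavelengths under the assumption of a uniform 3.2 nm spacing starting at 462 nm; it is an illustration, not an official calibration.

```python
import numpy as np

# Approximate band-center wavelengths for the 150 bands described above.
# Assumes uniform 3.2 nm spacing from 462 nm (illustrative, not a
# calibration file shipped with the dataset).
n_bands = 150
start_nm, step_nm = 462.0, 3.2
wavelengths = start_nm + step_nm * np.arange(n_bands)

# 149 steps of 3.2 nm end near 938.8 nm, consistent with the stated
# 462-942 nm range once band widths are taken into account.
print(wavelengths.size, round(float(wavelengths[-1]), 1))
```

A per-band wavelength axis like this is typically all that is needed to plot a patch's spectrum against nanometers rather than band indices.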
A high-quality indoor monocular depth estimation dataset with a focus on performance variation across space types.
The dataset captures a robot interacting with 5.1 cm colored blocks to complete an order-fulfillment-style block-stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets, with nearly 12,000 stacking attempts and over 2 million frames of real data.
The L-CAS 3D Point Cloud People Dataset contains 28,002 Velodyne scan frames acquired in one of the main buildings (the Minerva Building) of the University of Lincoln, UK. The total length of the recorded data is about 49 minutes. Data were grouped into two classes according to whether the robot was stationary or moving.
LiPC (LiDAR Point Cloud Clustering Benchmark Suite) is a benchmark suite for point cloud clustering algorithms based on open-source software and open datasets. It aims to provide the community with a collection of methods and datasets that are easy to use and comparable, and whose experimental results are traceable and reproducible.
The Multi-Person Interaction Motion (MI-Motion) dataset includes skeleton sequences of multiple individuals collected with motion capture systems and refined and synthesized using a game engine. It contains 167k frames of interacting people's skeleton poses, categorized into 5 different activity scenes.
MTNeuro is a multi-task neuroimaging benchmark built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions.
Minecraft Segmentation is a segmentation dataset for Minecraft houses that adds semantic segmentation labels for sub-components of each house. There are 2,050 houses in total and 1,038 distinct sub-component labels.
A mouse brain MRI atlas (both in vivo and ex vivo); the repository has been relocated from its original webpage.
The NCANDA consortium is composed of an Administrative component at the University of California San Diego, a Data Analysis and Informatics component at SRI International, and five research sites (University of California San Diego, SRI International, Duke University, the University of Pittsburgh, and the Oregon Health & Science University). A sample of 831 individuals (ages 12-21) was recruited for the study across the five research sites. The enrolled participants are followed in an accelerated longitudinal design that involves structural and functional imaging of the brain along with extensive neuropsychological and clinical assessments.
Neural fields (NeFs) have recently emerged as a versatile method for modeling signals of various modalities, including images, shapes, and scenes. Subsequently, many works have explored the use of NeFs as representations for downstream tasks, e.g. classifying an image based on the parameters of a NeF that has been fit to it. However, the impact of the NeF hyperparameters on their quality as downstream representation is scarcely understood and remains largely unexplored. This is partly caused by the large amount of time required to fit datasets of neural fields.
OCTScenes contains 5,000 tabletop scenes featuring a total of 15 everyday objects. Each scene is captured in 60 frames covering a 360-degree perspective.
PaintNet is a dataset for learning robotic spray painting of free-form 3D objects. PaintNet includes more than 800 object meshes and the associated painting strokes collected in a real industrial setting.
The PointDenoisingBenchmark dataset features 28 different shapes, split into 18 training shapes and 10 test shapes.
PolyU-BPCoMa: A Dataset and Benchmark Towards Mobile Colorized Mapping Using a Backpack Multisensorial System
PoseScript is a dataset that pairs a few thousand 3D human poses from AMASS with rich human-annotated descriptions of the body parts and their spatial relationships. This dataset is designed for the retrieval of relevant poses from large-scale datasets and synthetic pose generation, both based on a textual pose description.
The Robot Tracking Benchmark (RTB) is a synthetic dataset that facilitates the quantitative evaluation of 3D tracking algorithms for multi-body objects. It was created using the procedural rendering pipeline BlenderProc. The dataset contains photo-realistic sequences with HDRi lighting and physically-based materials. Perfect ground-truth annotations for camera and robot trajectories are provided in the BOP format. Many physical effects, such as motion blur, rolling shutter, and camera shaking, are accurately modeled to reflect real-world conditions. For each frame, four depth qualities exist to simulate sensors with different characteristics. While the first quality provides perfect ground truth, the second considers measurements with the distance-dependent noise characteristics of the Azure Kinect time-of-flight sensor. Finally, for the third and fourth quality, two stereo RGB images with and without a pattern from a simulated dot projector were rendered. Depth images were then reconstructed from these stereo images.
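Distance-dependent depth noise of the kind mentioned above can be mimicked with a small sketch. The quadratic growth of the noise standard deviation with distance, and the constants used, are illustrative assumptions, not the dataset's actual Azure Kinect noise model.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_depth_noise(depth_m, base_sigma=0.001, k=0.002):
    """Add zero-mean Gaussian noise whose std grows with distance.

    base_sigma and k are illustrative constants, not calibrated
    Azure Kinect parameters: sigma = base_sigma + k * depth**2.
    """
    sigma = base_sigma + k * depth_m**2
    return depth_m + rng.normal(0.0, sigma)

depth = np.full((4, 4), 2.0)       # a flat surface 2 m from the sensor
noisy = add_depth_noise(depth)     # perturbed depth map, same shape
```

Real time-of-flight noise models also include axial/lateral components and invalid pixels; this sketch only shows the distance-dependent scaling idea.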
A total of 227 cross-sectional images (20 x 54 mm, at a resolution of 289 x 648 pixels) of hind-leg xenograft tumors from 29 mice were obtained with 1 mm step-wise movement of the array mounted on a manual positioning device. The whole tumor volume was acquired using a diagnostic ultrasound system with a 10 MHz linear transducer and 50 MHz sampling.
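The stated field of view, pixel grid, and slice step imply an anisotropic voxel spacing; the arithmetic below derives it from the numbers in the description (an illustration, not metadata shipped with the data).

```python
# In-plane pixel spacing implied by the description above:
# a 20 x 54 mm field of view sampled on a 289 x 648 pixel grid,
# with 1 mm steps between consecutive slices.
fov_mm = (20.0, 54.0)
grid_px = (289, 648)
slice_step_mm = 1.0

spacing_mm = tuple(f / p for f, p in zip(fov_mm, grid_px))
voxel_mm = (*spacing_mm, slice_step_mm)
print(tuple(round(v, 3) for v in voxel_mm))  # roughly (0.069, 0.083, 1.0)
```

Knowing that the out-of-plane spacing is more than ten times the in-plane spacing matters when resampling such a stack into an isotropic volume.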
The ReplicaGrasp dataset is created by spawning objects from GRAB into the ReplicaCAD scenes, simulated in random positions and orientations using the Habitat simulator. It captures 4,800 instances, with 50 different objects spawned in one of 48 receptacles in both upright and randomly fallen orientations.
Robot@Home2 is an enhanced version of the Robot@Home dataset aimed at improving usability and functionality for developing and testing mobile robotics and computer vision algorithms. It consists of three main components. First, a relational database that states the contextual information and data links, compatible with Structured Query Language (SQL). Second, a Python package for managing the database, including downloading, querying, and interfacing functions. Finally, learning resources in the form of Jupyter notebooks, runnable locally or on the Google Colab platform, enabling users to explore the dataset without local installation. These freely available tools are expected to ease the exploitation of the Robot@Home dataset and accelerate research in computer vision and robotics.
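A relational layout like the one described above is queried with ordinary SQL. The miniature schema below is a hypothetical stand-in (the real table and column names come with the dataset and its Python package); it only illustrates the kind of contextual join the database is designed to support.

```python
import sqlite3

# Hypothetical miniature schema standing in for Robot@Home2's relational
# database; the actual schema differs and ships with the dataset.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE rooms (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE observations (id INTEGER PRIMARY KEY,
                           room_id INTEGER REFERENCES rooms(id),
                           sensor TEXT);
INSERT INTO rooms VALUES (1, 'kitchen'), (2, 'bedroom');
INSERT INTO observations VALUES (1, 1, 'rgbd'), (2, 1, 'laser'), (3, 2, 'rgbd');
""")

# Count observations per room: a typical contextual query.
rows = con.execute("""
    SELECT r.name, COUNT(o.id)
    FROM rooms r JOIN observations o ON o.room_id = r.id
    GROUP BY r.name ORDER BY r.name
""").fetchall()
print(rows)  # [('bedroom', 1), ('kitchen', 2)]
```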
Synthetic Actors and Real Actions (SARA) is a 3D motion dataset for training models to produce motion embeddings suitable for reasoning about motion similarity.
A Benchmark Dataset for Deep Learning-based Methods for 3D Topology Optimization.
Data for "Score-Based Generative Models for PET Image Reconstruction". All simulations are based on the BrainWeb dataset. The 2D image simulations were taken from Georg Schramm's BrainWeb simulation, while the 3D images were simulated using the BrainWeb package. The 2D measurements were simulated using pyParallelProj and the 3D measurements using SIRF (with the STIR backend).
SIDOD is a new, publicly-available image dataset generated by the NVIDIA Deep Learning Data Synthesizer intended for use in object detection, pose estimation, and tracking applications. This dataset contains 144k stereo image pairs that synthetically combine 18 camera viewpoints of three photorealistic virtual environments with up to 10 objects (chosen randomly from the 21 object models of the YCB dataset) and flying distractors.
Scan Entities in 3D (ScanEnts3D) is a large-scale dataset which provides explicit correspondences between 369k objects across 84k natural referential sentences, covering 705 real-world scenes.
This dataset accompanies the linked SerialTrack paper and provides test case data (2D/3D, varying particle density) across a range of synthetic and experimental imaging modalities. The included test cases can be used for further code development, validation of and comparison with existing particle tracking codes, and/or evaluating and learning to use our SerialTrack code on known data.
The ShapeNet-Skeleton dataset has ground-truth skeleton point sets and skeletal volumes for object instances in the ShapeNet dataset.
We present the first fine-grained dataset of 1,497 3D VR sketch and 3D shape pairs, collected from 50 participants, covering 1,005 chair shapes with large shape diversity from the ShapeNetCore dataset.
Dataset built from partial reconstructions of real-world indoor scenes using RGB-D sequences from ScanNet, aimed at estimating the unknown position of an object (e.g. where is the bag?) given a partial 3D scan of a scene. The dataset mostly consists of bedrooms, bathrooms, and living rooms. Some room types like closet and gym only have a few instances.
3D confocal stacks with corresponding 2D Light-field microscope images
The Store Dataset is a dataset for estimating 3D poses of multiple humans in real-time. It is captured inside two kinds of simulated stores with 12 and 28 cameras, respectively.
Super-CLEVR-3D is a visual question answering (VQA) dataset where the questions are about the explicit 3D configuration of the objects in images (i.e. 3D poses, parts, and occlusion). It consists of objects from 5 categories: aeroplanes, buses, bicycles, cars and motorbikes. The rendered objects are from the CGParts dataset, with the same settings as the Super-CLEVR dataset.
A synthetic dataset comprising three different environments for multi-camera dynamic novel view synthesis for soccer. The dataset is made compatible with Nerfstudio and includes data parsers with various settings to reproduce the setups of our paper "Dynamic NeRFs for Soccer Scenes" and more.
The ARPA-E-funded TERRA-REF project is generating open-access reference datasets for the study of plant sensing, genomics, and phenomics. Sensor data were generated by a field scanner sensing platform that captures color, thermal, hyperspectral, and active fluorescence imagery as well as three-dimensional structure and associated environmental measurements. This dataset is provided alongside data collected using traditional field methods in order to support calibration and validation of algorithms used to extract plot-level phenotypes from these datasets.
A dataset of paired thermal and RGB images comprising ten diverse scenes (six indoor and four outdoor) for 3D scene reconstruction and novel view synthesis (e.g. with NeRF).
The increasing use of deep learning techniques has reduced interpretation time and, ideally, reduced interpreter bias by automatically deriving geological maps from digital outcrop models. However, accurate validation of these automated mapping approaches is a significant challenge due to the subjective nature of geological mapping and the difficulty in collecting quantitative validation data. Additionally, many state-of-the-art deep learning methods are limited to 2D image data, which is insufficient for 3D digital outcrops, such as hyperclouds. To address these challenges, we present Tinto, a multi-sensor benchmark digital outcrop dataset designed to facilitate the development and validation of deep learning approaches for geological mapping, especially for non-structured 3D data like point clouds. Tinto comprises two complementary sets: 1) a real digital outcrop model from Corta Atalaya (Spain), with spectral attributes and ground-truth data, and 2) a synthetic twin that uses latent
A dataset made of 3D image data and their embeddings to test TomoSAM
UAV laser scanning data collected over a neotropical forest (Paracou, French Guiana). Four flights were conducted over a one-hectare plot in 2021 and 2022.
This dataset presents a vision and perception research dataset collected in Rome, featuring RGB data, 3D point clouds, IMU, and GPS data. We introduce a new benchmark targeting visual odometry and SLAM, to advance research in autonomous robotics and computer vision. This work complements existing datasets by simultaneously addressing several issues, such as environment diversity, motion patterns, and sensor frequency. It uses up-to-date devices and presents effective procedures to accurately calibrate the intrinsics and extrinsics of the sensors while addressing temporal synchronization. During recording, we cover multi-floor buildings, gardens, and urban and highway scenarios. Combining handheld and car-based data collections, our setup can simulate any robot (quadrupeds, quadrotors, autonomous vehicles). The dataset includes an accurate 6-DoF ground truth based on a novel methodology that refines the RTK-GPS estimate with LiDAR point clouds through Bundle Adjustment. All sequences divi
From https://github.com/MMintLab/VIRDO/blob/master/data/dataset_readme.txt: