Recorded with a Husky A200 wheeled UGV, the Vulpi 2021 dataset contains 13 min of Inertial Measurement Unit (IMU), motor current, and wheel odometry data, focusing on agricultural terrains. The dataset includes experiments on concrete, a dirt road, a ploughed terrain and an unploughed terrain that were all recorded on an experimental farm in San Cassiano, Lecce, Italy.
1 PAPER • NO BENCHMARKS YET
This private dataset was collected for the automatic analysis of psychological distress. It contains self-reported distress labels provided by human volunteers and consists of 30-min interview recordings of participants.
1 PAPER • 2 BENCHMARKS
This file contains the data and code for the publication "The Federal Reserve's Response to the Global Financial Crisis and Its Long-Term Impact: An Interrupted Time-Series Natural Experimental Analysis" by A. C. Kamkoum, 2023.
This dataset provides wireless measurements from two industrial testbeds: iV2V (industrial Vehicle-to-Vehicle) and iV2I+ (industrial Vehicular-to-Infrastructure plus sensor).
The Robo-VLN dataset is a continuous control formulation of the VLN-CE dataset by Krantz et al., which was ported from the Room-to-Room (R2R) dataset created by Anderson et al. Details on converting the discrete VLN dataset into a continuous control formulation can be found in our paper.
1 PAPER • 1 BENCHMARK
The uniD dataset is an innovative collection of naturalistic road user trajectories, captured within the RWTH Aachen University campus using drone technology to address common challenges such as occlusions found in traditional traffic data collection methods. It meticulously documents the movement and classifies each road user by type. Employing cutting-edge computer vision algorithms, the dataset ensures high positional accuracy. Its utility spans various applications, from predicting road user behavior and modeling driver actions to conducting scenario-based safety checks for automated driving systems and facilitating the data-driven design of Highly Automated Driving (HAD) system components.
This dataset contains the ground truth for urban changes that occurred in Mariupol, Ukraine, during 2017-2020. It is useful for transferring the urban change monitoring network ERCNN-DRS (https://github.com/It4innovations/ERCNN-DRS_urban_change_monitoring) to that region.
Bach chorales is a univariate time series dataset based on chorales, where the task is to learn a generative grammar. The dataset consists of single-line melodies of 100 Bach chorales (originally 4 voices). The melody line can be studied independently of other voices. The grand challenge is to learn a generative grammar for stylistically valid chorales.
0 PAPER • NO BENCHMARKS YET
Objective: This study introduces the BlendedICU dataset, a massive dataset of international intensive care data. This dataset aims to facilitate generalizability studies of machine learning models, as well as statistical studies of clinical practices in intensive care units.
The dataset comprises patches of size 512x512 pixels collected from the Sentinel-2 L2A satellite mission. All reported forest fires are located in California. For each area of interest, two images are provided: a pre-fire acquisition and a post-fire acquisition. Each image is composed of 12 different channels, collecting information from the visible spectrum, infrared and ultrablue.
A dataset with $23\,870$ digital trajectories (i.e. time series) of handwritten lower- and uppercase Latin letters and Arabic numbers ($a$-$z$, $A$-$Z$, $0$-$9$), generated by $77$ experts using a Wacom Pen Tablet. An expert is considered a proficient user of the recorded symbols, in this case adult native German speakers.
FHRMA is an open-source project for Fetal Heart Rate Morphological Analysis containing Matlab source code and datasets. As a sub-project, it includes a deep learning method and dataset for automatic identification of the maternal heart rate (MHR) and, more generally, false signals (FSs) on fetal heart rate (FHR) recordings. The challenge concerns particularly the FHR signal recorded with Doppler sensors, on which MHR interference and other FSs are particularly common, but the dataset also includes FHR recorded with scalp-ECG. The training and validation dataset contained 1030 expert-annotated periods (mean duration: 36 min) from 635 recordings. Labels consist of annotating each time sample as either 1: False signal; 0: True signal, or -1: do not know or irrelevant.
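The three-valued labelling described above (1, 0, -1) can be handled as in the following minimal Python sketch. It is only an illustration under stated assumptions: the function name `false_signal_rate` and the toy label sequence are hypothetical and are not part of the FHRMA code, which is written in Matlab. The sketch shows one common way to treat the -1 class, namely excluding it before computing any statistic.

```python
from typing import Sequence

# Label values as described in the dataset text.
FALSE_SIGNAL = 1   # sample is a false signal (e.g. maternal heart rate)
TRUE_SIGNAL = 0    # sample is a true fetal heart rate signal
UNKNOWN = -1       # annotator does not know, or the sample is irrelevant


def false_signal_rate(labels: Sequence[int]) -> float:
    """Fraction of false-signal samples among samples with a known label.

    Hypothetical helper: samples labelled -1 are ignored entirely.
    """
    known = [y for y in labels if y != UNKNOWN]
    if not known:
        raise ValueError("no labelled samples")
    return sum(1 for y in known if y == FALSE_SIGNAL) / len(known)


# Toy label sequence (invented for illustration, not taken from the dataset).
labels = [0, 0, 1, -1, 1, 0, -1, 0]
rate = false_signal_rate(labels)
```

The same masking idea would apply to a training loss or an evaluation metric: samples labelled -1 contribute nothing to either.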
ForeDeCk is a time series database compiled at the National Technical University of Athens that contains 900,000 continuous time series, built from multiple, diverse and publicly accessible sources. ForeDeCk emphasizes business forecasting applications, including series from relevant domains such as industries, services, tourism, imports & exports, demographics, education, labor & wage, government, households, bonds, stocks, insurances, loans, real estate, transportation, and natural resources & environment.
The medaka (Oryzias latipes) and the zebrafish (Danio rerio) are used as model organisms for a variety of subjects in biomedical research. The presented work aims to study the potential of automated ventricular dimension estimation through heart segmentation in medaka. For details, see our paper and the supplementary materials.
Mudestreda is a multimodal device state recognition dataset obtained from a real industrial milling device. It provides time series and image data for classification, regression, anomaly detection, Remaining Useful Life (RUL) estimation, signal drift measurement, zero-shot flank tool wear, and feature engineering purposes.
This study’s sample consists of seven corporations (BlackRock, Google, Meta, JPMorgan, Walgreens, Netflix, and PepsiCo) analyzed across seven quarters beginning in 2021. The data includes the implied volatility level (annualized) for the day before, the day of, and the day following the earnings report. This information was obtained from the Bloomberg Terminal dataset BVOL. The data we read from the terminal is based on Bloomberg’s algorithm for calculating the implied volatility for different strikes. The value is the same for both calls and puts, which makes comparisons and calculations more straightforward. The dataset contains a mixture of high-growth, high-risk technology corporations that saw strong market tailwinds during the previous year and steady, high-dividend-paying equities. For a more comprehensive conclusion, we analyze the implied volatility levels across three expirations to determine the influence of each expiration. The shortest maturity spans from 1 to 4 days, wh
PJM Hourly Energy Consumption Data: PJM Interconnection LLC (PJM) is a regional transmission organization (RTO) in the United States. It is part of the Eastern Interconnection grid, operating an electric transmission system serving all or parts of Delaware, Illinois, Indiana, Kentucky, Maryland, Michigan, New Jersey, North Carolina, Ohio, Pennsylvania, Tennessee, Virginia, West Virginia, and the District of Columbia.
SynD is a synthetic energy dataset with a focus on residential buildings. This dataset is the result of a custom simulation process that relies on power traces of household appliances. The output of the simulations is the power consumption of 21 household appliances as well as the household-wide consumption (i.e. mains). Therefore, SynD can be used for Non-Intrusive Load Monitoring, also referred to as Energy Disaggregation.
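To make the relationship between appliance traces and the mains signal concrete, here is a small Python sketch. It assumes, purely for illustration, that the household-wide consumption can be modelled as the sum of the individual appliance traces. The function name `aggregate_mains`, the appliance names, and the wattages are hypothetical and do not reflect SynD's actual file layout or simulation process. Energy disaggregation is the inverse task: recovering the per-appliance traces from the mains signal alone.

```python
from typing import Dict, List


def aggregate_mains(appliance_power: Dict[str, List[int]]) -> List[int]:
    """Sum per-appliance power traces (watts) into one household-wide trace.

    Assumption for this sketch: mains equals the sum of all appliance traces,
    and all traces are sampled at the same timestamps.
    """
    lengths = {len(trace) for trace in appliance_power.values()}
    if len(lengths) != 1:
        raise ValueError("all appliance traces must have the same length")
    return [sum(values) for values in zip(*appliance_power.values())]


# Invented toy traces, one value per timestep, in watts.
traces = {
    "fridge": [100, 100, 0, 100],
    "kettle": [0, 2000, 2000, 0],
    "tv": [50, 50, 50, 0],
}
mains = aggregate_mains(traces)
```

A disaggregation model would take `mains` as input and be scored on how closely its per-appliance estimates match `traces`.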
Human activity recognition and clinical biomechanics are challenging problems in physical telerehabilitation medicine. However, most publicly available datasets on human body movements cannot be used to study both problems in an out-of-the-lab movement acquisition setting. The objective of the VIDIMU dataset is to pave the way towards affordable patient tracking solutions for remote recognition of daily life activities and kinematic analysis.
X-Wines is a consistent wine dataset containing 100,646 instances and 21 million real evaluations carried out by users. Data were collected on the open Web in 2022 and pre-processed for wider free use. The evaluations are ratings on a 1–5 scale made over a period of 10 years (2012–2021) for wines produced in 62 different countries.
The Tufts fNIRS to Mental Workload (fNIRS2MW) open-access dataset is a new dataset for building machine learning classifiers that can consume a short window (30 seconds) of multivariate fNIRS recordings and predict the mental workload intensity of the user during that window.
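The idea of feeding a classifier short 30-second windows of a multivariate recording can be sketched as follows. This is a minimal illustration, not the dataset's own preprocessing: the function name `sliding_windows`, the 2 Hz sampling rate, the four channels, and the non-overlapping stride are all assumptions chosen to keep the example small.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, ...]  # one multivariate sample: one value per channel


def sliding_windows(
    recording: Sequence[Sample],
    sample_rate_hz: float,
    window_s: float = 30.0,
    stride_s: float = 30.0,
) -> List[Sequence[Sample]]:
    """Cut a recording into fixed-length windows; a trailing partial window is dropped."""
    win = int(round(window_s * sample_rate_hz))
    step = int(round(stride_s * sample_rate_hz))
    if win <= 0 or step <= 0:
        raise ValueError("window and stride must cover at least one sample")
    return [
        recording[start:start + win]
        for start in range(0, len(recording) - win + 1, step)
    ]


# Synthetic 100-second recording at an assumed 2 Hz with 4 made-up channels.
rate_hz = 2.0
recording = [(float(i), 0.0, 1.0, -1.0) for i in range(200)]
windows = sliding_windows(recording, rate_hz)
```

Each element of `windows` would then be paired with a workload label and passed to a classifier. A smaller `stride_s` yields overlapping windows, at the cost of correlated training examples.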