ODDS (Outlier Detection DataSets)

Outliers, or anomalies, are instances that do not conform to the norm of a dataset. Outlier detection is an important data mining problem that has been studied across diverse research areas and application domains, such as intrusion detection, fraud detection, unusual event detection, and disease condition detection.

The exact notion of an outlier differs across application domains, so applying a technique developed for one domain to another is not straightforward. Moreover, labeled data for training and validating outlier detection methods is scarce, and noise in the data often resembles outliers, which makes the two difficult to distinguish. Because of these challenges, outlier detection is not an easy problem to solve. Furthermore, research on outlier detection has been held back by the lack of good benchmark datasets with ground truth: existing benchmarks are typically either proprietary or very artificial, and existing real-world outlier/anomaly detection datasets lack ground truth.

In ODDS, we openly provide access to a large collection of outlier detection datasets with ground truth (if available). Our focus is to provide datasets from different domains and present them on a single platform for the research community. To that end, we arrange the datasets by type into different tables in the ODDS library.
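To illustrate why ground truth matters, here is a minimal sketch of how labels let one score an outlier detector. Everything in it is an assumption made for illustration: the data is synthetic (not an ODDS dataset), the detector is a simple median/MAD distance score, and the helper names `outlier_scores` and `roc_auc` are hypothetical rather than part of any ODDS tooling.

```python
import random
import statistics


def outlier_scores(points):
    """Score each 1-D point by its absolute distance from the median,
    scaled by the median absolute deviation (MAD)."""
    med = statistics.median(points)
    deviations = [abs(p - med) for p in points]
    mad = statistics.median(deviations) or 1.0  # avoid division by zero
    return [d / mad for d in deviations]


def roc_auc(labels, scores):
    """Probability that a randomly chosen outlier (label 1) receives a
    higher score than a randomly chosen inlier (label 0); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one outlier and one inlier")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for a labeled dataset (assumption, not ODDS data):
# 200 inliers around 0 and 10 outliers far from the bulk.
rng = random.Random(0)
inliers = [rng.gauss(0.0, 1.0) for _ in range(200)]
outliers = [rng.uniform(8.0, 12.0) for _ in range(10)]
X = inliers + outliers
y = [0] * len(inliers) + [1] * len(outliers)  # ground-truth labels

auc = roc_auc(y, outlier_scores(X))
print(f"ROC AUC: {auc:.3f}")
```

Without the label vector `y`, the scores could still be computed, but there would be no way to quantify how well they separate outliers from inliers.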

The ODDS library has been under active development since summer 2016 and continues to grow, both as a result of our research pursuits in outlier/anomaly mining and to help the corresponding research community. Researchers are welcome to share their datasets for inclusion in the ODDS library by emailing srayana@cs.stonybrook.edu.

License


  • Unknown
