Kitsune Network Attack Dataset

Introduced by Mirsky et al. in Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection

This is a collection of nine network attack datasets captured from either an IP-based commercial surveillance system or a network full of IoT devices. Each dataset contains millions of network packets and a different cyber attack.

For each attack, you are supplied with:

  • A preprocessed dataset in csv format (ready for machine learning)
  • The corresponding label vector in csv format
  • The original network capture in pcap format (in case you want to engineer your own features)

We will now describe in detail what's in these datasets and how they were collected.

The Network Attacks

We have collected a wide variety of attacks that you would find in a real network intrusion. The following is a list of the cyber attack datasets available:

[Image: list of the available attack datasets]

For more details on the attacks themselves, please refer to our NDSS paper (citation below).

The Data Collection

The following figure presents the network topologies that we used to collect the data, and the corresponding attack vectors at which the attacks were performed. The network capture took place at point 1 and point X at the router (where a network intrusion detection system could feasibly be placed). For each dataset, clean network traffic was captured for the first 1 million packets, then the cyber attack was performed.
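Because the first 1 million packets of each capture are clean, one natural way to use a dataset is to set that prefix aside (for example, to fit an anomaly detector) and evaluate on what follows. The sketch below is only an illustration of that split: the function name `split_clean_prefix` and the constant `CLEAN_PREFIX` are hypothetical, and the remainder should not be assumed to contain attack traffic only, so the label vector is still needed for evaluation.

```python
from itertools import islice

# Size of the clean prefix stated in the dataset description.
CLEAN_PREFIX = 1_000_000


def split_clean_prefix(packets, clean_prefix=CLEAN_PREFIX):
    """Split a packet stream into (clean prefix, remaining stream).

    `packets` is any iterable of rows in capture order. The prefix is
    materialized as a list; the remainder stays a lazy iterator so that
    multi-million-row files do not have to be held in memory at once.
    """
    iterator = iter(packets)
    clean = list(islice(iterator, clean_prefix))
    return clean, iterator


# Tiny stand-in stream: integers instead of real packets, prefix of 4.
clean_part, remainder = split_clean_prefix(range(10), clean_prefix=4)
remaining_rows = list(remainder)
```

Keeping the remainder lazy is a design choice for large captures; call `list()` on it only if the data fits in memory.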

The Dataset Format

Each preprocessed dataset csv has m rows (packets) and 115 columns (features) with no header. The 115 features were extracted using our AfterImage feature extractor, described in our NDSS paper (see below) and available in Python here. In summary, the 115 features provide a statistical snapshot of the network (hosts and behaviors) in the context of the current packet traversing the network. The AfterImage feature extractor is unique in that it can efficiently process millions of streams (network channels) in real-time, incrementally, making it suitable for handling network traffic.
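Since the preprocessed files have no header, a loader has to rely on the column count alone. The following is a minimal sketch, using only the Python standard library, of reading such a file row by row and checking that each packet has 115 features. The function name `read_feature_rows` is hypothetical, the sample data is synthetic, and the assumption that every value parses as a float is not confirmed by the description above.

```python
import csv
import io

N_FEATURES = 115  # column count stated in the dataset description


def read_feature_rows(handle, n_features=N_FEATURES):
    """Yield each packet's features as a list of floats.

    `handle` is an open text file (or any iterable of CSV lines) without
    a header row. A row with an unexpected width raises ValueError.
    """
    for row_number, row in enumerate(csv.reader(handle), start=1):
        if not row:  # skip blank lines, e.g. a trailing newline
            continue
        if len(row) != n_features:
            raise ValueError(
                f"row {row_number}: expected {n_features} columns, got {len(row)}"
            )
        yield [float(value) for value in row]


# Synthetic two-packet stand-in for a real preprocessed file.
sample = io.StringIO(
    "\n".join(",".join(["0.5"] * N_FEATURES) for _ in range(2))
)
rows = list(read_feature_rows(sample))
```

For a real file, pass a handle opened with `open(path, newline="")` in place of the `io.StringIO` object; the generator then streams the m rows without loading the whole file.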

Citation

If you use these datasets, please cite:

@inproceedings{mirsky2018kitsune,
  title={Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection},
  author={Mirsky, Yisroel and Doitshman, Tomer and Elovici, Yuval and Shabtai, Asaf},
  booktitle={The Network and Distributed System Security Symposium (NDSS) 2018},
  year={2018}
}

License


  • GNU Affero General Public License
