DISL (Fueling Research with A Large Dataset of Solidity Smart Contracts)

Introduced by Morello et al. in DISL: Fueling Research with A Large Dataset of Solidity Smart Contracts

The full dataset report is available at: https://arxiv.org/abs/2403.16861

The DISL dataset is a collection of 514,506 unique Solidity files that have been deployed to Ethereum mainnet. It addresses the need for a large and diverse dataset of real-world smart contracts. DISL serves as a resource for developing machine learning systems and for benchmarking software engineering tools designed for smart contracts.

  • Curated by: Gabriele Morello
  • License: MIT

Instructions to explore the dataset

import random

from datasets import load_dataset

# Load the raw dataset
dataset = load_dataset("ASSERT-KTH/DISL", "raw")

# OR

# Load the decomposed dataset (uncomment to use it instead of the raw one)
# dataset = load_dataset("ASSERT-KTH/DISL", "decomposed")

# Number of rows and columns in the train split
train = dataset["train"]
num_rows = len(train)
num_columns = len(train.column_names)
print(num_rows, num_columns)

# Pick a random row (a dict mapping column names to values)
random_row = random.choice(train)

# Print the Solidity source code of that row
random_sc = random_row["source_code"]
print(random_sc)
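
As a further exploration step, the sketch below counts which Solidity version pragmas appear across rows. It is an illustration rather than part of the dataset's tooling: the helper `count_pragmas`, the regular expression, and the `"<none>"` label are assumptions introduced here, and the only thing taken from the snippet above is that each row exposes a `source_code` field. A small in-memory sample stands in for the real rows so the example runs without downloading anything.

```python
import re
from collections import Counter

# Matches e.g. "pragma solidity ^0.8.0;" and captures the version constraint.
PRAGMA_RE = re.compile(r"pragma\s+solidity\s+([^;]+);")


def count_pragmas(rows, limit=None):
    """Count the first Solidity version pragma found in each row.

    `rows` is any iterable of dicts with a 'source_code' key.
    `limit` optionally caps how many rows are inspected.
    """
    counts = Counter()
    for i, row in enumerate(rows):
        if limit is not None and i >= limit:
            break
        match = PRAGMA_RE.search(row["source_code"])
        # Rows without a pragma are grouped under a hypothetical "<none>" label.
        counts[match.group(1).strip() if match else "<none>"] += 1
    return counts


# Hypothetical stand-in for dataset["train"]; not real DISL data.
sample_rows = [
    {"source_code": "pragma solidity ^0.8.0;\ncontract A {}"},
    {"source_code": "pragma solidity ^0.8.0;\ncontract B {}"},
    {"source_code": "pragma solidity 0.6.12;\ncontract C {}"},
    {"source_code": "contract D {}"},
]

sample_counts = count_pragmas(sample_rows)
print(sample_counts.most_common())
```

With the dataset loaded as shown earlier, the same helper could be called as `count_pragmas(dataset["train"], limit=1000)`, assuming the chosen configuration exposes `source_code`. The `limit` keeps a first pass quick, since the full collection has several hundred thousand files.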
