Classifying all cells in an organ is a relevant and difficult problem in plant developmental biology. Here we abstract the problem into a new benchmark for node classification in a geo-referenced graph. Solving it requires learning the spatial layout of the organ, including its symmetries. To allow convenient testing of new geometric learning methods, the benchmark of Arabidopsis thaliana ovules is made available as a PyTorch data loader, along with a large number of precomputed features.
1 PAPER • 1 BENCHMARK
The dataset contains multi-modal features (visual and textual), pseudo-labels (on heritage values and attributes), and graph structures (with temporal, social, and spatial links), all constructed from User-Generated Content collected from the Flickr social media platform in three global cities containing UNESCO World Heritage properties (Amsterdam, Suzhou, Venice). The data collection is motivated by the goal of providing datasets that are both directly applicable as a test-bed for ML communities and theoretically informative for heritage and urban scholars drawing conclusions for planning decision-making.
1 PAPER • NO BENCHMARKS YET
This is the large version of the MuMiN dataset.
This is the medium version of the MuMiN dataset.
This is the small version of the MuMiN dataset.
This benchmark hypergraph dataset, Twitter-HyDrug-UR, is derived from Twitter-HyDrug by HyGCL-DC. Twitter-HyDrug-UR is a real-world hypergraph dataset that describes drug trafficking on Twitter. Unlike HyGCL-DC, which targets a drug trafficking community detection task (a multi-label node classification), we aim to identify drug user roles in drug trafficking activities on social media. To this end, we categorize node labels into four distinct roles: drug seller, drug buyer, drug user, and drug discussant. Each node is assigned exactly one label. Consequently, we frame the problem for Twitter-HyDrug-UR as a multi-class node classification task.
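The difference between the multi-class and multi-label framings can be illustrated with a minimal sketch of how node targets might be encoded. This is not the dataset's actual loading code: the function names, the index order of the roles, and the placeholder community names are all assumptions made for illustration.

```python
# Role names come from the description above; their index order is an assumption.
ROLES = ["drug seller", "drug buyer", "drug user", "drug discussant"]

# Hypothetical placeholder names for a multi-label task; the real
# community labels are not given in the description.
COMMUNITIES = ["community_a", "community_b", "community_c"]


def encode_multiclass(label, classes):
    """Multi-class target: exactly one class per node, stored as an integer index."""
    return classes.index(label)  # raises ValueError for an unknown label


def encode_multilabel(labels, classes):
    """Multi-label target: a binary indicator vector; several entries may be 1."""
    active = set(labels)
    unknown = active - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in active else 0 for c in classes]


# One role per node (multi-class, as in Twitter-HyDrug-UR).
role_target = encode_multiclass("drug buyer", ROLES)

# Possibly several communities per node (multi-label, as in community detection).
community_target = encode_multilabel(["community_a", "community_c"], COMMUNITIES)
```

With a single integer target per node, a model would typically be trained with a softmax-style loss over the four roles, whereas an indicator vector would call for an independent binary decision per entry.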