This dataset comprises 7,753 pairs of whole slide images (WSIs) and their corresponding diagnostic reports, extracted from the TCGA platform and refined with large language models. It aims to advance automated histopathology report generation by providing a new, publicly available evaluation benchmark. See the HistGen paper (https://arxiv.org/pdf/2403.05396.pdf) for a more detailed description of the dataset.
