PMC-OA (PubMed Central Open Access)

Introduced by Lin et al. in PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents

PMC-OA is a large-scale dataset that contains 1.65M image-text pairs. The figures and captions are drawn from PubMed Central: 2,478,267 available papers are covered, and 12,211,907 figure-caption pairs are extracted from them.

Source: PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents
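
The counts quoted in the description can be related with a little arithmetic. The sketch below is illustrative only: the variable names are made up for this example, 1.65M is taken as the rounded figure 1,650,000, and the second value is just a size ratio, since the description does not say how the 1.65M pairs are derived from the 12,211,907 extracted ones.

```python
# Counts quoted in the dataset description (variable names are illustrative).
papers_covered = 2_478_267
pairs_extracted = 12_211_907
pairs_in_dataset = 1_650_000  # "1.65M" as rounded in the description

# Average number of extracted figure-caption pairs per covered paper.
pairs_per_paper = pairs_extracted / papers_covered

# Size of the final dataset relative to the raw extraction.
# Assumption: this is only a ratio of the two quoted sizes; the description
# does not state how one set is obtained from the other.
size_ratio = pairs_in_dataset / pairs_extracted

print(f"{pairs_per_paper:.2f} extracted pairs per paper")
print(f"{size_ratio:.1%} dataset size relative to extraction")
```

On these figures, each covered paper contributes a little under five extracted pairs on average, and the dataset is roughly one seventh the size of the raw extraction.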

License


  • Unknown
