Molecule Captioning

12 papers with code • 1 benchmark • 1 dataset

Molecular description generation (molecule captioning) is the task of producing a textual description of a molecule's structure, properties, biological activity, and applications from its molecular descriptors (e.g., a SMILES string or molecular graph). It gives chemists and biologists quick access to essential molecular information, efficiently guiding their research and experiments.
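In practice, molecule captioning is typically framed as sequence-to-sequence translation from a SMILES string to natural language. A minimal sketch, assuming the Hugging Face `transformers` library and the publicly released MolT5 checkpoint `laituan245/molt5-small-smiles2caption` (the checkpoint name is an assumption; substitute any SMILES-to-caption seq2seq model):

```python
def caption_molecule(smiles: str,
                     model_name: str = "laituan245/molt5-small-smiles2caption") -> str:
    """Generate a natural-language caption for a SMILES string.

    Imports are deferred because loading the checkpoint downloads
    several hundred megabytes on first use.
    """
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    # Tokenize the SMILES string and decode a beam-searched caption.
    inputs = tokenizer(smiles, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128, num_beams=5)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example input: aspirin.
ASPIRIN = "CC(=O)OC1=CC=CC=C1C(=O)O"
# caption_molecule(ASPIRIN)  # uncomment to run (downloads the model)
```

The same interface applies to most models on this page: the input is a linearized molecular representation, the output is free-form text, and benchmarks score the generated caption against a reference description (e.g., with BLEU or ROUGE on the ChEBI-20 dataset).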

Most implemented papers

A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language

bingsu12/momu 12 Sep 2022

Although artificial intelligence (AI) has made significant progress in understanding molecules across a wide range of fields, existing models generally acquire a single cognitive ability from a single molecular modality.

MolFM: A Multimodal Molecular Foundation Model

biofm/openbiomed 6 Jun 2023

In this study, we introduce MolFM, a multimodal molecular foundation model designed to facilitate joint representation learning from molecular structures, biomedical texts, and knowledge graphs.

Translation between Molecules and Natural Language

blender-nlp/MolT5 25 Apr 2022

We present MolT5, a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings.

Unifying Molecular and Textual Representations via Multi-task Language Modelling

gt4sd/multitask_text_and_chemistry_t5 29 Jan 2023

Here, we propose the first multi-domain, multi-task language model that can solve a wide range of tasks in both the chemical and natural language domains.

Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective

phenixace/molregpt 11 Jun 2023

In this work, we propose a novel LLM-based framework (MolReGPT) for molecule-caption translation. It introduces an In-Context Few-Shot Molecule Learning paradigm that lets LLMs such as ChatGPT perform molecule discovery through in-context learning, without domain-specific pre-training or fine-tuning.
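The core idea of in-context few-shot molecule learning is to retrieve similar (molecule, caption) pairs from a training corpus and place them in the prompt as demonstrations. A minimal sketch of prompt construction; the helper names and the character-overlap similarity are illustrative assumptions, not MolReGPT's exact retrieval method (the paper uses structural similarity measures):

```python
def similarity(a: str, b: str) -> float:
    """Crude SMILES similarity: Dice coefficient over character sets.
    (Stand-in for the structural similarity used in practice.)"""
    sa, sb = set(a), set(b)
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def build_prompt(query_smiles, examples, k=2):
    """Select the k most similar (SMILES, caption) pairs as demonstrations
    and format a few-shot captioning prompt for an LLM."""
    ranked = sorted(examples, key=lambda ex: similarity(query_smiles, ex[0]),
                    reverse=True)
    lines = ["Describe the following molecules."]
    for smi, cap in ranked[:k]:
        lines.append(f"Molecule: {smi}\nCaption: {cap}")
    lines.append(f"Molecule: {query_smiles}\nCaption:")  # model completes this
    return "\n\n".join(lines)

# Toy demonstration corpus (hypothetical captions for illustration).
examples = [
    ("CCO", "Ethanol, a simple primary alcohol."),
    ("CC(=O)O", "Acetic acid, a simple carboxylic acid."),
    ("c1ccccc1", "Benzene, an aromatic hydrocarbon."),
]
prompt = build_prompt("CC(=O)OC", examples, k=2)
```

The resulting prompt would then be sent to a general-purpose LLM, which completes the final `Caption:` line; no gradient updates are involved.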

GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text

ai-hpc-research-team/git-mol 14 Aug 2023

Large language models have made significant strides in natural language processing, enabling innovative applications in molecular science by processing textual representations of molecules.

From Artificially Real to Real: Leveraging Pseudo Data from Large Language Models for Low-Resource Molecule Discovery

SCIR-HI/ArtificiallyR2R 11 Sep 2023

Furthermore, our method shows a sustained improvement as the volume of pseudo data increases, revealing the great potential of pseudo data in advancing low-resource cross-modal molecule discovery.

BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations

QizhiPei/BioT5 11 Oct 2023

Recent advancements in biological research leverage the integration of molecules, proteins, and natural language to enhance drug discovery.

MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter

acharkq/molca 19 Oct 2023

MolCA enables an LM (e.g., Galactica) to understand both text- and graph-based molecular content via the cross-modal projector.

InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery

idea-xl/instructmol 27 Nov 2023

The rapid evolution of artificial intelligence in drug discovery encounters challenges with generalization and extensive training, yet Large Language Models (LLMs) offer promise in reshaping interactions with complex molecular data.