All SMILES Variational Autoencoder

30 May 2019  ·  Zaccary Alperstein, Artem Cherkasov, Jason Tyler Rolfe ·

Variational autoencoders (VAEs) defined over SMILES-string and graph-based representations of molecules promise to improve the optimization of molecular properties, thereby revolutionizing the pharmaceuticals and materials industries. However, these VAEs are hindered by the non-unique nature of SMILES strings and the computational cost of graph convolutions. To efficiently pass messages along all paths through the molecular graph, we encode multiple SMILES strings of a single molecule with a set of stacked recurrent neural networks, pool the hidden representations of each atom across those SMILES encodings, and apply attentional pooling to build a final fixed-length latent representation. By then decoding to a disjoint set of SMILES strings of the molecule, our All SMILES VAE learns an almost bijective mapping between molecules and latent representations near the high-probability-mass subspace of the prior. Our SMILES-derived but molecule-based latent representations significantly surpass the state-of-the-art in a variety of fully- and semi-supervised property regression and molecular property optimization tasks.
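The final pooling step described above can be illustrated with a minimal sketch: attention scores are computed from per-token hidden states, normalized with a softmax, and used to form a weighted average, yielding one fixed-length vector regardless of SMILES length. This is a simplified, hypothetical NumPy version (the learned query `w_query`, the toy shapes, and the plain averaging across SMILES variants are assumptions for illustration, not the paper's exact architecture).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentional_pool(hidden, w_query):
    """Collapse per-token hidden states (T, d) into one fixed-length
    vector of size d, using attention weights from a learned query."""
    scores = hidden @ w_query        # (T,) one score per token
    weights = softmax(scores)        # (T,) non-negative, sums to 1
    return weights @ hidden          # (d,) attention-weighted average

rng = np.random.default_rng(0)
d = 8
# Toy hidden states for two randomized SMILES encodings of the same
# molecule (shapes and values are illustrative only).
h1 = rng.standard_normal((12, d))
h2 = rng.standard_normal((15, d))
w_query = rng.standard_normal(d)

# Pool each encoding to a fixed length, then average across the SMILES
# variants -- a simplification of combining multiple SMILES per molecule.
z = np.mean([attentional_pool(h1, w_query),
             attentional_pool(h2, w_query)], axis=0)
print(z.shape)  # (8,)
```

Note that the output dimension depends only on `d`, not on the SMILES length, which is what makes the pooled representation usable as a fixed-length latent code.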

Results from the Paper


 Ranked #1 on Molecular Graph Generation on ZINC (QED Top-3 metric)

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Drug Discovery | Tox21 | SSVAE with multiple SMILES | AUC | 0.871 | # 3 |
| Molecular Graph Generation | ZINC | All SMILES VAE | Validity | 98.5 | # 9 |
| Molecular Graph Generation | ZINC | All SMILES VAE | QED Top-3 | 0.948, 0.948, 0.948 | # 1 |
| Molecular Graph Generation | ZINC | All SMILES VAE | PlogP Top-3 | 29.80, 29.76, 29.11 | # 1 |
| Molecular Graph Generation | ZINC | All SMILES VAE | function evaluations | 250500 | # 14 |
| Molecular Graph Generation | ZINC | All SMILES VAE | Uniqueness | 100 | # 1 |
| Molecular Graph Generation | ZINC | All SMILES VAE | Novelty | 99.96 | # 3 |

Methods