Bio-decagon is a dataset for polypharmacy side effect identification problem framed as a multirelational link prediction problem in a two-layer multimodal graph/network of two node types: drugs and proteins. Protein-protein interaction network describes relationships between proteins. Drug-drug interaction network contains 964 different types of edges (one for each side effect type) and describes which drug pairs lead to which side effects. Lastly, drug-protein links describe the proteins targeted by a given drug.
The final network after linking entity vocabularies used by different databases has 645 drug and 19,085 protein nodes connected by 715,612 protein-protein, 4,651,131 drug-drug, and 18,596 drug-protein edges.Source: Modeling polypharmacy side effects with graph convolutional networks