Many biological questions, including the estimation of deep evolutionary histories and the detection of remote homology between protein sequences, rely upon multiple sequence alignments (MSAs) and phylogenetic trees of large datasets.
Unsupervised protein language models trained across millions of diverse sequences learn the structure and function of proteins.
Our dataset consists of amino acid sequences, Q8 secondary structures, position-specific scoring matrices, co-evolutionary features derived from multiple sequence alignments, backbone atom distance matrices, torsion angles, and 3D coordinates.
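To make one of the listed features concrete, here is a minimal sketch of how a position-specific scoring matrix can be derived from an MSA. It is not the pipeline used to build the dataset above: the function name `pssm_from_msa`, the uniform background distribution, and the pseudocount of 1.0 are all assumptions chosen for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_from_msa(msa, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds score.

    Hypothetical helper: assumes a uniform background frequency and
    additive pseudocounts, which real tools typically refine.
    """
    if not msa or len({len(row) for row in msa}) != 1:
        raise ValueError("MSA rows must be non-empty and of equal length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for column in zip(*msa):
        # Count only standard residues; gaps and unknown symbols are ignored.
        counts = Counter(c for c in column if c in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm
```

For example, `pssm_from_msa(["ACD", "ACE", "A-D"])` returns three column dictionaries; conserved residues such as `A` in the first column receive positive scores, while residues never observed in a column score below zero.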
Align-RUDDER outperforms competitors on complex artificial tasks with delayed reward and few demonstrations.
GENERAL REINFORCEMENT LEARNING, MINECRAFT, MULTIPLE SEQUENCE ALIGNMENT, SAFE EXPLORATION
This paper proposes a novel and straightforward approach to improving the accuracy of progressive multiple protein sequence alignment methods.
Ranked #1 on Multiple Sequence Alignment on OXBench
DECISION MAKING, MULTIPLE SEQUENCE ALIGNMENT, PROTEIN SECONDARY STRUCTURE PREDICTION
Nepal, a pairwise profile aligner that uses the novel scoring function, significantly improved both alignment sensitivity and precision compared to aligners that use existing scoring functions.
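For readers unfamiliar with pairwise profile alignment, the sketch below shows the general idea: a global (Needleman-Wunsch style) dynamic program in which each position is a residue-frequency vector and a scoring function compares two such columns. This is not Nepal's scoring function or implementation. The dot-product score, the `shift` offset, the linear gap penalty, and the names `column_score` and `align_profiles` are assumptions made for illustration.

```python
def column_score(p, q, shift=0.3):
    """Illustrative column similarity: dot product of two frequency vectors
    minus a constant shift, so dissimilar columns score below zero."""
    return sum(a * b for a, b in zip(p, q)) - shift


def align_profiles(prof_a, prof_b, gap=-0.5, shift=0.3):
    """Globally align two profiles (lists of frequency vectors).

    Returns (score, pairs), where pairs lists the aligned
    (index_in_a, index_in_b) columns. Uses a linear gap penalty.
    """
    n, m = len(prof_a), len(prof_b)
    # score[i][j]: best score for the first i columns of A vs first j of B.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = column_score(prof_a[i - 1], prof_b[j - 1], shift)
            score[i][j] = max(
                score[i - 1][j - 1] + s,   # align column i with column j
                score[i - 1][j] + gap,     # gap in B
                score[i][j - 1] + gap,     # gap in A
            )
    # Trace back from the bottom-right corner to recover aligned columns.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        s = column_score(prof_a[i - 1], prof_b[j - 1], shift)
        if score[i][j] == score[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs
```

Swapping `column_score` for a different function is the only change needed to experiment with another scoring scheme, which is the design point the entry above is concerned with.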