Multiple Sequence Alignment
18 papers with code • 3 benchmarks • 0 datasets
Most implemented papers
Algorithms and Complexity on Indexing Founder Graphs
We study the problem of matching a string in a labeled graph.
MNHN-Tree-Tools: a toolbox for tree inference using multi-scale clustering of a set of sequences
We introduce MNHN-Tree-Tools, a high-performance set of algorithms that builds multi-scale, nested clusters of sequences found in a FASTA file.
Neural Distance Embeddings for Biological Sequences
The development of data-dependent heuristics and representations for biological sequences that reflect their evolutionary distance is critical for large-scale biological research.
Exploring evolution-aware & -free protein language models as protein function predictors
Large-scale Protein Language Models (PLMs) have improved performance in protein prediction tasks, ranging from 3D structure prediction to various function predictions.
ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis
We identify an urgent need for a flexible, high-performance, and energy-efficient hardware/software (HW/SW) co-design to address the major inefficiencies in the Baum-Welch algorithm for profile Hidden Markov Models (pHMMs).
Convolutional ProteinUnetLM competitive with long short-term memory-based protein secondary structure predictors
In recent years, a new generation of algorithms for secondary structure (SS) prediction based on embeddings from protein language models (pLMs) has emerged.
PoET: A generative model of protein families as sequences-of-sequences
Generative protein language models are a natural way to design new proteins with desired functions.
PEvoLM: Protein Sequence Evolutionary Information Language Model
As protein sequence databases grow exponentially, multiple sequence alignment (MSA) methods such as PSI-BLAST perform exhaustive, time-consuming database searches to retrieve evolutionary information.
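To make "evolutionary information" in the last entry concrete: one common way to summarize an MSA is as per-column residue frequencies, which is the raw material for profile-style scoring. The sketch below is a minimal illustration under stated assumptions, not the method of PEvoLM or PSI-BLAST. The function name `column_profile`, the toy alignment, and the choice to ignore gap characters are all hypothetical.

```python
from collections import Counter


def column_profile(alignment, gap="-"):
    """Return per-column residue frequencies for equal-length aligned rows.

    Illustrative assumption: gaps are excluded from the counts, so each
    column's frequencies sum to 1 over the residues actually observed.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    profile = []
    # zip(*alignment) walks the alignment column by column.
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        total = sum(counts.values())
        # An all-gap column yields an empty frequency table.
        profile.append(
            {res: n / total for res, n in counts.items()} if total else {}
        )
    return profile


# Toy alignment (made up for illustration): three rows, five columns.
toy_msa = ["MKV-L", "MRVAL", "MKIAL"]
profile = column_profile(toy_msa)
```

Here `profile[1]` holds the frequencies of K and R in the second column. A conserved column such as the first collapses to a single residue with frequency 1, which is the kind of signal profile-based methods exploit.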