Multiple Sequence Alignment

18 papers with code • 3 benchmarks • 0 datasets

Most implemented papers

Algorithms and Complexity on Indexing Founder Graphs

algbio/founderblockgraphs 25 Feb 2021

We study the problem of matching a string in a labeled graph.

MNHN-Tree-Tools: a toolbox for tree inference using multi-scale clustering of a set of sequences

haschka/mnhn-tree-tools Bioinformatics 2021

We introduce MNHN-Tree-Tools, a high performance set of algorithms that builds multi-scale, nested clusters of sequences found in a FASTA file.

Neural Distance Embeddings for Biological Sequences

gcorso/neuroseed NeurIPS 2021

The development of data-dependent heuristics and representations for biological sequences that reflect their evolutionary distance is critical for large-scale biological research.

Exploring evolution-aware & -free protein language models as protein function predictors

elttaes/revisiting-plms 14 Jun 2022

Large-scale Protein Language Models (PLMs) have improved performance in protein prediction tasks, ranging from 3D structure prediction to various function predictions.

ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis

cmu-safari/aphmm-gpu 20 Jul 2022

We identify an urgent need for a flexible, high-performance, and energy-efficient hardware/software (HW/SW) co-design to address the major inefficiencies in the Baum-Welch algorithm for profile Hidden Markov Models (pHMMs).

Convolutional ProteinUnetLM competitive with long short-term memory-based protein secondary structure predictors

Kotrix/ProteinUnetLM Proteins 2022

In recent years, a new generation of algorithms for secondary structure (SS) prediction based on embeddings from protein language models (pLMs) has been emerging.

PoET: A generative model of protein families as sequences-of-sequences

OpenProteinAI/PoET NeurIPS 2023

Generative protein language models are a natural way to design new proteins with desired functions.

PEvoLM: Protein Sequence Evolutionary Information Language Model

issararab/pevolm 16 Aug 2023

With the exponential growth of protein sequence databases over time, multiple-sequence alignment (MSA) methods, like PSI-BLAST, perform exhaustive and time-consuming database searches to retrieve evolutionary information.