Protein Language Model
27 papers with code • 1 benchmark • 1 dataset
Most implemented papers
Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling
Rather than scaling up protein language models (PLMs), we seek to improve performance via protein-specific optimization.
LEP-AD: Language Embedding of Proteins and Attention to Drugs predicts drug target interactions
The LEP-AD model's performance scales favorably with the size of the training data.
Vaxformer: Antigenicity-controlled Transformer for Vaccine Design Against SARS-CoV-2
The SARS-CoV-2 pandemic has emphasised the importance of developing a universal vaccine that can protect against current and future variants of the virus.
Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning
We trained the CLAPE-DB model on the protein-DNA binding sites dataset and evaluated the model performance and generalization ability through various experiments.
ProtiGeno: a prokaryotic short gene finder using protein language models
Prokaryotic gene prediction plays an important role in understanding the biology of organisms and their function with applications in medicine and biotechnology.
Pairing interacting protein sequences using masked language modeling
We introduce a method called DiffPALM that solves this pairing problem by exploiting the ability of MSA Transformer to fill in masked amino acids in multiple sequence alignments using the surrounding context.
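The masked-prediction scoring idea behind DiffPALM can be sketched as follows: mask residues in a candidate paired alignment and sum the model's log-probability of recovering them, so that better pairings yield higher scores. This is a minimal illustration, not the DiffPALM implementation; `toy_model` is a hypothetical stand-in for MSA Transformer, and DiffPALM itself optimizes the pairing differentiably rather than scoring one candidate at a time.

```python
import math
import random

def masked_fill_score(model, paired_msa, mask_fraction=0.15):
    """Score a candidate pairing: mask a fraction of residues and sum the
    log-probability the model assigns to the true residue at each masked
    position. Higher (less negative) scores suggest a more plausible pairing."""
    score = 0.0
    for seq in paired_msa:
        n_mask = max(1, int(len(seq) * mask_fraction))
        positions = random.sample(range(len(seq)), n_mask)
        for pos in positions:
            probs = model(seq, pos)            # P(residue | context), a dict
            score += math.log(probs[seq[pos]])
    return score

def toy_model(seq, pos):
    # Hypothetical stand-in: uniform distribution over the 20 amino acids.
    # A real model would condition on the rest of the alignment.
    return {aa: 1.0 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}

random.seed(0)
candidate_pairing = ["ACDE", "ACDF"]  # toy concatenated partner sequences
print(masked_fill_score(toy_model, candidate_pairing))
```

In practice one would compare this score across candidate pairings of the two partner MSAs and keep the highest-scoring assignment.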
Improving antibody language models with native pairing
Current antibody language models are limited by their use of unpaired antibody sequence data and the biases in publicly available antibody sequence datasets, which are skewed toward antibodies against a relatively small number of pathogens.
PeptideBERT: A Language Model based on Transformers for Peptide Property Prediction
In this work, inspired by recent progress in Large Language Models (LLMs), we introduce PeptideBERT, a protein language model for predicting three key properties of peptides (hemolysis, solubility, and non-fouling).
pLMFPPred: a novel approach for accurate prediction of functional peptides integrating embedding from pre-trained protein language model and imbalanced learning
Comparative experiments show that pLMFPPred outperforms current methods for predicting functional peptides, achieving better accuracy, area under the ROC curve (AUROC), and F1-score than existing approaches.
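Both PeptideBERT and pLMFPPred follow the same broad recipe: extract embeddings from a pre-trained protein language model, then fit a lightweight classifier head for the property of interest. A minimal sketch of that second stage, assuming the embeddings have already been computed (the 2-D vectors below are hypothetical stand-ins, not real PLM embeddings):

```python
import math

def train_logreg(X, y, lr=0.1, epochs=200):
    """Tiny logistic-regression head over frozen (stand-in) PLM embeddings:
    embed sequences once, then fit a light classifier on top."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # sigmoid
            g = p - yi                           # gradient of log-loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 2-D "embeddings" for a linearly separable toy task
# (e.g. non-functional = 0 vs functional = 1 peptides).
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = [0, 0, 1, 1]
w, b = train_logreg(X, y)
print(predict(w, b, [0.85, 0.85]) > 0.5)
```

pLMFPPred additionally applies imbalanced-learning techniques (e.g. resampling) before fitting the classifier, since functional peptides are typically the minority class.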
Cognate Transformer for Automated Phonological Reconstruction and Cognate Reflex Prediction
Phonological reconstruction is one of the central problems in historical linguistics where a proto-word of an ancestral language is determined from the observed cognate words of daughter languages.