Protein Language Model
24 papers with code • 1 benchmark • 1 dataset
Most implemented papers
ESM-NBR: fast and accurate nucleic acid-binding residue prediction via protein language model feature representation and multi-task learning
Meanwhile, ESM-NBR obtains MCC values of 0.427 and 0.391 for DNA-binding residue prediction on two independent test sets, which are 18.61% and 10.45% higher than those of the second-best methods, respectively.
MSA Transformer
Unsupervised protein language models trained across millions of diverse sequences learn structure and function of proteins.
ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on Multiagent Dual-core Learning
Take UniProt protein "A0A0U5GJ41" as an example (1.14.-.-): ECRECer annotated it with "1.14.11.38", which is supported by further protein structure analysis based on AlphaFold2.
Structure-aware Protein Self-supervised Learning
Furthermore, we propose to leverage the available protein language model pretrained on protein sequences to enhance the self-supervised learning.
Generative power of a protein language model trained on multiple sequence alignments
Moreover, for small protein families, our generation method based on MSA Transformer outperforms Potts models.
DistilProtBert: A distilled protein language model used to distinguish between real proteins and their randomly shuffled counterparts
Here, we adapted this concept to the problem of protein sequence analysis, by developing DistilProtBert, a distilled version of the successful ProtBert model.
HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative
Our proposed method, HelixFold-Single, first pre-trains a large-scale protein language model (PLM) with thousands of millions of primary sequences utilizing the self-supervised learning paradigm, which will be used as an alternative to MSAs for learning the co-evolution information.
Protein language model rescue mutations highlight variant effects and structure in clinically relevant genes
Despite being self-supervised, protein language models have shown remarkable performance in fundamental biological tasks such as predicting impact of genetic variation on protein structure and function.
Protein Language Models and Structure Prediction: Connection and Progression
The prediction of protein structures from sequences is an important task for function prediction, drug design, and the understanding of related biological processes.
Plug & Play Directed Evolution of Proteins with Gradient-based Discrete MCMC
We introduce a sampling framework for evolving proteins in silico that supports mixing and matching a variety of unsupervised models, such as protein language models, and supervised models that predict protein function from sequence.
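As a rough illustration of the "mixing and matching" idea in the last entry, the sketch below combines two scoring functions into one weighted energy and samples sequences from it. It is not the paper's method: it uses plain Metropolis sampling with random single-site mutations instead of gradient-based proposals, and both scorers (toy_unsupervised_energy, toy_supervised_energy) are hypothetical toy stand-ins for a protein language model and a supervised function predictor. All names, weights, and parameters are assumptions made for the example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_unsupervised_energy(seq):
    # Hypothetical stand-in for a protein language model score:
    # one unit of energy for every pair of identical adjacent residues.
    return sum(1.0 for a, b in zip(seq, seq[1:]) if a == b)


def toy_supervised_energy(seq, target="MKV"):
    # Hypothetical stand-in for a supervised function predictor:
    # one unit of energy for every mismatch against a target prefix.
    return sum(1.0 for a, b in zip(seq, target) if a != b)


def total_energy(seq, experts):
    # Weighted sum of expert energies (lower is better).
    return sum(weight * score(seq) for weight, score in experts)


def metropolis_evolve(seq, experts, steps=2000, temperature=0.5, seed=0):
    """Sample sequences by single-site mutation; return the best one seen."""
    rng = random.Random(seed)
    current, current_e = seq, total_energy(seq, experts)
    best, best_e = current, current_e
    for _ in range(steps):
        # Symmetric proposal: uniform position, uniform replacement residue.
        pos = rng.randrange(len(current))
        proposal = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        proposal_e = total_energy(proposal, experts)
        delta = proposal_e - current_e
        # Metropolis rule: always accept improvements, and accept
        # worse proposals with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = proposal, proposal_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


experts = [(1.0, toy_unsupervised_energy), (1.0, toy_supervised_energy)]
start = "AAAAAAAA"
best, best_e = metropolis_evolve(start, experts)
print(best, best_e)
```

Swapping a scorer in or out only changes the experts list, which is the property the snippet describes. A real setup would replace the toy functions with model-based scores.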