Machine Learning-Assisted Sequence Alignment

Machine learning-assisted sequence alignment uses statistical learning models, including deep neural networks and protein language models, to compute biologically meaningful alignments between nucleotide or amino acid sequences. Because these models learn substitution patterns and structural constraints from large training corpora, they can be more sensitive than alignments scored with classical substitution matrices (e.g., BLOSUM, PAM) for remote homologs and structurally constrained regions, and they are widely regarded as the state of the art for difficult alignment tasks in genomics and proteomics.
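To make the idea of replacing a fixed scoring matrix with learned scores concrete, here is a minimal sketch, not the method of any cited work. It assumes each residue has an embedding vector, scores residue pairs by cosine similarity instead of a BLOSUM or PAM lookup, and then runs ordinary Needleman-Wunsch global alignment with a linear gap penalty. The `TOY_EMBED` table, the `align` function, and the gap value are hypothetical and chosen only for illustration; a real system would take context-dependent embeddings from a trained model.

```python
import math

# Hypothetical toy embeddings (assumption): one fixed vector per residue.
# A trained protein language model would supply context-dependent vectors.
TOY_EMBED = {
    "A": (1.0, 0.0, 0.0),
    "C": (0.0, 1.0, 0.0),
    "D": (0.0, 0.0, 1.0),
    "E": (0.0, 0.2, 1.0),  # deliberately close to D
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align(seq_a, seq_b, embed=TOY_EMBED, gap=-0.5):
    """Needleman-Wunsch global alignment scored by embedding similarity."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j]: best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # ptr[i][j]: move that produced score[i][j] (D = diagonal, U = up, L = left)
    ptr = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], ptr[i][0] = i * gap, "U"
    for j in range(1, m + 1):
        score[0][j], ptr[0][j] = j * gap, "L"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = cosine(embed[seq_a[i - 1]], embed[seq_b[j - 1]])
            candidates = [
                (score[i - 1][j - 1] + sim, "D"),  # align the two residues
                (score[i - 1][j] + gap, "U"),      # gap in seq_b
                (score[i][j - 1] + gap, "L"),      # gap in seq_a
            ]
            score[i][j], ptr[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "D":
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif move == "U":
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = align("ACD", "AD")
print(aligned_a, aligned_b, total)
```

The dynamic program is unchanged from the classical algorithm; only the source of the pairwise scores differs. This also shows where a learned model can help: residues with similar embeddings (here D and E) receive a high score without any hand-built matrix entry.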

Sources

  1. Llinares-López, F., Berthet, Q., Blondel, M., Teboul, O., & Vert, J.-P. (2023). Deep embedding and alignment of protein sequences. Nature Methods, 20(1), 104–111. DOI: 10.1038/s41592-022-01700-2
  2. Jumper, J., Evans, R., Pritzel, A., et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. DOI: 10.1038/s41586-021-03819-2

ScholarGate. Machine learning-assisted sequence alignment (Machine Learning-Assisted Sequence Alignment). Retrieved 2026-06-04 from https://scholargate.app/en/bioinformatics/machine-learning-assisted-sequence-alignment