ZoomQA: residue-level protein model accuracy estimation with machine learning on sequential and 3D structural features

Bibliographic Details
Published in:	Briefings in bioinformatics 2022-01, Vol.23 (1)
Main Authors:	Hippe, Kyle; Lilley, Cade; William Berkenpas, Joshua; Chandana Pocha, Ciri; Kishaba, Kiyomi; Ding, Hui; Hou, Jie; Si, Dong; Cao, Renzhi
Format:	Article
Language:	English
Subjects:	Accuracy; Amino acids; Bioinformatics; Caspases - chemistry; Chemical properties; COVID-19; Homology; Humans; Learning algorithms; Machine Learning; Model accuracy; Models, Molecular; Problem Solving Protocol; Protein structure; Protein Structure, Quaternary; Protein Structure, Tertiary; Proteins; Residues; SARS-CoV-2 - chemistry; Sequence Analysis, Protein; Severe acute respiratory syndrome coronavirus 2; Viral Proteins - chemistry

Description
Summary:	Motivation: The Estimation of Model Accuracy problem is a cornerstone problem in the field of bioinformatics. As of CASP14, there are 79 global QA methods but only 39 residue-level QA methods, and very few of them work on protein complexes. Here, we introduce ZoomQA, a novel single-model method for assessing the accuracy of a tertiary protein structure or complex prediction at the residue level, a task with many applications, such as drug discovery. ZoomQA differs from other methods by considering how the chemical and physical features of a fragment structure (the portion of a protein within a radius $r$ of the target amino acid) change as the radius of contact increases. Fourteen physical and chemical properties of amino acids are used to build a comprehensive representation of every residue in a protein and to grade its placement within the protein as a whole. Moreover, we show the potential of ZoomQA to identify problematic regions of the SARS-CoV-2 protein complex.
Results: We benchmark ZoomQA on CASP14, where it outperforms other state-of-the-art local QA methods and rivals state-of-the-art QA methods on global prediction metrics. Our experiment shows the efficacy of these new features and shows that our method matches the performance of other state-of-the-art methods without homology searching against databases or position-specific scoring matrices (PSSMs).
Availability: http://zoomQA.renzhitech.com
ISSN:	1467-5463, 1477-4054
DOI:	10.1093/bib/bbab384
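
Illustrative sketch: the summary describes tracking how the features of a fragment structure change as the radius $r$ around a target amino acid grows. The Python below is one possible reading of that idea, not the authors' implementation. The function names (`zoom_profile`, `zoom_deltas`), the use of a mean as the aggregate, the single toy property, the coordinates, and the radii are all assumptions made for illustration; the paper uses fourteen properties, and its actual feature definitions are not given in this record.

```python
import math


def zoom_profile(coords, values, target, radii):
    """Mean of a per-residue property over the fragment at each radius.

    coords: one (x, y, z) tuple per residue (hypothetical representative atoms)
    values: one property value per residue (e.g. a hydrophobicity score)
    target: index of the residue being assessed
    radii:  increasing, non-negative radii, in the same units as coords
    """
    center = coords[target]
    dists = [math.dist(center, c) for c in coords]
    profile = []
    for r in radii:
        # The fragment is every residue within r of the target; the target
        # itself is at distance 0, so the fragment is never empty.
        members = [v for d, v in zip(dists, values) if d <= r]
        profile.append(sum(members) / len(members))
    return profile


def zoom_deltas(profile):
    """Change in the aggregated property between consecutive radii."""
    return [b - a for a, b in zip(profile, profile[1:])]


# Toy data (assumed, not from the paper): four residues, one property.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 6.0, 0.0), (0.0, 0.0, 11.0)]
hydrophobicity = [1.0, 3.0, -1.0, 5.0]
radii = [4.0, 8.0, 12.0]

profile = zoom_profile(coords, hydrophobicity, target=0, radii=radii)
deltas = zoom_deltas(profile)
print(profile)  # [2.0, 1.0, 2.0]
print(deltas)   # [-1.0, 1.0]
```

In a sketch like this, the per-radius values and their differences would be computed for each property and each residue, then passed as a feature vector to a learned model that predicts residue-level accuracy.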