MICon Contamination Detection Workflow for Next-Generation Sequencing Laboratories Using Microhaplotype Loci and Supervised Learning

Bibliographic Details
Published in: The Journal of molecular diagnostics : JMD 2023-08, Vol.25 (8), p.602-610
Main Authors: Balan, Jagadheshwar; Koganti, Tejaswi; Basu, Shubham; Dina, Michelle A.; Artymiuk, Cody J.; Barr Fritcher, Emily G.; Halverson, Katie E.; Wu, Xianglin; Jenkinson, Garrett; Viswanatha, David S.
Format: Article
Language: English
Description
Summary: Innovation in sequencing instrumentation is increasing the per-batch data volumes and decreasing the per-base costs. Multiplexed chemistry protocols after the addition of index tags have further contributed to efficient and cost-effective sequencer utilization. With these pooled processing strategies, however, comes an increased risk of sample contamination. Sample contamination poses a risk of missing critical variants in a patient sample or wrongly reporting variants derived from the contaminant, which are particularly relevant issues in oncology specimen testing in which low variant allele frequencies have clinical relevance. Small custom-targeted next-generation sequencing (NGS) panels yield limited variants and pose challenges in delineating true somatic variants versus contamination calls. A number of popular contamination identification tools have the ability to perform well in whole-genome/exome sequencing data; however, in smaller gene panels, there are fewer variant candidates for the tools to perform accurately. To prevent clinical reporting of potentially contaminated samples in small next-generation sequencing panels, we have developed MICon (Microhaplotype Contamination detection), a novel contamination detection model that uses microhaplotype site variant allele frequencies. In a heterogeneous hold-out test cohort of 210 samples, the model displayed state-of-the-art performance with an area under the receiver-operating characteristic curve of 0.995.
ISSN: 1525-1578, 1943-7811
DOI: 10.1016/j.jmoldx.2023.05.001
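
Note on the reported metric: the summary quotes an area under the receiver-operating characteristic curve (AUC) on a hold-out cohort. As a minimal sketch of how such a figure can be computed from per-sample scores, the function below uses the rank-sum (Mann-Whitney U) identity. The function name `roc_auc`, the 0/1 label encoding (1 = contaminated), and the toy numbers are illustrative assumptions and are not taken from the article; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 values (assumed encoding: 1 = contaminated).
    scores: sequence of classifier scores; higher means "more likely 1".
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    rank_sum_pos = 0.0  # sum of (tie-averaged) ranks of positive samples
    n_pos = 0
    i = 0
    while i < n:
        # Find the run of tied scores starting at index i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied samples share the mean of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
                n_pos += 1
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy example with made-up scores (not data from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The value can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half.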