Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations

Bibliographic Details
Published in:	Nucleic acids research 2017-10, Vol.45 (18), p.e159-e159
Main Authors:	Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C; Reimer, Aleisha R; Graham, Morag R; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter; Van Domselaar, Gary
Format:	Article
Language:	English
Subjects:	Bacillus anthracis - genetics; Bacteria - genetics; Computational Biology - methods; DNA Mutational Analysis - methods; Gene Expression Regulation, Bacterial; Genetic Association Studies; Genome, Bacterial; Methods Online; Microbiological Techniques - methods; Sequence Analysis, DNA - methods; Software; Transcriptome; Vibrio cholerae - genetics

Description
Summary:	The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using 'big data' approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune's loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune.
ISSN:	0305-1048 1362-4962
DOI:	10.1093/nar/gkx702
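
Illustration:	The summary describes finding sequence content that is common to a target group and absent from a non-target group by exact k-mer matching. The sketch below is a minimal illustration of that general idea only, not Neptune's implementation: the function names (`kmers`, `signature_kmers`) and the two fraction thresholds are hypothetical, fixed cutoffs stand in for the probabilistic models mentioned in the summary, and k-mer mismatches, reverse complements, and locus extraction are not handled.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets, non_targets, k,
                    min_target_fraction=1.0,
                    max_non_target_fraction=0.0):
    """Return k-mers frequent among targets and rare among non-targets.

    The thresholds are hypothetical fixed cutoffs chosen for illustration;
    they are assumptions, not values taken from the article.
    """
    if not targets or not non_targets:
        raise ValueError("both groups must contain at least one sequence")

    # Count, for each k-mer, how many sequences in each group contain it.
    target_counts = Counter()
    for seq in targets:
        target_counts.update(kmers(seq, k))

    non_target_counts = Counter()
    for seq in non_targets:
        non_target_counts.update(kmers(seq, k))

    selected = set()
    for kmer, count in target_counts.items():
        in_targets = count / len(targets)
        in_non_targets = non_target_counts[kmer] / len(non_targets)
        if (in_targets >= min_target_fraction
                and in_non_targets <= max_non_target_fraction):
            selected.add(kmer)
    return selected


if __name__ == "__main__":
    targets = ["ACGTACGT", "TTACGTAA"]
    non_targets = ["ACGGGG"]
    print(sorted(signature_kmers(targets, non_targets, k=4)))
```

With the toy sequences above, the k-mers shared by both targets and missing from the non-target are reported. A practical tool would then need to assemble neighbouring k-mers into loci and tolerate sequence variation, which this sketch does not attempt.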