
Statistical analysis of co-occurrence patterns in microbial presence-absence datasets

Bibliographic Details
Published in: PLoS ONE 2017-11, Vol. 12 (11), p. e0187132
Main Authors: Mainali, Kumar P, Bewick, Sharon, Thielen, Peter, Mehoke, Thomas, Breitwieser, Florian P, Paudel, Shishir, Adhikari, Arjun, Wolfe, Joshua, Slud, Eric V, Karig, David, Fagan, William F
Format: Article
Language: English
Description
Summary: Drawing on a long history in macroecology, correlation analysis of microbiome datasets is becoming a common practice for identifying relationships or shared ecological niches among bacterial taxa. However, many of the statistical issues that plague such analyses in macroscale communities remain unresolved for microbial communities. Here, we discuss problems in the analysis of microbial species correlations based on presence-absence data. We focus on presence-absence data because this information is more readily obtainable from sequencing studies, especially for whole-genome sequencing, where abundance estimation is still in its infancy. First, we show how Pearson's correlation coefficient (r) and Jaccard's index (J), two of the most common metrics for correlation analysis of presence-absence data, can contradict each other when applied to a typical microbiome dataset. In our dataset, for example, 14% of species-pairs predicted to be significantly correlated by r were not predicted to be significantly correlated using J, while 37.4% of species-pairs predicted to be significantly correlated by J were not predicted to be significantly correlated using r. Mismatch was particularly common among species-pairs with at least one rare species (
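The two metrics named in the summary can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy vectors are hypothetical, and it computes only the raw statistics, not the significance tests the summary refers to. It relies on the standard fact that Pearson's r on two binary vectors equals the phi coefficient of their 2x2 contingency table.

```python
import math

def cooccurrence_counts(x, y):
    """Return (a, b, c, d) for two equal-length 0/1 vectors:
    a = both present, b = only x present, c = only y present, d = both absent."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return a, b, c, d

def jaccard_index(x, y):
    """Jaccard's J: shared presences over sites where either species occurs.
    Shared absences (d) are ignored."""
    a, b, c, _ = cooccurrence_counts(x, y)
    denom = a + b + c
    return a / denom if denom else float("nan")

def pearson_r_binary(x, y):
    """Pearson's r for binary data (the phi coefficient).
    Unlike J, it also uses shared absences (d)."""
    a, b, c, d = cooccurrence_counts(x, y)
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else float("nan")

# Hypothetical toy data: two rare species sampled at 10 sites.
species_x = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
species_y = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

j = jaccard_index(species_x, species_y)
r = pearson_r_binary(species_x, species_y)
print(f"J = {j:.3f}, r = {r:.3f}")
```

In this toy case the eight shared absences raise r but leave J unchanged, which is one way the two metrics can rank the same species-pair differently.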
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0187132