Microbiome Datasets Are Compositional: And This Is Not Optional

Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets g...
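
The point about an arbitrary total can be illustrated with a minimal sketch. All numbers here are hypothetical and the helper names `close` and `sequence` are illustrative assumptions, not taken from the article: two environments differ in the absolute abundance of one taxon only, but because the instrument returns a fixed number of reads, the read counts of the unchanged taxa appear to fall.

```python
def close(counts):
    """Rescale a vector of counts so that it sums to 1 (proportions)."""
    total = sum(counts)
    return [c / total for c in counts]


def sequence(abundances, depth=10_000):
    """Mimic an instrument that returns a fixed read total.

    The depth of 10,000 reads is an arbitrary illustrative value;
    it carries no information about the true total abundance.
    """
    return [round(p * depth) for p in close(abundances)]


# Hypothetical absolute abundances of taxa A, B, C in two environments.
# Only taxon A changes; B and C are identical in absolute terms.
env1 = [100, 200, 700]
env2 = [1000, 200, 700]

reads1 = sequence(env1)
reads2 = sequence(env2)

# Read counts for B and C drop although their absolute abundance did not,
# while the ratio between B and C is (up to rounding) preserved.
print("reads, environment 1:", reads1)
print("reads, environment 2:", reads2)
print("B/C ratio:", reads1[1] / reads1[2], reads2[1] / reads2[2])
```

In this sketch only the ratios between taxa survive sequencing, which is the sense in which the data are compositional; treating the read counts as absolute abundances would wrongly suggest that B and C decreased.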

Bibliographic Details
Published in: Frontiers in microbiology 2017-11, Vol.8, p.2224-2224
Main Authors: Gloor, Gregory B, Macklaim, Jean M, Pawlowsky-Glahn, Vera, Egozcue, Juan J
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c438t-4794289d8a60d3137044ff6d5c06f2c9a6f5ef19ced2ad5feb2a0d0a3f6918ed3
cites cdi_FETCH-LOGICAL-c438t-4794289d8a60d3137044ff6d5c06f2c9a6f5ef19ced2ad5feb2a0d0a3f6918ed3
container_end_page 2224
container_issue
container_start_page 2224
container_title Frontiers in microbiology
container_volume 8
creator Gloor, Gregory B
Macklaim, Jean M
Pawlowsky-Glahn, Vera
Egozcue, Juan J
description Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.
doi_str_mv 10.3389/fmicb.2017.02224
format article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5695134</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1970630370</sourcerecordid><originalsourceid>FETCH-LOGICAL-c438t-4794289d8a60d3137044ff6d5c06f2c9a6f5ef19ced2ad5feb2a0d0a3f6918ed3</originalsourceid><addsrcrecordid>eNpVkclLAzEUxoMoVqp3TzJHL60vy2RmPCilruByqeAtpFlsZGZSk6ngf2-6WGogG-99X97LD6FTDENKy-rCNk5NhwRwMQRCCNtDR5hzNqBA3vd3zj10EuMnpMGApPUQ9UiFy6KkxRG6fnYq-KnzjcluZCej6WI2CiYb-2buo-ucb2V9mY1anU1mLmaPMXvxXfY6X0eO0YGVdTQnm72P3u5uJ-OHwdPr_eN49DRQjJbdgBUVI2WlS8lBU0wLYMxarnMF3BJVSW5zY3GljCZS59ZMiQQNklqeSjWa9tHV2ne-mDZGK9N2QdZiHlwjw4_w0on_kdbNxIf_FjmvckxZMsBrAxUXSgSjTFCyWwm3l-UkUBCRvg0IT5rzzaPBfy1M7ETjojJ1LVvjF1HgqgBOIXWTUmFjH3yMwdhtaRjEEpdY4RJLXGKFK0nOdlvaCv7g0F-J75Gf</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1970630370</pqid></control><display><type>article</type><title>Microbiome Datasets Are Compositional: And This Is Not Optional</title><source>PubMed Central</source><creator>Gloor, Gregory B ; Macklaim, Jean M ; Pawlowsky-Glahn, Vera ; Egozcue, Juan J</creator><creatorcontrib>Gloor, Gregory B ; Macklaim, Jean M ; Pawlowsky-Glahn, Vera ; Egozcue, Juan J</creatorcontrib><description>Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. 
The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.</description><identifier>ISSN: 1664-302X</identifier><identifier>EISSN: 1664-302X</identifier><identifier>DOI: 10.3389/fmicb.2017.02224</identifier><identifier>PMID: 29187837</identifier><language>eng</language><publisher>Switzerland: Frontiers Media S.A</publisher><subject>Aparell digestiu ; Bayesian estimation ; Bayesian statistical decision theory ; Biologia ; Ciències de la salut ; compositional data ; correlation ; count normalization ; Dietètica i nutrició ; Estadística aplicada ; high-throughput sequencing ; Matemàtiques i estadística ; Medicina ; Microbiology ; Microbiota ; relative abundance ; Àrees temàtiques de la UPC</subject><ispartof>Frontiers in microbiology, 2017-11, Vol.8, p.2224-2224</ispartof><rights>Attribution 3.0 Spain info:eu-repo/semantics/openAccess &lt;a href="http://creativecommons.org/licenses/by/3.0/es/"&gt;http://creativecommons.org/licenses/by/3.0/es/&lt;/a&gt;</rights><rights>Copyright © 2017 Gloor, Macklaim, Pawlowsky-Glahn and Egozcue. 
2017 Gloor, Macklaim, Pawlowsky-Glahn and Egozcue</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c438t-4794289d8a60d3137044ff6d5c06f2c9a6f5ef19ced2ad5feb2a0d0a3f6918ed3</citedby><cites>FETCH-LOGICAL-c438t-4794289d8a60d3137044ff6d5c06f2c9a6f5ef19ced2ad5feb2a0d0a3f6918ed3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5695134/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5695134/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,315,733,786,790,891,27957,27958,53827,53829</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29187837$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Gloor, Gregory B</creatorcontrib><creatorcontrib>Macklaim, Jean M</creatorcontrib><creatorcontrib>Pawlowsky-Glahn, Vera</creatorcontrib><creatorcontrib>Egozcue, Juan J</creatorcontrib><title>Microbiome Datasets Are Compositional: And This Is Not Optional</title><title>Frontiers in microbiology</title><addtitle>Front Microbiol</addtitle><description>Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. 
The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.</description><subject>Aparell digestiu</subject><subject>Bayesian estimation</subject><subject>Bayesian statistical decision theory</subject><subject>Biologia</subject><subject>Ciències de la salut</subject><subject>compositional data</subject><subject>correlation</subject><subject>count normalization</subject><subject>Dietètica i nutrició</subject><subject>Estadística aplicada</subject><subject>high-throughput sequencing</subject><subject>Matemàtiques i estadística</subject><subject>Medicina</subject><subject>Microbiology</subject><subject>Microbiota</subject><subject>relative abundance</subject><subject>Àrees temàtiques de la UPC</subject><issn>1664-302X</issn><issn>1664-302X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><recordid>eNpVkclLAzEUxoMoVqp3TzJHL60vy2RmPCilruByqeAtpFlsZGZSk6ngf2-6WGogG-99X97LD6FTDENKy-rCNk5NhwRwMQRCCNtDR5hzNqBA3vd3zj10EuMnpMGApPUQ9UiFy6KkxRG6fnYq-KnzjcluZCej6WI2CiYb-2buo-ucb2V9mY1anU1mLmaPMXvxXfY6X0eO0YGVdTQnm72P3u5uJ-OHwdPr_eN49DRQjJbdgBUVI2WlS8lBU0wLYMxarnMF3BJVSW5zY3GljCZS59ZMiQQNklqeSjWa9tHV2ne-mDZGK9N2QdZiHlwjw4_w0on_kdbNxIf_FjmvckxZMsBrAxUXSgSjTFCyWwm3l-UkUBCRvg0IT5rzzaPBfy1M7ETjojJ1LVvjF1HgqgBOIXWTUmFjH3yMwdhtaRjEEpdY4RJLXGKFK0nOdlvaCv7g0F-J75Gf</recordid><startdate>20171115</startdate><enddate>20171115</enddate><creator>Gloor, Gregory B</creator><creator>Macklaim, Jean M</creator><creator>Pawlowsky-Glahn, 
Vera</creator><creator>Egozcue, Juan J</creator><general>Frontiers Media S.A</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>XX2</scope><scope>5PM</scope></search><sort><creationdate>20171115</creationdate><title>Microbiome Datasets Are Compositional: And This Is Not Optional</title><author>Gloor, Gregory B ; Macklaim, Jean M ; Pawlowsky-Glahn, Vera ; Egozcue, Juan J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c438t-4794289d8a60d3137044ff6d5c06f2c9a6f5ef19ced2ad5feb2a0d0a3f6918ed3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Aparell digestiu</topic><topic>Bayesian estimation</topic><topic>Bayesian statistical decision theory</topic><topic>Biologia</topic><topic>Ciències de la salut</topic><topic>compositional data</topic><topic>correlation</topic><topic>count normalization</topic><topic>Dietètica i nutrició</topic><topic>Estadística aplicada</topic><topic>high-throughput sequencing</topic><topic>Matemàtiques i estadística</topic><topic>Medicina</topic><topic>Microbiology</topic><topic>Microbiota</topic><topic>relative abundance</topic><topic>Àrees temàtiques de la UPC</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gloor, Gregory B</creatorcontrib><creatorcontrib>Macklaim, Jean M</creatorcontrib><creatorcontrib>Pawlowsky-Glahn, Vera</creatorcontrib><creatorcontrib>Egozcue, Juan J</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Recercat</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Frontiers in microbiology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gloor, Gregory B</au><au>Macklaim, Jean M</au><au>Pawlowsky-Glahn, Vera</au><au>Egozcue, Juan 
J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Microbiome Datasets Are Compositional: And This Is Not Optional</atitle><jtitle>Frontiers in microbiology</jtitle><addtitle>Front Microbiol</addtitle><date>2017-11-15</date><risdate>2017</risdate><volume>8</volume><spage>2224</spage><epage>2224</epage><pages>2224-2224</pages><issn>1664-302X</issn><eissn>1664-302X</eissn><notes>ObjectType-Article-2</notes><notes>SourceType-Scholarly Journals-1</notes><notes>ObjectType-Feature-3</notes><notes>content type line 23</notes><notes>ObjectType-Review-1</notes><notes>Edited by: Jessica Galloway-Pena, University of Texas MD Anderson Cancer Center, United States</notes><notes>Reviewed by: Ionas Erb, Centre for Genomic Regulation, Spain; Jennifer Stearns, McMaster University, Canada</notes><notes>This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology</notes><abstract>Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. 
We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.</abstract><cop>Switzerland</cop><pub>Frontiers Media S.A</pub><pmid>29187837</pmid><doi>10.3389/fmicb.2017.02224</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1664-302X
ispartof Frontiers in microbiology, 2017-11, Vol.8, p.2224-2224
issn 1664-302X
1664-302X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5695134
source PubMed Central
subjects Aparell digestiu
Bayesian estimation
Bayesian statistical decision theory
Biologia
Ciències de la salut
compositional data
correlation
count normalization
Dietètica i nutrició
Estadística aplicada
high-throughput sequencing
Matemàtiques i estadística
Medicina
Microbiology
Microbiota
relative abundance
Àrees temàtiques de la UPC
title Microbiome Datasets Are Compositional: And This Is Not Optional
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-09-21T07%3A54%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Microbiome%20Datasets%20Are%20Compositional:%20And%20This%20Is%20Not%20Optional&rft.jtitle=Frontiers%20in%20microbiology&rft.au=Gloor,%20Gregory%20B&rft.date=2017-11-15&rft.volume=8&rft.spage=2224&rft.epage=2224&rft.pages=2224-2224&rft.issn=1664-302X&rft.eissn=1664-302X&rft_id=info:doi/10.3389/fmicb.2017.02224&rft_dat=%3Cproquest_pubme%3E1970630370%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c438t-4794289d8a60d3137044ff6d5c06f2c9a6f5ef19ced2ad5feb2a0d0a3f6918ed3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1970630370&rft_id=info:pmid/29187837&rfr_iscdi=true