Semantically Defining Populations for 'Omics Research

Bibliographic Details
Published in: Biodiversity Information Science and Standards 2017-08, Vol.1, p.e20435
Main Authors: Walls, Ramona, Buttigieg, Pier Luigi
Format: Article
Language: English
Description
Summary: The study of populations is central to 'omics research, whether sequencing environmental samples, controlling for population structure when looking for genetic variation within a species, or studying the evolution of large clades. Researchers use different operational definitions of populations and communities, via the highly varied creation of operational taxonomic units (OTUs) and, in some cases, the use of unclustered sequences. The use of different clustering methods (Swarm, UCLUST, CD-HIT, etc.), even within one study type, creates very different OTUs, possibly affecting interpretation and leading to questionable reproducibility. The Population and Community Ontology (PCO) offers the semantics to clarify exactly which collection of organisms (i.e., ecological community or population) was used in an investigation. When combined with methods for standardizing observational data from the BioCollections Ontology (BCO), protocol classes from the Ontology for Biomedical Investigations (OBI), and characterization of environments from the Environment Ontology (ENVO), PCO can fully describe the methods used to derive organismal or species-based (i.e., taxonomic) OTUs used for biodiversity analysis and monitoring. PCO is not well suited to describe "OTUs" based on sequence variants that may or may not map to population- or individual-level variation (e.g., the output of some clustering algorithms). In this case, the Sequence Ontology (SO) may be more appropriate. This presentation will describe the key ontology design patterns used in the PCO and provide examples of how and when PCO and related ontologies should be used in 'omics research, with a focus on environmental/metagenomic sequencing applications.
ISSN: 2535-0897