Genome assembly and transcriptome resource for river buffalo, Bubalus bubalis (2n = 50)

Bibliographic Details
Published in: GigaScience 2017-10, Vol.6 (10), p.1-6
Main Authors: Williams, John L, Iamartino, Daniela, Pruitt, Kim D, Sonstegard, Tad, Smith, Timothy P L, Low, Wai Yee, Biagini, Tommaso, Bomba, Lorenzo, Capomaccio, Stefano, Castiglioni, Bianca, Coletta, Angelo, Corrado, Federica, Ferré, Fabrizio, Iannuzzi, Leopoldo, Lawley, Cynthia, Macciotta, Nicolò, McClure, Matthew, Mancini, Giordano, Matassino, Donato, Mazza, Raffaele, Milanesi, Marco, Moioli, Bianca, Morandi, Nicola, Ramunno, Luigi, Peretti, Vincenzo, Pilla, Fabio, Ramelli, Paola, Schroeder, Steven, Strozzi, Francesco, Thibaud-Nissen, Francoise, Zicarelli, Luigi, Ajmone-Marsan, Paolo, Valentini, Alessio, Chillemi, Giovanni, Zimin, Aleksey
Format: Article
Language: English
Description
Summary: Water buffalo is a globally important species for agriculture and local economies. A de novo assembled, well-annotated reference sequence for the water buffalo is an important prerequisite for studying the biology of this species, and is necessary to manage genetic diversity and to use modern breeding and genomic selection techniques. However, no such genome assembly has been previously reported. There are 2 species of domestic water buffalo, the river (2n = 50) and the swamp (2n = 48) buffalo. Here we describe a draft quality reference sequence for the river buffalo created from Illumina GA and Roche 454 short read sequences using the MaSuRCA assembler. The assembled sequence is 2.83 Gb, consisting of 366 983 scaffolds with a scaffold N50 of 1.41 Mb and contig N50 of 21 398 bp. Annotation of the genome was supported by transcriptome data from 30 tissues and identified 21 711 predicted protein coding genes. Searches for complete mammalian BUSCO gene groups found 98.6% of curated single copy orthologs present among predicted genes, which suggests a high level of completeness of the genome. The annotated sequence is available from NCBI at accession GCA_000471725.1.
ISSN: 2047-217X
DOI: 10.1093/gigascience/gix088
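
Note on the N50 figures in the summary: N50 is conventionally the length L such that sequences of length L or longer together cover at least half of the total assembled bases. The sketch below illustrates that conventional definition only; it is not taken from the article, the function name `n50` is hypothetical, and the lengths are made-up toy values rather than data from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Conventional definition (an assumption here, not quoted from the
    article): sort lengths from longest to shortest and return the
    length at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths, invented for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

Applied to scaffold lengths this gives a scaffold N50, and applied to contig lengths it gives a contig N50, which is why the summary reports the two values separately.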