Explore, edit and leverage genomic annotations using Python GTF toolkit

Bibliographic Details
Published in: Bioinformatics 2019-09, Vol.35 (18), p.3487-3488
Main Authors: Lopez, F, Charbonnier, G, Kermezli, Y, Belhocine, M, Ferré, Q, Zweig, N, Aribi, M, Gonzalez, A, Spicuglia, S, Puthier, D
Format: Article
Language: English
Description
Summary: Motivation: While Python has become very popular in bioinformatics, only a limited number of libraries exist for fast manipulation of gene coordinates in Ensembl GTF format. Results: We have developed the GTF toolkit Python package (pygtftk), which aims at providing easy and powerful manipulation of gene coordinates in GTF format. For optimal performance, the core engine of pygtftk is a C dynamic library (libgtftk), while the Python API provides usability and readability for developing scripts. Based on this Python package, we have developed the gtftk command-line interface, which contains 57 sub-commands (v0.9.10) to ease handling of GTF files. These commands may be used to (i) perform basic tasks (e.g. selections, insertions, updates or deletions of features/keys), (ii) select genes/transcripts based on various criteria (e.g. size, exon number, transcription start site location, intron length, GO terms) or (iii) carry out more advanced operations such as coverage analyses of genomic features using bigWig files to create faceted read-coverage diagrams. In conclusion, the pygtftk package greatly simplifies the annotation of GTF files with external information while providing advanced tools to perform gene analyses. Availability and implementation: pygtftk and gtftk have been tested on Linux and MacOSX and are available from https://github.com/dputhier/pygtftk under the MIT license. The libgtftk dynamic library, written in C, is available from https://github.com/dputhier/libgtftk.
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/btz116
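
To make the kind of selection described in the summary concrete (point ii, selecting transcripts by exon number), here is a minimal sketch in plain Python. It deliberately does not use pygtftk or gtftk, and it does not reflect their API: the function names (`parse_attributes`, `parse_gtf`, `transcripts_with_min_exons`) and the toy records are hypothetical, chosen only for illustration. It assumes the standard nine-column, tab-separated GTF layout with `key "value";` attribute pairs.

```python
from collections import defaultdict

def parse_attributes(field):
    """Parse the 9th GTF column ('key "value";' pairs) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs

def parse_gtf(lines):
    """Yield one dict per GTF record, skipping comments and malformed lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        yield {
            "seqid": cols[0],
            "feature": cols[2],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "attrs": parse_attributes(cols[8]),
        }

def transcripts_with_min_exons(records, min_exons):
    """Return sorted transcript IDs that have at least `min_exons` exon records."""
    counts = defaultdict(int)
    for rec in records:
        if rec["feature"] == "exon":
            counts[rec["attrs"].get("transcript_id")] += 1
    return sorted(t for t, n in counts.items() if t is not None and n >= min_exons)

# Toy, made-up records (not taken from any real annotation).
EXAMPLE_GTF = [
    "# toy annotation",
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t300\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t900\t950\t.\t-\t.\tgene_id "G2"; transcript_id "T2";',
]

selected = transcripts_with_min_exons(parse_gtf(EXAMPLE_GTF), min_exons=2)
```

With the toy input, only the transcript with two exon records passes the threshold. A pure-Python pass like this is adequate for small files; the summary's point is that pygtftk delegates such work to a C library for speed.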