Scalable row-based parallel H.264 decoder on embedded multicore processors

Bibliographic Details
Published in: Signal, image and video processing, 2015-12, Vol.9 (Suppl 1), p.57-71
Main Authors: Baaklini, Elias; Rethinagiri, Santhosh; Sbeity, Hassan; Niar, Smail
Format: Article
Language:English
Description
Summary: Multimedia applications are present in most mobile hand-held devices, which still have limited battery resources. The H.264 standard currently dominates video compression, but it has high computational requirements in terms of memory, energy, and time. Many techniques have emerged to optimize parallel task granularity on multicore systems, ranging from groups of pictures down to the smallest block of pixels. This research proposes a scalable parallel technique for the motion compensation phase that is based on processing groups of macroblock rows. A lightweight dependency detection algorithm is added to the prediction phase to enable parallel execution and minimize synchronization stall time. A parallel implementation of the deblocking filter is also provided. The overall result is an efficient and highly scalable parallel H.264 decoder, evaluated on a real-board platform composed of an ARM Cortex-A9 MPCore with four processors. Experiments use various low- and high-definition video sequences. Results show a speedup of 3.3× for the motion compensation stage and an overall speedup of 2.3× on 4 cores, including communication and synchronization overhead. Energy consumption decreases by up to 63% for the whole application execution.
ISSN: 1863-1703, 1863-1711
DOI:10.1007/s11760-014-0633-8
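
Illustrative note: the summary describes splitting motion compensation into groups of macroblock rows and reports speedups on 4 cores. The sketch below is not the authors' implementation; it only shows one plausible way to assign contiguous row groups to workers and how the reported speedups translate into parallel efficiency. The function names, the even-split policy, and the 16-pixel macroblock size parameter are assumptions made for illustration, and the paper's dependency detection is not modeled.

```python
def partition_mb_rows(frame_height, num_workers, mb_size=16):
    """Split a frame's macroblock rows into contiguous groups, one per worker.

    Hypothetical helper: an even split is assumed here; the paper may size
    its row groups differently.
    """
    # Number of macroblock rows, rounding the height up to a multiple of mb_size.
    num_rows = (frame_height + mb_size - 1) // mb_size
    base, extra = divmod(num_rows, num_workers)
    groups = []
    start = 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional row each.
        size = base + (1 if worker < extra else 0)
        groups.append((start, start + size))  # half-open range [start, end)
        start += size
    return groups


def parallel_efficiency(speedup, num_cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / num_cores


if __name__ == "__main__":
    # A 1080-line frame has 68 macroblock rows, so 4 workers get 17 rows each.
    print(partition_mb_rows(1080, 4))
    # Speedups reported in the summary, measured on 4 cores.
    print(parallel_efficiency(3.3, 4))  # motion compensation stage
    print(parallel_efficiency(2.3, 4))  # whole decoder
```

With the figures from the summary, the motion compensation stage reaches roughly 82% of ideal linear scaling on 4 cores, while the whole decoder reaches roughly 58%, which is consistent with the stated communication and synchronization overhead.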