Video codecs in SoCs using OCP-based programmable accelerator design | Network Systems Designline


Video codecs in SoCs using OCP-based programmable accelerator design

Flexibility is increasingly necessary when supporting multiple standards, such as the VC-1 and H.264 video codecs within a single SoC. This flexibility can be achieved by having programmable state machines instead of hardwired state machines.


Courtesy of Video Imaging DesignLine

The Open Core Protocol (OCP) standardizes communication and infrastructure in SoC designs and thereby ensures interoperability between IP blocks. Using OCP, system-on-chip (SoC) designers can analyze and evaluate various processor, interconnect, memory and peripheral IP alternatives during sub-system or platform architecture exploration. While processors are mainly obtained from third-party IP providers, SoC designers differentiate their designs through the overall SoC architecture, the algorithms and the implementation of specific blocks. Those specific blocks are, to a large extent, hardware accelerators, where data-path and implementation have been the key differentiators. With the increasing data rates, functionality and complexity of today's video and wireless standards, the design efficiency and flexibility of those blocks are becoming the most important differentiators.

Flexibility is becoming crucial for efficient design re-use in SoCs and their derivatives, where features and functionality are added over time. Furthermore, flexibility is increasingly necessary when supporting multiple standards (and modes), such as the VC-1 and H.264 video codecs, within a single SoC. This flexibility can be achieved by using programmable state machines instead of hardwired state machines in those blocks. As a bonus, programmability allows late changes to be made in software, mitigating the risk of design errors.

Next-generation designs, especially for new video and wireless standards, have to handle enormous data rates, which result in tremendous throughput and computing requirements. Programmability must not come at the cost of the reduced performance or energy efficiency that standard processors show compared to hardwired logic. Therefore, more and more designers are adopting a design paradigm that combines the advantages of processors and hardwired logic in so-called Programmable Accelerators.

A CoWare customer has presented a programmable accelerator for a video deblocking filter unit that handles standard resolution in set-top boxes at 160 MHz. In addition, CoWare has shown design examples of a programmable accelerator for a video deblocking filter unit that operates at 200 MHz, supports full high-definition resolution and frame rate, and is at the same time re-usable for both the VC-1 and the H.264 video codecs.

This performance is achieved by deploying whatever computer architecture features the application calls for. The data-path of a Programmable Accelerator is likely to be massively parallel and highly specialized for certain tasks. As in a hardwired implementation, functional units can execute in parallel. However, the control of the functional units is not fixed, as it is in a hardwired implementation, but is taken over by an instruction decoder and a program, as in a processor. This makes the function reusable in different applications and in variations of an algorithm. Advanced programming schemes such as software pipelining, combined with highly specialized parallel data paths, allow optimal utilization of the functional units. This achieves the highest throughput at the lowest clock frequency. The functional units can communicate via dedicated registers and buffers and are not limited, for instance, by the bit-width and size of a general-purpose register file, as is the case for processor instruction-set extensions.
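The idea of program-controlled functional units can be illustrated with a small software model. The following is only a sketch under stated assumptions: the unit names (`load`, `filt_mode_a`, `filt_mode_b`, `store`), the register names and the filter arithmetic are hypothetical placeholders and are not the VC-1 or H.264 deblocking filters. Each instruction word lists the units that fire in one cycle, and all units read the register state from before that cycle, which is what lets a software-pipelined program keep every unit busy.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Word = Tuple[str, ...]  # one instruction word: the units that fire in a cycle

# Hypothetical functional units. Each reads the pre-cycle state and returns
# the dedicated registers it writes. The arithmetic is a placeholder only.
def load(s: State) -> State:
    return {"a": s["src"][s["idx"]], "idx": s["idx"] + 1}

def filt_mode_a(s: State) -> State:
    return {"b": (s["a"] + 1) >> 1}

def filt_mode_b(s: State) -> State:
    return {"b": (s["a"] + 2) >> 2}

def store(s: State) -> State:
    return {"out": s["out"] + [s["b"]]}

UNITS: Dict[str, Callable[[State], State]] = {
    "load": load, "filt_mode_a": filt_mode_a,
    "filt_mode_b": filt_mode_b, "store": store,
}

def run(program: List[Word], src: List[int]) -> List[int]:
    """Decode one instruction word per cycle and fire its units in parallel."""
    state: State = {"src": src, "idx": 0, "a": 0, "b": 0, "out": []}
    for word in program:
        updates: State = {}
        for name in word:
            updates.update(UNITS[name](state))  # every unit sees the old state
        state.update(updates)                   # registers latch at cycle end
    return state["out"]

def pipelined_program(filter_unit: str, n: int) -> List[Word]:
    """Software-pipelined schedule for n >= 2 samples: n + 2 cycles in total."""
    prologue = [("load",), ("load", filter_unit)]
    steady = [("load", filter_unit, "store")] * (n - 2)
    epilogue = [(filter_unit, "store"), ("store",)]
    return prologue + steady + epilogue

samples = [4, 8, 12, 16]
result_a = run(pipelined_program("filt_mode_a", len(samples)), samples)
result_b = run(pipelined_program("filt_mode_b", len(samples)), samples)
```

The same units serve both programs; only the instruction words differ. In the steady state all three stages fire in every cycle, so the schedule takes n + 2 cycles rather than the 3n a strictly sequential schedule would need.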

Video codecs in particular offer a huge amount of data parallelism because of their block-based structure: most computations are performed not on a single pixel but on a block of pixels. Thus, dedicated acceleration units with a wide data-path (e.g. 16 x 16 x 8 bit = 2048 bit) speed up the codec by up to three orders of magnitude compared to a pure software solution. However, designers cannot afford to pay for the performance of hardware accelerators with limited flexibility. Especially on the encoder side, flexibility and programmability are key, because the heuristics for, e.g., motion estimation are the critical factor for better compression, better quality and thus better products. In a Programmable Accelerator the designer can break up the accelerator data-path into small, highly re-usable and programmable units. This is possible because of the tight link between the acceleration units and the control software in a Programmable Accelerator. In contrast, traditional hardwired accelerators have to be controlled by a separate controller in the SoC. Here, synchronization and scheduling via interrupts and memory-mapped register interfaces cost expensive cycles. Therefore, a task implemented on a hardwired accelerator had to be large enough to justify both the control and synchronization overhead and the loss of flexibility.
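The data-path width quoted above, and the reason a hardwired accelerator needs large tasks, can be checked with a few lines of arithmetic. This is a simple illustrative model, not a measurement: the function names and all cycle counts are assumptions chosen only to show the trend.

```python
def datapath_bits(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Bits needed to process one whole pixel block in a single step."""
    return width_px * height_px * bits_per_pixel

def effective_speedup(sw_cycles: float, hw_cycles: float,
                      overhead_cycles: float) -> float:
    """Speedup over software once per-task control overhead is included."""
    return sw_cycles / (hw_cycles + overhead_cycles)

# 16 x 16 block of 8-bit pixels, as in the text.
block_bits = datapath_bits(16, 16, 8)

# Hypothetical small task: 1000 software cycles, 1 accelerator cycle.
tight_control = effective_speedup(1000, 1, 0)      # control runs on the accelerator
remote_control = effective_speedup(1000, 1, 199)   # interrupt/register overhead

# Hypothetical larger task with the same fixed overhead per invocation.
large_task = effective_speedup(100_000, 100, 199)
```

With these assumed numbers, the fixed overhead reduces the small task's speedup from 1000x to 5x, while the larger task still gains more than 300x. That is the pressure toward coarse-grained, less flexible hardwired blocks that the paragraph describes.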

However, in a Programmable Accelerator these acceleration units can be controlled by control software running on the accelerator itself, without additional overhead. In a video encoder, for example, this enables designers to keep improving encoder quality by tuning software-programmable heuristics, rather than hardwired ones, even after the hardware architecture is fixed.
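As a rough illustration of a software-programmable heuristic, the sketch below keeps one fixed block-matching unit and swaps only the search strategy that drives it. Everything here is an assumption made for illustration: `sad` stands in for a wide hardware unit, `full_search` and `cross_search` are hypothetical candidate generators, and the tiny synthetic frames are not real video data.

```python
from typing import List, Tuple

Frame = List[List[int]]
Vec = Tuple[int, int]

def sad(cur: Frame, ref: Frame, bx: int, by: int, v: Vec, n: int) -> int:
    """Sum of absolute differences over an n x n block (the fixed unit)."""
    dx, dy = v
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

def full_search(r: int) -> List[Vec]:
    """Heuristic 1: every displacement within +/- r."""
    return [(dx, dy) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]

def cross_search(r: int) -> List[Vec]:
    """Heuristic 2: only displacements on the horizontal and vertical axes."""
    axis = [d for d in range(-r, r + 1) if d != 0]
    return [(0, 0)] + [(d, 0) for d in axis] + [(0, d) for d in axis]

def best_vector(cur: Frame, ref: Frame, bx: int, by: int, n: int,
                candidates: List[Vec]) -> Vec:
    """Control code: evaluate the candidates and keep the lowest SAD."""
    h, w = len(ref), len(ref[0])
    valid = [(dx, dy) for dx, dy in candidates
             if 0 <= bx + dx and bx + dx + n <= w
             and 0 <= by + dy and by + dy + n <= h]
    return min(valid, key=lambda v: sad(cur, ref, bx, by, v, n))

# Synthetic 8 x 8 reference frame; the current frame is the reference
# shifted by (dx, dy) = (2, 1), clamped at the border.
ref = [[x * x + 3 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]

mv_full = best_vector(cur, ref, 2, 2, 2, full_search(2))
mv_cross = best_vector(cur, ref, 2, 2, 2, cross_search(2))
```

Here the exhaustive heuristic recovers the true displacement, while the cheaper cross pattern evaluates fewer candidates and settles for an on-axis vector with a nonzero residual. Choosing between such trade-offs changes only the control code, not the matching unit.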
