Network Systems Designline | Program and optimize C code--Part 2

Program and optimize C code--Part 2

This second installment of a five-part series shows how to optimize DSP "kernels," i.e., inner loops. It also shows how to write fast floating-point and fractional code.

Courtesy of DSP DesignLine

[Editor's note: Part 1 introduces the basic principles of writing C code for DSP. Part 3 will explain how to access DSP features like hardware loops and circular addressing from portable C. It will be published Monday, March 5. For more programming tips, see the DSP programmer's guide.]

DSP Kernels
In the past, the performance of signal processing applications rested heavily on hand-coded "kernels." These kernel loops consisted of specialized arithmetic that consumed most of the computational load. The rest of the application consisted of generic "control code." This control code consumed only a few MIPS, and it was written in C. This division between kernels and control code heavily influenced processor design. For example, DSPs will have a set of instructions and registers specialized for DSP kernels, and may have another set appropriate for control code.

Today, the distinction between DSP kernels and control code is blurring. Newer application kernels require all sorts of operations, including operations that were previously seen only in control code. In addition, compilers are penetrating into previously hand-coded areas. These DSP kernels written in C are where performance must be considered most carefully.

Floating Point
DSP algorithms are usually conceived in floating point. Most design packages that emit C (such as MATLAB) primarily use double-precision floating point, but the majority of DSP platforms are fixed-point machines. In the past, this meant that the C code had to be transformed from floating-point into fixed-point code. This transformation requires painstaking attention to scaling in order to preserve accuracy.
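To make the scaling concern concrete, here is a minimal sketch of converting a floating-point value to a 16-bit fixed-point fraction and back. It assumes the common Q15 convention (a 16-bit integer v stands for v / 32768); the function names `float_to_q15` and `q15_to_float` are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Q15 (assumed convention): a 16-bit integer v represents v / 32768,
   so the representable range is [-1.0, 1.0 - 2^-15]. */
#define Q15_SCALE 32768.0f

/* Convert a float to Q15 with rounding and saturation.
   NaN inputs are not handled in this sketch. */
int16_t float_to_q15(float x)
{
    float scaled = x * Q15_SCALE;

    if (scaled >= 32767.0f)  return INT16_MAX;  /* +1.0 is not representable */
    if (scaled <= -32768.0f) return INT16_MIN;

    /* Round half away from zero, then truncate toward zero. */
    return (int16_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Convert Q15 back to float, e.g. to measure quantization error. */
float q15_to_float(int16_t q)
{
    return (float)q / Q15_SCALE;
}
```

Comparing `q15_to_float(float_to_q15(c))` against the original coefficient `c` gives a quick estimate of the accuracy lost at each point where a value is scaled.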

As DSPs have gotten faster, it has become practical to simply leave less-critical code in floating point in order to reduce development costs—even when the target DSP lacks native floating-point instructions. This has led to a re-evaluation of the floating-point support functions that vendors provide with their C compilers. In the past these functions were an afterthought, provided only to ensure code portability. Today, they are often carefully handcrafted. In the quest for speed, these libraries may even omit some aspects of the IEEE standard—such as standards-compliant processing of NaN values—which are mathematically useful but are seldom critical for DSP applications. This is illustrated in Figure 1, which shows reference IEEE-compliant functions for ADI's Blackfin on the right. The left-hand side shows highly optimized, non-compliant functions. (These are sample figures that do not show the entire range of performance.)


Figure 1. IEEE-compliant (right) vs. non-compliant (left) floating-point libraries.

Also consider whether you really need the 64-bit (or "double") precision that portable ANSI C code normally uses. Many applications, for example those in the automotive and audio areas, require only 32-bit (or "float") precision. Using the lower precision can double your speed, whether you use native floating-point instructions or software emulation.
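A minimal sketch of keeping a computation in single precision follows. The names `dot_f32` and `half_gain_f32` are hypothetical. The point it illustrates is a property of the C language itself: an unsuffixed constant such as 0.5 has type double, so mixing it with float operands promotes the expression to double, while the `f` suffix keeps it in float.

```c
#include <stddef.h>

/* Single-precision dot product: inputs and accumulator are all float,
   so nothing in the loop is promoted to double. */
float dot_f32(const float *x, const float *h, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * h[i];
    return acc;
}

/* Writing 0.5f (not 0.5) keeps the multiply in float. With 0.5 the
   operand x would be converted to double and the result converted back. */
float half_gain_f32(float x)
{
    return 0.5f * x;
}
```

Whether this actually doubles the speed depends on the target and its library, so it is worth measuring both variants on the intended platform.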

[Editor's note: For a great intro to floating-point arithmetic, see this tutorial.]

Fractional processing
Even if you use a high-speed library, native fractional arithmetic is a hundred times faster than software-emulated floating point. Unfortunately, fractional types are not part of portable C. As a result, it is difficult to let a standard ANSI compiler know that you want to use fractional arithmetic.

To solve this problem, you can evolve the language, either by creating your own dialect of C or by working through an international standards committee. The problem with creating your own dialect of C is that your code is no longer portable. The problem with going through standards committees is that it takes decades for the world to adopt a new coding standard.

Another approach is to enhance the semantic capability of the compiler in the hope that it will comprehend that complex chunks of C correspond to fractional operations. This is challenging, but it can be done. We'll look at an example in the next section.
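As a rough illustration of the kind of C a compiler would have to recognize, the sketch below writes a 16-bit fractional multiply using only integer operations. It assumes the Q15 convention (a 16-bit integer v stands for v / 32768); the name `q15_mul` is hypothetical, and whether a given compiler maps this pattern to a single fractional multiply instruction is not something this sketch can promise.

```c
#include <stdint.h>

/* Fractional (Q15) multiply expressed in plain integer C. */
int16_t q15_mul(int16_t a, int16_t b)
{
    /* Q15 x Q15 gives a Q30 product that fits in 32 bits. */
    int32_t p = (int32_t)a * (int32_t)b;

    /* Drop 15 fraction bits to return to Q15. Right-shifting a negative
       value is implementation-defined in C; this sketch assumes the usual
       arithmetic shift. */
    p >>= 15;

    /* Only -1.0 * -1.0 overflows; saturate it to just under +1.0. */
    if (p > INT16_MAX)
        p = INT16_MAX;

    return (int16_t)p;
}
```

The widening cast, the shift by 15 and the saturation test together express one fractional operation, which is why the compiler needs extra semantic analysis to see it as such.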

We can also offer intrinsics (or built-in functions), which map directly to single machine instructions. This produces a clumsy but efficient programming style. We'll look at an example in the following text.
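The intrinsic style can be sketched as follows. The names `frac_mul16`, `frac_add16` and `frac_dot16` are hypothetical stand-ins, not any vendor's API; a real toolchain would supply its own built-ins in a vendor header, each compiling to one instruction. Here they are given portable fallback bodies so the example stands alone, again assuming the Q15 convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fractional-multiply intrinsic (Q15 x Q15 -> Q15).
   Assumes arithmetic right shift of negative values. */
static inline int16_t frac_mul16(int16_t a, int16_t b)
{
    int32_t p = ((int32_t)a * (int32_t)b) >> 15;
    return (int16_t)(p > INT16_MAX ? INT16_MAX : p);
}

/* Hypothetical stand-in for a saturating-add intrinsic. */
static inline int16_t frac_add16(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) s = INT16_MAX;
    if (s < INT16_MIN) s = INT16_MIN;
    return (int16_t)s;
}

/* Dot product written in the intrinsic style: every arithmetic step
   is an explicit function call rather than an operator. */
int16_t frac_dot16(const int16_t *x, const int16_t *h, size_t n)
{
    int16_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = frac_add16(acc, frac_mul16(x[i], h[i]));
    return acc;
}
```

The loop body shows the clumsiness: what would be `acc += x[i] * h[i]` with a native fractional type becomes nested calls, although each call is intended to cost a single instruction.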

Given the drawbacks to all of these approaches, it is tempting to use C++ instead of C. C++ allows the programmer to define new types and overloaded operators. This may appear to be a natural way to express fractional arithmetic. However, the semantic gap between expression and intention is wider in C++ than it is in C, and this approach requires very careful coding and analysis. The C++ language is more powerful than C, but that means it does more for you automatically. For instance, C++ compilers may unexpectedly create constructors and destructors. Also, C++ style involves more indirection, which can cause problems in a compiler's alias analysis. For example, C++ programs tend to produce more temporary variables and common subexpressions, which the compiler must then analyze. As another example, data tends to exist within structs or objects, rather than as stand-alone variables.
