6

I am going to write some image processing programs for the Texas Instruments DaVinci platform. The tools support programming in C, but I wonder whether it is really possible to take full advantage of the DSP without resorting to assembly language. Do you know of any speed comparisons between programs written in C and in assembler on this DSP platform?

starblue
Michal Czardybon

8 Answers

11

I've used some other TI DSPs and C was usually fine. The usual approach is to start by writing everything in C and then profile the code to see if anything needs to be hand-optimised.

You can often do the optimisation in C too, by adjusting the C code until you get the assembly output you want. It's important to know how the DSP works and what ways of working are faster or slower.
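As a rough illustration of "adjusting the C code", here is a minimal sketch of one common adjustment: promising the compiler that buffers do not overlap, so it is free to reorder and overlap loads and stores. The function name `add_rows_sat` is hypothetical, and whether this changes the generated assembly on a given DSP compiler is something you would have to check in its output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating add of two 8-bit image rows.
 * The C99 'restrict' qualifiers tell the compiler that dst, a and b
 * never alias, which removes a frequent obstacle to loop scheduling. */
void add_rows_sat(uint8_t *restrict dst,
                  const uint8_t *restrict a,
                  const uint8_t *restrict b,
                  size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);  /* clamp to 255 */
    }
}
```

The idea is to compile, read the assembly for the loop, tweak the C, and repeat until the output looks like what you would have written by hand.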

Kevin ORourke
  • 5
    "You can often do the optimisation in C too, by adjusting the C code until you get the assembly output you want" -- this technique in particular has always worked great for me with Sony hardware. – Crashworks Sep 21 '09 at 10:20
  • 2
    @Crash: me too. That's all I really want a compiler to do - save me from having to write ASM. I don't care for "nanny languages" that assume I don't really know what I'm doing. – Mike Dunlavey Sep 22 '09 at 17:08
10

The TI compiler for the C64x/C64x+ DSP on the OMAP3 includes support for what TI calls "intrinsic" function calls. They're not really function calls; they are just a way to tell the compiler what assembly opcode to use for an operation that might not be directly expressible in C. It is especially useful for leveraging the SIMD opcodes in the C64x/C64x+ DSP from C.

An example might be:

A = _add2(B, C);

This SIMD instruction adds the low and high 16-bit halves of B and C independently and stores the results in the corresponding halves of A. You can't express this as a single operation in regular C, but you can do it with the intrinsic C opcodes.
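To make the semantics concrete, here is a portable emulation of the packed 16-bit add described above. It is only a sketch of what the instruction computes, not TI's implementation; `add2_emulated` and `add2_demo` are hypothetical names, and the real intrinsic maps to a single opcode rather than this sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Portable emulation of a packed 16-bit add.
 * Each 16-bit half is added independently; a carry out of the low half
 * is discarded instead of spilling into the high half. */
uint32_t add2_emulated(uint32_t b, uint32_t c)
{
    uint32_t lo = (b + c) & 0x0000FFFFu;              /* low halves, carry dropped */
    uint32_t hi = ((b >> 16) + (c >> 16)) & 0xFFFFu;  /* high halves */
    return (hi << 16) | lo;
}

/* Small self-check showing why a plain 32-bit add is not equivalent. */
void add2_demo(void)
{
    uint32_t b = 0x0001FFFFu;  /* high = 1, low = 0xFFFF */
    uint32_t c = 0x00020001u;  /* high = 2, low = 1 */
    assert(add2_emulated(b, c) == 0x00030000u);  /* low half wraps to 0 */
    assert(b + c == 0x00040000u);                /* plain add leaks the carry */
}
```

The masking and shifting is what a compiler without the intrinsic would have to generate, which is why the single opcode is attractive for pixel and sample data.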

I have used intrinsic C to get very close to what you could do with full-blown assembly language (within 5-10%). It is especially useful for video functions like filtering and motion compensation (_dotpsu4!).
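For readers who have not met `_dotpsu4`: as I understand the instruction, it multiplies four signed 8-bit values by four unsigned 8-bit values and sums the products, which is exactly one 4-tap step of a pixel filter. Below is a portable sketch of that behaviour. The names `dotpsu4_emulated` and `filter_tap4` are hypothetical, and the byte ordering is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of four products: signed bytes of 'coeffs' times unsigned bytes of 'pixels'. */
int32_t dotpsu4_emulated(uint32_t coeffs, uint32_t pixels)
{
    int32_t sum = 0;
    for (int i = 0; i < 4; ++i) {
        int32_t c = (int32_t)((coeffs >> (8 * i)) & 0xFFu);
        int32_t p = (int32_t)((pixels >> (8 * i)) & 0xFFu);
        if (c >= 128)
            c -= 256;          /* reinterpret the coefficient byte as signed */
        sum += c * p;
    }
    return sum;
}

/* One 4-tap filter step: pack the operands, then take the dot product. */
int32_t filter_tap4(const int8_t coeff[4], const uint8_t px[4])
{
    assert(coeff != NULL && px != NULL);
    uint32_t packed_c = 0, packed_p = 0;
    for (int i = 0; i < 4; ++i) {
        packed_c |= (uint32_t)(uint8_t)coeff[i] << (8 * i);
        packed_p |= (uint32_t)px[i] << (8 * i);
    }
    return dotpsu4_emulated(packed_c, packed_p);
}
```

In real code the pixels would already sit packed in memory, so a 32-bit load feeds the instruction directly and the packing loop disappears.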

I usually compile with the -al switch and look at the pipeline to try and identify what functional units are overloaded and then look at my intrinsics to see if I can rebalance the loop (if I'm using too many S units, I might see if I could change the opcode to use an M unit).

Also, it's helpful to remember that the C64x DSP has 64 registers, so load up the local variables and never assign the output of an instruction back into the same variable -- it'll negatively affect the compiler's ability to pipeline properly.
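A small sketch of the "one local per result" style, in plain C. The function name `average_rows` is hypothetical, and how much this helps depends entirely on the compiler; treat it as an illustration of the coding style, not a measured optimisation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounded average of two 8-bit rows. Every intermediate result gets its
 * own local instead of being written back into an earlier variable. */
void average_rows(uint8_t *restrict dst,
                  const uint8_t *restrict a,
                  const uint8_t *restrict b,
                  size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i) {
        const unsigned pa      = a[i];
        const unsigned pb      = b[i];
        const unsigned sum     = pa + pb;       /* not: pa = pa + pb */
        const unsigned rounded = sum + 1u;      /* not: sum = sum + 1 */
        const unsigned avg     = rounded >> 1;
        dst[i] = (uint8_t)avg;
    }
}
```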

Overdrive
7

Usually C is a good place to start. You can get the overall framework and algorithms shaken out quickly, and write most of the plumbing that moves the data around between the real math. Once that's in place and you're happy that your data structures are correct, you can look at it in a profiler and figure out which routines need to be squeezed by hand.
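If no profiler is at hand, a crude timing harness is often enough to rank routines. This is a minimal sketch using the standard `clock()` function; `invert_pixels` and `time_kernel` are hypothetical names, and on the DSP itself you would more likely use the toolchain's profiler or a hardware cycle counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef void (*kernel_fn)(uint8_t *buf, size_t n);

/* Hypothetical kernel under test: invert every pixel. */
void invert_pixels(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        buf[i] = (uint8_t)(255u - buf[i]);
}

/* CPU seconds spent in 'repeats' calls of 'fn' on the same buffer. */
double time_kernel(kernel_fn fn, uint8_t *buf, size_t n, int repeats)
{
    assert(fn != NULL && buf != NULL && repeats > 0);
    clock_t start = clock();
    for (int r = 0; r < repeats; ++r)
        fn(buf, n);
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Time each candidate routine on realistic frame sizes with enough repeats to get above the clock resolution, and only hand-tune the ones that dominate.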

Crashworks
  • @Crash: Right. What I often find is: You know what really takes time (at least the first time you write it)? Not the math. The data structure! – Mike Dunlavey Sep 26 '09 at 14:03
  • 1
    I agree. I often get more performance just by rethinking the layout of my data. – Nosredna Oct 15 '09 at 02:46
6

The C compiler (as far as I tested) does not take full advantage of the architecture.

But you can get away with it, because the DSP might be fast enough for the operations you need to do.

So it comes down to testing and profiling your code to see which parts must be sped up to get the system to work.

Christopher
  • Yes, not full, but what difference in efficiency did you get between C and asm? – Michal Czardybon Sep 21 '09 at 13:18
  • 2
    @Michael: If you want a general answer to which is faster, I think that's not a good question, because it always depends on the particular code you're talking about. That's why you need to test, profile, single step, whatever. If in the particular code you see a high fraction of time being spent in particular code, and you can see what C generates, and you can see how to do it better with ASM, then that's when ASM can beat C. There's no general answer. – Mike Dunlavey Sep 26 '09 at 13:57
2

Depends on the C compiler and your definition of "fast enough". Standard C compilers often struggle to make efficient use of special DSP hardware, such as:

  • Multiple memory banks that can be accessed in parallel
  • Fixed point data types
  • Circular buffers
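To show what the compiler has to work with when these features are written in plain C, here is a minimal sketch of a Q15 fixed-point multiply and a power-of-two circular buffer. All names (`q15_t`, `q15_mul`, `ring_t`, `ring_push`, `ring_get`, `RING_SIZE`) are hypothetical, and the multiply assumes the compiler implements `>>` on negative values as an arithmetic shift, which is common but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Q15 fixed point: value = raw / 32768, range [-1, 1). */
typedef int16_t q15_t;

/* Multiply two Q15 numbers with rounding and saturation. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product */
    p = (p + (1 << 14)) >> 15;            /* round, back to Q15 */
    if (p > INT16_MAX)
        p = INT16_MAX;                    /* only -1 * -1 overflows */
    return (q15_t)p;
}

#define RING_SIZE 8u  /* power of two, so the index wraps with a mask */

typedef struct {
    q15_t  data[RING_SIZE];
    size_t head;          /* index of the next write */
} ring_t;

void ring_push(ring_t *r, q15_t sample)
{
    assert(r != NULL);
    r->data[r->head] = sample;
    r->head = (r->head + 1u) & (RING_SIZE - 1u);
}

/* Sample written 'age' pushes ago; age 0 is the most recent. */
q15_t ring_get(const ring_t *r, size_t age)
{
    assert(r != NULL && age < RING_SIZE);
    return r->data[(r->head + RING_SIZE - 1u - age) & (RING_SIZE - 1u)];
}
```

A DSP with hardware support can do the rounding, the saturation and the index wrap as part of the addressing or multiply instruction; a generic compiler has to recognise these patterns in the C to map them onto that hardware.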
Andrew Bainbridge
2

A simple speed comparison means nothing. C is definitely more convenient than assembler. Measure the time your system actually spends: if the C code meets your speed requirements, you don't have to use assembler. If it is not fast enough, profile your code, find the most time-consuming parts (such as loops), and then optimize them!

barry
1

I would stick to C until I know there is a hotspot that could benefit from assembly coding. This is the "profiling" method I use. You could be surprised that there are ways to speed up the code that are not hotspots, but rather intermediate function calls that could be removed.

Mike Dunlavey
0

Compile using the -O3 optimisation. It is very powerful.
In the event it is not good enough, you can further optimise the generated assembly code to your liking instead of coding everything yourself in ASM from scratch.

toughQuestions