I'm using an ARM966E-S RISC CPU and was wondering how to use the instruction set extensions it apparently provides for better DSP performance, e.g. an enhanced multiply instruction.

I've read in the technical reference manual that these instruction set extensions are available, but I don't know how to use or activate them.

Can anybody help?

Thanks in advance!

Iniesta8
  • What happens if you compile `int64_t res = (int64_t)i * (int64_t)j;` and disassemble the result? Does it generate `SMULL`? The long multiplications should just work. But for saturating and parallel arithmetic you will have to use intrinsics as they don't map nicely to "C". – NickJH Oct 23 '17 at 10:17
  • Use `-march=armv5te` to tell the compiler you have the `extended` DSP instructions. The 'ARM ARM' (architecture reference manual) has details on what these things mean. There are also 'jazelle' and 's' variants. `PLD`, `LDRD` and `MULL` are additions. Look at [gcc's config/arm directory](https://gcc.gnu.org/git/?p=gcc.git;f=gcc/config/arm;hb=HEAD), which defines code generation. You can find what options you might need for your GCC version to get the compiler to emit them. – artless noise Oct 24 '17 at 15:08
  • Take a look at [get gcc to emit idiv instructions](https://stackoverflow.com/questions/15782089/gcc-to-emit-arm-idiv-instructions). It is more complex than you might think. First there is the question of whether the instructions can be used at all, then whether they are more efficient than the alternatives. The ARM pipeline and 'C' semantics may or may not permit use of the instructions. The first step is to specify a CPU that permits emission of the opcode (as per old_timer/dwelch's answer). Once you do this, you can use inline assembler even if 'C' code can't coax the compiler into using it. Older gccs have arm-cores.def. – artless noise Oct 24 '17 at 15:17
  • Also if you are deep embedded, you can use `-mcpu=arm966e-s` and it should work for some versions of GCC; definitely use `-mtune=arm966e-s`. In case you have some other compiler, you should specify. Please consider altering your tags for a particular compiler and possibly add 'C'? – artless noise Oct 24 '17 at 15:29
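The comments above can be tried out with a small C file. This is only a sketch: the function names `mul_wide` and `sat_add` are made up for illustration, and the assumption is a GCC-style toolchain that defines `__ARM_FEATURE_DSP` when targeting ARMv5TE in ARM mode (older GCC versions may not define it, in which case the portable fallback is used). Whether the first function really becomes a single `SMULL` depends on the compiler version and optimisation level, so check the disassembly.

```c
#include <stdint.h>

/* 32x32 -> 64-bit signed multiply. On ARM targets with long multiply
   support the compiler may emit a single SMULL for this. */
int64_t mul_wide(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Saturating 32-bit add. Plain C has no operator for this, so on a
   DSP-capable ARM target we fall back to inline assembler (QADD);
   everywhere else the same result is computed in portable C. */
int32_t sat_add(int32_t a, int32_t b)
{
#if defined(__GNUC__) && defined(__ARM_FEATURE_DSP)
    int32_t r;
    /* QADD also sets the sticky Q flag on saturation; it is ignored here. */
    __asm__ ("qadd %0, %1, %2" : "=r" (r) : "r" (a), "r" (b));
    return r;
#else
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
#endif
}
```

Compiling this with something like `arm-none-eabi-gcc -O2 -march=armv5te -S` and reading the generated assembly shows which instructions the compiler actually picked.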

1 Answer

Why not just try it? Or read the manual for your toolchain. For example, with gcc (binutils):

so.s

```
ldrd r0,[r2]
ldr r2,[r2]
```

test

```
arm-none-eabi-as so.s -o so.o
arm-none-eabi-as -march=armv5t so.s -o so.o
so.s: Assembler messages:
so.s:3: Error: selected processor does not support `ldrd r0,[r2]' in ARM mode
arm-none-eabi-as -march=armv5te so.s -o so.o
arm-none-eabi-objdump -D so.o

so.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <.text>:
   0:   e1c200d0    ldrd    r0, [r2]
   4:   e5922000    ldr     r2, [r2]
```

old_timer
  • But how can I use the fast instructions by default? I mean, my code is written in C. I don't want to write assembler. The compiler should always use these instructions by default... – Iniesta8 Oct 22 '17 at 10:31
  • `-march=armv5te` also works on the gcc command line; it is then up to the compiler whether or not it chooses to use them. – old_timer Oct 22 '17 at 12:29
  • If not using assembly language directly then you have to learn to tune your code to trigger the compiler to generate these instructions, if it generates them at all (grep through the gcc sources to see if they are even used, then when and why). – old_timer Oct 22 '17 at 12:31
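As a rough illustration of "tuning your code" for the compiler, a 16-bit multiply-accumulate loop is the kind of pattern a compiler targeting ARMv5TE might turn into the DSP multiply-accumulate instructions (for example `SMLABB`). The function name `dot16` is hypothetical, and there is no guarantee a given GCC version will pick those instructions; compile with `-O2 -march=armv5te -S` and inspect the output to find out.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two 16-bit vectors with a 32-bit accumulator.
   Each step is a 16x16 -> 32-bit multiply followed by an add, which is
   what the ARMv5TE DSP multiply-accumulate instructions compute.
   The accumulator does not saturate; the caller must avoid overflow. */
int32_t dot16(const int16_t *x, const int16_t *y, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += (int32_t)x[i] * (int32_t)y[i];
    return acc;
}
```

Keeping the operands as `int16_t` and widening explicitly before the multiply gives the compiler the best chance of recognising the pattern; if it still does not, inline assembler or compiler intrinsics remain the fallback.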