
On an ARM Cortex-M0:

How many cycles does it take to multiply two single-precision floats and store the result in a float, i.e. x = a*b; where x, a, and b are single-precision IEEE 754 floating point?

hassan789
  • how long did it take when you timed it? How did you time it? – old_timer Jun 15 '14 at 05:56
  • Given that the M0 has no FPU and has to do it in software: probably lots, and it depends on the implementation. I guess you could disassemble your floating-point library and add up the [individual instruction timings](http://infocenter.arm.com/help/topic/com.arm.doc.ddi0484c/CHDCICDF.html), but benchmarking (accounting for interrupts) is probably more sensible. – Notlikethat Jun 15 '14 at 07:57
  • I don't have any hardware to try this on, but I guess that's the only way to know. – hassan789 Jun 15 '14 at 16:00
  • The calculation time for library-based floats can change based on the input stream, so when benchmarking be sure to include a variety of numbers... – Ross Jun 16 '14 at 13:26

1 Answer


The answer will obviously depend on your compiler's implementation of software floating point. You could measure it, or you could step through the code in your debugger, count the instructions executed, and add up their cycle timings.
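If you do get hardware, measuring could look something like the sketch below. It is only an outline under assumptions: `read_counter_fn`, `elapsed_ticks` and `time_fmul` are names made up for this example, and the counter is assumed to be a 24-bit down-counter clocked at the core frequency (SysTick is the usual candidate on a Cortex-M0, where it is implemented). Interrupts should be masked while timing, as noted in the comments above.

```c
#include <stdint.h>

/* Hypothetical hook: returns the current value of a free-running
 * 24-bit DOWN-counter clocked at the core frequency. On a target with
 * CMSIS and SysTick this could simply return SysTick->VAL. */
typedef uint32_t (*read_counter_fn)(void);

#define COUNTER_MASK 0x00FFFFFFu   /* assumed counter width: 24 bits */

/* Ticks elapsed between two reads of a down-counter, tolerating one
 * wrap-around of the counter between the reads. */
uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return (start - end) & COUNTER_MASK;
}

/* Times one x = a * b. The result is written to *out so the multiply
 * cannot be optimised away. Returns ticks with the cost of reading the
 * counter subtracted. */
uint32_t time_fmul(read_counter_fn read_counter, float a, float b, float *out)
{
    volatile float va = a, vb = b;  /* volatile blocks constant folding */
    volatile float x;

    uint32_t t0 = read_counter();
    uint32_t t1 = read_counter();   /* back-to-back reads: measurement overhead */
    x = va * vb;                    /* the operation under test */
    uint32_t t2 = read_counter();

    *out = x;

    uint32_t overhead = elapsed_ticks(t0, t1);
    uint32_t total    = elapsed_ticks(t1, t2);
    return (total > overhead) ? (total - overhead) : 0u;
}
```

On the target you would set `SysTick->LOAD = 0x00FFFFFF`, enable the counter from the processor clock, and pass in a small function that returns `SysTick->VAL`. Run it over a range of operands rather than one pair, since a library routine can take different paths for different inputs. The volatile loads and store add a few cycles of their own, so treat the result as an upper bound.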

There is a question here with an answer that suggests 35 cycles on an Intel XScale, which may be broadly comparable with your target. However, that figure was for FPU emulation. With FPU emulation, an FPU instruction raises an invalid-instruction exception on hardware without an FPU, and the exception handler decodes the instruction and calls the appropriate software function. That adds a small overhead that you will not have in a direct software implementation.
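To give a feel for what a direct software implementation has to do, here is a stripped-down sketch of a single-precision multiply in plain integer code. It is not taken from any real library: `soft_fmul_normal`, `float_bits` and `bits_float` are names invented for this example, and it assumes both inputs and the result are normal numbers (no zero, subnormal, infinity, NaN, overflow or underflow handling).

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit IEEE 754 pattern and back. */
static uint32_t float_bits(float f)    { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float    bits_float(uint32_t u) { float f;    memcpy(&f, &u, sizeof f); return f; }

/* Illustrative software multiply. ASSUMPTION: both inputs and the
 * result are normal numbers; special cases are deliberately omitted. */
float soft_fmul_normal(float a, float b)
{
    uint32_t ua = float_bits(a), ub = float_bits(b);

    uint32_t sign = (ua ^ ub) & 0x80000000u;          /* sign of the product */
    int32_t  e    = (int32_t)((ua >> 23) & 0xFFu)
                  + (int32_t)((ub >> 23) & 0xFFu) - 127; /* re-biased exponent */

    /* 24-bit significands with the implicit leading 1 restored. */
    uint32_t ma = (ua & 0x007FFFFFu) | 0x00800000u;
    uint32_t mb = (ub & 0x007FFFFFu) | 0x00800000u;

    uint64_t p = (uint64_t)ma * mb;   /* 48-bit product in [2^46, 2^48) */

    /* Normalise so that bit 47 is the leading 1. */
    if (p & (1ULL << 47)) {
        e += 1;
    } else {
        p <<= 1;
    }

    uint32_t m   = (uint32_t)(p >> 24);          /* top 24 bits */
    uint32_t rem = (uint32_t)(p & 0x00FFFFFFu);  /* discarded low bits */

    /* Round to nearest, ties to even. */
    if (rem > 0x00800000u || (rem == 0x00800000u && (m & 1u))) {
        m += 1;
        if (m == 0x01000000u) {   /* rounding carried out of the significand */
            m >>= 1;
            e += 1;
        }
    }

    return bits_float(sign | ((uint32_t)e << 23) | (m & 0x007FFFFFu));
}
```

Even this cut-down version needs unpacking, a widening multiply, normalisation and rounding. The Cortex-M0 multiply instruction only produces a 32-bit result, so the 64-bit product has to be assembled from several narrower multiplies. A real library also has to handle zeros, subnormals, infinities and NaNs, which is one reason the cycle count can vary with the operands.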

Clifford