3

The ARM7-command set offers efficient ways to right rotate 32-bit values by an arbitrary amount in assembler. For the 2nd operand of an operation it is even "for free" by specifying ror #n as shifter operand, but for 64-bit integers no direct support by the instruction set is given. Besides the special cases of rotating by 1, 31, 33 or 63 bit positions (not to mention 0 or 32), I only know how to rotate a 64-bit value using four instructions (it's quite easy, so I don't write it here). In the four special cases I can reduce this to three instructions, but I don't know how to do it in general. So here is my question:

Given a 64-bit value in two registers, say R0 and R1, is it possible to right rotate this value by n positions (for arbitrary n) with just three ARM7 instructions?

starblue
  • 55,348
  • 14
  • 97
  • 151
Whoever
  • 33
  • 1
  • 7

2 Answers2

2

If a register (e.g. r4) happens to hold the proper magic constant (1 shifted left by the desired left-rotate amount) I think one can do it in two instructions:

  umull r3,r2,r1,r4
  umlal r2,r3,r0,r4

Slower than using four single-cycle instructions, but even if one has to load r4 with the proper constant it's still more compact than the four-instruction methods.

supercat
  • 77,689
  • 9
  • 166
  • 211
  • Thanks, as one can set up the correct value in r4 in one command, this answers my question! But - as you guessed - I was looking for a more efficient version, so I'll wait for maybe a week hoping for a three-commands-in-three-cycles-solution before accepting your answer. – Whoever Mar 30 '11 at 12:00
  • You need to chop off and assemble four partial words, and none of the ARM7-TDMI instructions except for the multiplies can write to, nor apply a shift to, more than one register. – supercat Mar 30 '11 at 14:39
  • I don't have much hope in the existence of a three-commands-in-three-cycles solutions either. But one never knows. Maybe there is a magic first command which allows to finish off the problem with the next two commands. – Whoever Mar 31 '11 at 08:17
1

If there is a solution to this, gcc also doesn't recognize it:

unsigned long long int reg64 = random_value;
unsigned int n = shift_value;
reg64 = (reg64 >> (n%64)) | (reg64 << ((64-n)%64));

results in the following:

n=1:

MOVS R2, R0, LSR #1
MOV R3, R1, RRX
ORR R2, R2, R1, ASL #31

n=2-31:

MOV R2, R0, LSR #n
ORR R2, R2, R1, ASL #32-n
MOV R3, R0, ASL #32-n
ORR R3, R3, R1, LSR #n

n=33-62:

MOV R3, R0, ASL #64-n
ORR R3, R3, R1, LSR #n-32
MOV R2, R0, LSR, #n-32
ORR R2, R2, R1, ASL #64-n

n=63:

ADDS R2, R0, R0
ADC R3, R1, R1
ORR R2, R2, R1, LSR #31
rudi-moore
  • 2,650
  • 1
  • 19
  • 16
  • Thanks for trying out the gcc. The fact that it needs four commands for n=33 (instead of using the formula for n=1 with R0 and R1 exchanged) shows that is not very optimized for rotating 64-bit integers in ARM. – Whoever Mar 29 '11 at 08:53
  • In general, the ARM ISA is very hard for compilers. It's meant for use by humans. – ninjalj Mar 29 '11 at 18:30
  • @ninjalj: I'd guess that some human hard-coded the special cases n=1 and n=63 in the gcc. – Whoever Mar 30 '11 at 08:46