1

This program I created in RISC-V RARS 1.3 application is designed to take a decimal number and count how many bits are in that number. The one I am testing is the decimal number 5, and this program should work for any positive number I put on t1. Here is the code I created. The program is meant to add one counter whenever the result of the AND function is not 0, but the problem I have is that the program does not stop. Is there a solution to this problem?

_start:

li t1,2 # start with decimal 5, binary 101
li t2,1 # adding counter for AND function
li t3,0 # bit counter count
li t4,0 # to compare 0

and t5,t1,t2 # t1 & t2 = t5
bne t5,t4,label # go to label if t5 != 0
beqz t5,label2 # go to label if t5 == 0

label:
addi t3,t3,1 # add one to bit count
slli t2,t2,1 # shift left
and t5,t1,t2 # t1 & new t2 = t5
bne t5,t4,label # go to label if t5 != 0
beqz t5,label2 # go to label if t5 == 0

label2:
slli t2,t2,1 # shift left
and t5,t1,t2 # t1 & new t2 = t5

.data
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
stevenu
  • 11
  • 2
  • Can you write this in C first? It is so much harder to develop algorithms in assembly when you don't yet know assembly language. I recommend C because you can verify that the C version works with some simple testing (translating a broken algorithm can be very painful to debug at the assembly level). Once you have a working C algorithm, then translate that into assembly literally, variable for variable, and line of code for line of code, without trying to make improvements during translation. – Erik Eidt Sep 24 '20 at 23:27
  • 1
    `li t1,2 # start with decimal 5` - nope, `li t1, 2` puts the decimal number `2` into the register, `0...00010`. Obviously your algorithm is broken, too, but it's not a good start when the first thing in your code doesn't match the comments. – Peter Cordes Sep 25 '20 at 00:12
  • 1
    RISC-V has an always-zero register x0. You don't need to load `0` into another register to compare against. https://en.wikichip.org/wiki/risc-v/registers. You also don't need `bne` and `beqz`, just `beqz label2` or fall through into the loop. – Peter Cordes Sep 25 '20 at 00:15

2 Answers2

0

Since you start with t2 = 1 and multiply it by 2 in each iteration, you should stop the calculation once the value of t2 becomes greater than t1.

Also, in your code, I see you that you probably intended to handle two cases:

  1. label: - this block handles the case when the current bit tested is 1, it increments the number of bits and then jumps to either label or label2. Here, you just need to add the exit condition as mentioned above
  2. label2: - this block handles the case when the current bit tested is 0, it does not change the number of bits, but it also does not seem to continue with either label or label2. I think that is should keep looking whether there are any higher 1 bits, until the exit condition t2>t1 is reached.
mcernak
  • 9,050
  • 1
  • 5
  • 13
0

If RISC-V doesn't have an instruction that counts the number of set bits efficiently (most other CPUs do now); then the next best approach is like:

    // Value is 32 pieces with 1 bit per piece

    temp1 = (value & 0x555555555) + (value & 0xAAAAAAAA) >> 1;

    // Temp1 is 16 pieces with 2 bits per piece

    temp2 = (temp1 & 0x33333333) + (temp1 & 0xCCCCCCCC) >> 2;

    // Temp2 is 8 pieces with 4 bits per piece

    temp3 = (temp2 & 0x0F0F0F0F) + (temp2 & 0xF0F0F0F0) >> 4;

    // Temp3 is 4 pieces with 8 bits per piece

    temp4 = (temp3 & 0x00FF00FF) + (temp3 & 0xFF00FF00) >> 8;

    // Temp4 is 2 pieces with 16 bits per piece

    result = (temp4 & 0x0000FFFF) + (temp2 & 0xFFFF0000) >> 16;

    // Result is the count of all bits that were set in value (or, sum of all 32 of the original 1-bit values)

I have no idea how to write this in RISC-V assembly (I'd compile it then cut&paste the resulting assembly, but don't have a compiler for RISC-V either).

Brendan
  • 35,656
  • 2
  • 39
  • 66
  • On machines with a fast multiply, you can horizontal sum bytes in a 32 or 64-bit integer with one multiply and a right shift, as explained in [How to count the number of set bits in a 32-bit integer?](https://stackoverflow.com/q/109023) That also optimizes the first step to use a `sub` and only mask once, using `i = i - ((i >> 1) & 0x55555555);`. – Peter Cordes Sep 25 '20 at 02:57
  • For the later operations, you still only need one mask per step (big advantage on a RISC where each 32-bit value has to be constructed separately with 2 instructions). Shift *before* masking, like `t2 = (t1 & 0x33..33) + ((t1>>2) & 0x33..33)`. Unlike x86 or AArch64, you can't just use each mask as an immediate for an AND instruction. Even constructing one mask from the other with a shift would cost extra instructions because you still have to right-shift the result before adding. – Peter Cordes Sep 25 '20 at 03:01