0

The problem is to convert string to int in RISC-V

if any char that is not 0~9 exist, return -1 immediately

but I wonder if there's any way to check it by using minimum instruction

my way is to put 48 and 57 (which correspond to 0~9 in ASCII) in temp register,
and use 2 branches, firstly check <=57, then check >=48

But it use too many instruction, and require additional temp register to store 48 and 57. Is there any other efficient way?

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
王韋翰
  • 25
  • 1
  • 8
  • 2
    Yes, since you have to subtract `'0'` anyway, do that and then unsigned compare `c <= 9` or `c < 10`. See [What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?](https://stackoverflow.com/a/54585515) for the range-check trick. [NASM Assembly convert input to integer?](https://stackoverflow.com/a/49548057) is an x86 asm implementation of a loop using that idea. It should translate well to RISC-V; try writing it in C and using a compiler. – Peter Cordes Oct 21 '20 at 05:05
  • I was curious what I could get GCC / clang to emit for a C version; turns out not as good as my hand-written x86 asm so I wrote up an answer. Looks like some missed optimizations for the RISC-V version, too. BTW, gcc/clang know the range-check trick I mentioned; you don't usually need to implement it manually. But combining it with `-= '0'` to convert to an integer is useful here. – Peter Cordes Oct 21 '20 at 13:32

1 Answers1

0

Yes, since you have to subtract '0' anyway, do that and then unsigned compare c <= 9 or c < 10. See What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa? for the range-check trick.

We can do this in C and then see how it compiles, as a starting point for a compact RISC-V implementation. This C is structured like the asm in NASM Assembly convert input to integer?, in hopes that GCC or clang will use similar loop structure. If you translate it by hand, you might want this kind of loop structure, or tweak it for better software-pipelining on an in-order RISC-V, especially to hide load-use latency. This loop structure is great on a modern x86 where OoO speculative exec hides branch and load-use latency.

// C intentionally written exactly like hand-written asm
// Translate this to asm by hand, including the loop structure.
// or compile it if you want more bloated asm.

unsigned str_to_uint(const unsigned char *p) {
    unsigned dig = *p - '0';
    unsigned total = dig;  // peel first iter, optimize away the  + 0 * 10
    if (total < 10)        // <10 can share a constant with *10
        goto loop_entry;
    else // fall through to the uncommon case of no valid digits
        return 0;

    do {
        total = total*10 + dig;
     loop_entry:            // branch target = loop entry point
        dig = *++p - '0';
    } while(dig < 10);

    return total;
}

I skip the total * 10 + dig on the first iteration using a taken branch, so we might as well have that be our entry into the loop to minimize the amount of total code.

Another option would be to peel another loop iteration fall into the top of the loop. This is what GCC and clang choose, when compiling with -O3 or -O2. With -Os gcc de-optimizes it to a loop with a j at the bottom and btgu break in the middle. (Godbolt compiler explorer). I don't know any -march= RISC-V arch or tune options to try.

So if you want a good balance of code-size and efficiency (especially for the often-common case of 1 or 2 digit numbers), you should probably "compile" it by hand.

GCC uses (x<<3) + (x<<1) to multiply by 10; clang uses mul (and inside the loop does share a constant between mul and the bltu loop branch. Unfortunately outside the loop clang compares against 9, like 9 < total, so it needs both constants. (Does RISC-V have a bge <= compare? IDK, TODO / edit welcome on whether this is a missed optimization or not).

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847