
I have an emulated runtime environment in which certain functions are handled outside of the guest. One of them is a strlen function, which can potentially read up to SSIZE_MAX / PTRDIFF_MAX bytes of memory (see Footnote 1):

size_t strlen(const char* str)
{
    register const char* a0 asm("a0") = str;
    register size_t      a0_out asm("a0");
    register long syscall_id asm("a7") = SYSCALL_STRLEN;

    asm volatile ("ecall" : "=r"(a0_out) :
        "r"(a0), "m"(*(const char(*)[4096]) a0), "r"(syscall_id));
    return a0_out;
}

The problem is that GCC is happy when I remove the magic 4096, but Clang isn't. GCC treats the size as (I assume) unbounded, whereas with Clang the unsized version simply does not compile, so I believe I am forced to put a number there.

Is the magic 4096 a problem? What options do I have here?

Footnote 1: In GNU C, object sizes are limited to at most PTRDIFF_MAX (which is less than SIZE_MAX) to make pointer subtraction easy and efficient.

Peter Cordes
gonzo
  • Related: https://stackoverflow.com/questions/56432259/how-can-i-indicate-that-the-memory-pointed-to-by-an-inline-asm-argument-may-be – Nate Eldredge Mar 20 '22 at 03:20
    If all else fails, a `memory` clobber would at least guarantee correctness, even if it inhibits some optimizations that might otherwise be possible. – Nate Eldredge Mar 20 '22 at 03:22
    Yes, IIRC last time I looked at how GCC optimized around an unspecified-size dummy memory operand, it treated it as unbounded. Same for runtime-variable sizes; GCC didn't try to reason about `(char (*)[n]) ptr` not interacting with accesses to `ptr[n+1]`. Hard-coding `4096` *could* be a problem for clang, or for future compiler versions, if they assume that stores past that are dead, or can be delayed. You might possibly be able to keep clang happy with a runtime-variable size, possibly using `(intptr_t)str` as a conveniently non-small integer whose value isn't known at compile time. – Peter Cordes Mar 20 '22 at 03:34
  • Using str as an unknown (but also large) value seems like a good idea to try. I might just do that! – gonzo Mar 20 '22 at 03:40
    IDK if it would hurt anything or be better/worse to hard-code `PTRDIFF_MAX` as the size. It's clearly going to extend past the object in many cases, which might or might not be a problem. Not sure. – Peter Cordes Mar 20 '22 at 04:52
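
For reference, the `memory`-clobber fallback suggested in the comments might look roughly like the sketch below. It is only an illustration, not a tested answer for the real emulator: the name `guest_strlen`, the value given to `SYSCALL_STRLEN`, and the non-RISC-V branch (a plain loop standing in for the host-side handler so the file builds and runs on an ordinary machine) are all assumptions. The dummy `"m"` operand is dropped entirely, so no magic size is needed and both GCC and Clang should accept it.

```c
#include <stddef.h>

#ifndef SYSCALL_STRLEN
#define SYSCALL_STRLEN 500   /* hypothetical syscall number, for illustration only */
#endif

/* Hypothetical name; renamed from strlen so it does not clash with libc. */
size_t guest_strlen(const char *str)
{
#if defined(__riscv)
    register const char *a0         __asm__("a0") = str;
    register size_t      a0_out     __asm__("a0");
    register long        syscall_id __asm__("a7") = SYSCALL_STRLEN;

    /* The "memory" clobber tells the compiler that the asm may read or
     * write memory not named in its operands, so pending stores to the
     * string are completed before the ecall. The cost is that values
     * cached in registers are reloaded afterwards. */
    __asm__ volatile ("ecall"
                      : "=r"(a0_out)
                      : "r"(a0), "r"(syscall_id)
                      : "memory");
    return a0_out;
#else
    /* Stand-in for the emulator's handler (assumption): an empty asm
     * with the same clobber, followed by an ordinary length loop. */
    __asm__ volatile ("" : : "r"(str) : "memory");
    size_t n = 0;
    while (str[n] != '\0')
        n++;
    return n;
#endif
}
```

The trade-off is the one already noted above: this should be correct regardless of how long the string is, but it is a heavier barrier than a precisely sized `"m"` input, so the compiler may keep fewer values in registers across the call.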

0 Answers