8

In the Linux kernel, I found the following code:

static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
{
#define HALF_LONG_BITS (BITS_PER_LONG / 2)
    return (((loff_t)high << HALF_LONG_BITS) << HALF_LONG_BITS) | low;
}

The code is used to combine syscall arguments into one wider variable, so for example on ia32 the offset of pwritev is specified in two 32 bit registers.

On x64, loff_t and unsigned long are both 64 bit wide. In this case, the high variable is ignored and only low is used. On ia32, loff_t is 64 bit wide and unsigned long is 32 bit wide. In this case the two arguments high and low are combined.

I was wondering why the code bit-shifts twice instead of once. There's a bit more information about this code in the commit message and in an LWN article: System calls and 64-bit architectures, but no explanation for the double bit-shift.

Georg Schölly
  • 124,188
  • 49
  • 220
  • 267

1 Answers1

4

The following warning in a test app helped my figure this out:

test.c:8:27: warning: left shift count >= width of type [-Wshift-count-overflow]
    8 |     return (((loff_t)high << (2*HALF_LONG_BITS))) | low;

The double bit-shift protects against undefined behavior. From the C spec:

6.5.7 3) ... If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

On a 64 bit machine both loff_t and long are 64 bits wide. If we did the shift at once, we would shift high by 64 bits which according to the statement above is undefined behavior. Doing it in two steps makes high into 0.


PS: I wrote a test program to investigate this and to my surprise I got different results when I replaced the two bit-shifts with a single bit-shift.

Georg Schölly
  • 124,188
  • 49
  • 220
  • 267
  • 1
    I'm probably missing something here, but if the (expected?) net result is a left shift by a number of bits exactly equal to the width of the type, why not just `return low;`? – Bob__ Aug 06 '21 at 16:27
  • 1
    @Bob__: I think `BITS_PER_LONG` is the width of `unsigned long`, but `loff_t` may be wider. I believe the idea is to have code which will pack `high` and `low` into the return value in case `loff_t` is wider than `unsigned long`, and will simply return `low` if they are the same width. – Nate Eldredge Aug 06 '21 at 18:35
  • @NateEldredge Still, if I'm not mistaken, given the `typedef`s in [/include/linux/types.h](https://elixir.bootlin.com/linux/latest/source/include/linux/types.h#L46) and [/include/uapi/asm-generic/posix_types.h](https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/posix_types.h#L88), `loff_t` ends up beeing a `long long`, so isn't it [UB](https://stackoverflow.com/questions/4009885/arithmetic-bit-shift-on-a-signed-integer) to fill it with two `unsigned long`? – Bob__ Aug 06 '21 at 19:35
  • 2
    @Bob__: The kernel is built with `-fno-strict-overflow`, so signed integer overflow is not UB but instead is guaranteed to wrap around. So I think everything is fine. On a system where `unsigned long` is 32 bits and `loff_t` is 64, you pack `high` and `low` into the return value, and if `high` is too big, all that happens is the return value is negative (which is maybe checked elsewhere). If `unsigned long` and `loff_t` are both 64 bits, then `(((loff_t)high << HALF_LONG_BITS) << HALF_LONG_BITS)` is guaranteed to evaluate to 0, and we just return `low`. – Nate Eldredge Aug 06 '21 at 20:07
  • 1
    @NateEldredge I see, thanks. I'd have written something like [this](https://godbolt.org/z/zjYoajTjo), but that's immaterial. – Bob__ Aug 06 '21 at 21:00