2

Mapping signed to unsigned integers bijectively can be done using common techniques like Two's complement. Unfortunately, they fail to map small negative integers to small numbers. For compression algorithms, we often want to preserve the absolute value of numbers as much as possible: small negative and positive numbers must be mapped to small numbers.

A popular map is r(x)= -2x-1 if x<0 and r(x) = 2x if x>=0. (A similar one is r(x) = -2x if x<=0 and 2x+1 if x>0.)

Implemented naively, this map is relatively slow. Certainly much slower than merely casting signed integers to unsigned integers (which silently applies Two's Complement).

What's the fastest way?

Daniel Lemire
  • 3,470
  • 2
  • 25
  • 23
  • x > 0 instead of x ≥ 0 right? – kennytm Aug 24 '10 at 14:47
  • 1
    I don't think it makes sense to ask a performance question without specifying some hardware (and a compiler, for solutions in C). – Steve Jessop Aug 24 '10 at 15:37
  • @KennyTM: Assuming the signed input is 2's complement, it has to be `x >= 0` (not `x > 0` as the questioner says). Otherwise, MIN_INT and 0 both map to 0, contradicting "bijection". – Steve Jessop Aug 24 '10 at 15:45

3 Answers3

3

silently applies Two's complement, yes, on most mondern platforms this is just a nop the compiler just interprets your data differently. So this is really hard to beat.

For your comparision it would be good if you'd provide some benchmarks...

If I am not mistaken 2 * abs(x) + signbit can be done with a cyclic left shift of the bits. So if your platform has such an instruction, this should be easy to implement with inline assembler and should result in just one instruction at the end.

Edit: I was mistaken, this simple thing would only work with sign and value representation of negative numbers.

For two's complement you'd have to negate the result of the rotation if the input had been negative. Something like

inline uint64_t rotateL(uint64_t x) {
  asm ("rolq %0" : "=r" (x) : "r" (x));
  return x;
}

inline uint64_t value(uint64_t x) {
  uint64_t ret = rotateL(x);
  return (ret % UINT64_C(2))
    ? -ret
    : ret;
}

I looked into the assembler that it produced by gcc. Looks quite good, has just 5 instructions

rolq    %rax
movq    %rax, %rdx
negq    %rdx
testb   $1, %al
cmovne  %rdx, %rax
Jens Gustedt
  • 76,821
  • 6
  • 102
  • 177
  • If I'm not mistaken cyclic left shift of 0xFF produces 0xFF and that's not what Daniel wants – Maciej Hehl Aug 24 '10 at 15:59
  • @Maciej: right would only work for the `sign and magnitude` representation. I'll edit my answer. – Jens Gustedt Aug 24 '10 at 16:29
  • You don't need inline asm for rotates; compilers recognize the `x<<1 | x>>63` rotate idiom. [Best practices for circular shift (rotate) operations in C++](https://stackoverflow.com/q/776508) – Peter Cordes Apr 14 '21 at 09:40
1

If you operate on bytes lookup table must be the fastest.

For larger integers, CMOV-based implementation would be decent. You can also utilize SSE instructions here.

BarsMonster
  • 6,483
  • 2
  • 34
  • 47
0

If your implementation supports arithmetic right-shift of signed values, try this:

#define MAP(x) ((x)<<1 ^ (x)>>sizeof(x)*CHAR_BIT-1)

For implementations which don't have signed right-shift, you can use:

#define RSHIFT(x,n) ((x)>>n | \
  (-(1 & (x)>>sizeof(x)*CHAR_BIT-1))<<sizeof(x)*CHAR_BIT-1-n<<1))

(You should check this for correctness...)

R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711