0

I am starring at the original JPEG standard (ITU 81), in particular the figure Figure F.12: extending the sign bit of a decoded value in V:

For reference the SLL terms means: shift left logical operation (see page 15 of the PDF)

Figure F.12: extend sign bit.

Now the famous libjpeg implementation decided to implement it this way (quite direct transcription):

/*
 * Figure F.12: extend sign bit.
 * On some machines, a shift and add will be faster than a table lookup.
 */

#ifdef AVOID_TABLES

#define HUFF_EXTEND(x,s)  ((x) < (1<<((s)-1)) ? (x) + (((-1)<<(s)) + 1) : (x))

#else

#define HUFF_EXTEND(x,s)  ((x) < extend_test[s] ? (x) + extend_offset[s] : (x))

static const int extend_test[16] =   /* entry n is 2**(n-1) */
  { 0, 0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040, 0x0080,
    0x0100, 0x0200, 0x0400, 0x0800, 0x1000, 0x2000, 0x4000 };

static const int extend_offset[16] = /* entry n is (-1 << n) + 1 */
  { 0, ((-1)<<1) + 1, ((-1)<<2) + 1, ((-1)<<3) + 1, ((-1)<<4) + 1,
    ((-1)<<5) + 1, ((-1)<<6) + 1, ((-1)<<7) + 1, ((-1)<<8) + 1,
    ((-1)<<9) + 1, ((-1)<<10) + 1, ((-1)<<11) + 1, ((-1)<<12) + 1,
    ((-1)<<13) + 1, ((-1)<<14) + 1, ((-1)<<15) + 1 };

#endif /* AVOID_TABLES */

Quite obviously shifting a negative signed value is UB, so I am wondering what the original author of the JPEG standard actually meant here. Is the JPEG standard limited to Two's complement number representation ?

malat
  • 12,152
  • 13
  • 89
  • 158
  • as far as I know, shifting a negative signed value is implementation defined (as long as the shift amount is positive and within `sizeof(operator) * CHAR_BIT`) – bolov Nov 09 '16 at 14:12
  • 3
    @bolov, *right*-shifting a negative value is implementation-defined. *Left*-shifting a value of a signed type, on the other hand, is ***un***defined unless the original value is non-negative and the mathematical result is representable in the result type. In particular, the OP is quite correct that left-shifting -1 produces UB. – John Bollinger Nov 09 '16 at 14:17
  • @JohnBollinger thank you. There is always something to (re)learn about C or C++. – bolov Nov 09 '16 at 14:19
  • @malat, sorry, but.... where do you see negative shiftings? I don't see them, just shifting negative numbers... but never a negative shift. – Luis Colorado Nov 14 '16 at 09:08
  • @JohnBollinger, right shifting a negative value is not implementation defined (nor undefined behaviour). It's defined as equivalent to the two's complement division by two, so the sign bit is extended (1 turns into 11 and 0 turns into 00 on the most signifiant part of the operand) In C, right shifting a `signed int` is different than shifting an `unsigned int`, but it's not implementation dependant or undefined behaviour. What is not defined is shifting a negative amount of bits. You cannot shift less than zero bits or more than the operand size, in bits. – Luis Colorado Nov 14 '16 at 09:12
  • The expression in the first macro call is `... << ((s) - 1)` which means to left shift _the value expanded in the `s` parameter minus one_, and not the value `-1` in any case. I think the implementor is taking the precaution of not passing a value less than `1` which should make the right part negative. Indeed, doing a left shift of something less than `1` is nonsense, anyway. Who wants to left shift zero bits some value? – Luis Colorado Nov 14 '16 at 09:20
  • @LuisColorado, the standard disagrees with you: "If [the left operand] has a signed type and a negative value, the resulting value is implementation-defined" (C2011; 6.5.7/5). – John Bollinger Nov 17 '16 at 01:02
  • @JohnBollinger, read K&R and you'll get the opposite. I think even C98 also says something about that. – Luis Colorado Nov 17 '16 at 14:29
  • @LuisColorado, C99 and even C89/C90 have identical wording to C2011 in this regard. I doubt your claim about K&R, but that's irrelevant in any case. For more than a quarter century it has been the international standard, not K&R, that defines the C language. – John Bollinger Nov 20 '16 at 23:19

3 Answers3

3

Is the JPEG standard limited to Two's complement number representation ?

With the chart using SLL rather than *2, the flow chart is relying of 2's complement implementation.

C, OTOH, is not limited to 2's complement nor using shifts in a certain "2's complement" way. Better to code without that assumption. Using unsigned types is a good first step.

// ((-1)<<15) + 1
((-1u)<<15) + 1

Need to see applications of HUFF_EXTEND() for a deeper answer.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
1

Shifting negative values is only undefined behavior in the C language. On assembler level, such shifts are perfectly fine. A logical/arithmetic shift left instruction will move the MSB into the carry bit and that's it. The CPU will not halt and catch fire in undefined ways.

That being said, you can dodge UB in C by converting your signed number to unsigned. Then you will merely have implementation-defined behavior. For example, (int)(-1u<<1) does actually not invoke UB, even though such code looks quite questionable.

The safe approach is to perform all calculations on unsigned numbers (uint32_t) and convert to signed only when it is actually needed. Never perform any form of bitwise operations on signed types.

I would completely ignore compatibility with wildly exotic/fictional systems that do not use two's complement. Focus on compatibility with real-world mainstream computers.

Lundin
  • 195,001
  • 40
  • 254
  • 396
  • Nope, only shifting _negative displacements (or more than or equal to left operand size displacements) is undefined behaviour_. As left shifting is equivalent to a *by two multiplication*, the same is right shifting in C (change multiply by divide) for `signed`s. What actually happens is that `unsigned int` shifting **is different** than `signed int` value shifting. To conserve signedness and continue behaving as a divide by two operation, right shifting is impl. as _sign extend divide by two operation_ on `signed`s. `ASR`/`LSR` machine instruction is used with `signed`/`unsigned`s, resp. – Luis Colorado Nov 14 '16 at 09:35
  • @LuisColorado "right shifting is impl. as sign extend divide by two operation on signeds" Assuming this other comment is correct, that's not required by the standard, it's just a common (and legal) implementation-specific behaviour. https://stackoverflow.com/a/1857965/ – Max Barraclough May 05 '18 at 16:36
  • @MaxBarraclough, No, it's not required by the standard... but it isn't _Undefined Behaviour_, it is _Implementation Dependent Behaviour_, which actually is well defined behaviour, assuming you read the compiler documentation :) – Luis Colorado May 05 '18 at 19:02
1

Quite obviously, shifting a negative quantity has the same effect as multiply/divide by two raised to the power n (for a << n/a >> n where a is signed) This is never undefined behaviour for a negative left operand, and is well and preciselly defined for C/C++.

Well, despite of the fact that the accepted answer has already been selected and the amount of confussion about the << and >> operators this question has generated, I'll try to clarify what's undefined and what is not.

  • << operator behaves equally on signed and unsigned left operand quantities, as the net effect on an integer is to numerically double it, and the process is the same for signed and unsigned quantities, just insert a 0 on the right part of the number, left shifting all the number bits the number indicated as the right operator.

  • >> operator behaves differently on signed and unsigned left operands, as the net effect is, again, like integer division by two raised to the right operand, this means, for a two's complement representation to extend the most signifiant bit (0 on the left becomes 00 and 1 on the left becomes 11) to achieve the net effect of a division by two raised to the power of the right operand (-24 >> 3 becomes -3, as dividing by 2^3, and 24 >> 3 becomes 3) When using unsigned integers, this behaviour changes (0 becomes 00 and 1 becomes 01) making again the shifting like a division by two raised to the power of the right operator.

As the standards says, behaviour is undefined if you try to left/right shift a negative amount of bits (right operator must be > 0) or an amount greater or equal than the left operator's size in bits. But this only affects the right operand, and not the left one.

On trying to get an explanation about JPEG decoding, just try to think that JPEG decoding uses COSIN transformation (which is defined for real vectors) on discrete real approximations encoded as signed quantities and or that, it uses signed integers to evaluate low precision samples, so they never deal with unsigned quantities (the library designers assume that operating on integers is faster than operating on floating point quantities).

NOTE

  • -24 is binary 1111111...1111101000, and right shifting by 3 makes it 11111...11111101, or -3. If we shift it again, it becomes 111...1110 or -2 (-1.5 rounded to minus infinity), and again 1111...1111 or -1.
  • 24 is 00000000....00011000 and right shifting it becomes 000000...0000011 which is 3. If we shift it again, it becomes 000...0001 or 1 (1.5 rounded to minus infinity), and again we get 000...0000 or 0 (0.5 rounded to minus infinity).

(Beware that negative shifts made with this approach truncate up to minus infinity, making -1 >> 1 to become -1, as -0.5 truncates to -1 and not to 0.)

Community
  • 1
  • 1
Luis Colorado
  • 10,974
  • 1
  • 16
  • 31
  • "well and preciselly defined for C/C++" I don't think so. The standards don't require use of 2's complement. The C11 standard says that when doing bitwise operations on signed integer types, the best you can hope for is "values that depend on the internal representations of integers, and thus have implementation-defined and undefined aspects". https://wiki.sei.cmu.edu/confluence/display/c/INT13-C.+Use+bitwise+operators+only+on+unsigned+operands – Max Barraclough May 05 '18 at 11:43
  • @MaxBarraclough, indeed, the standard specifies that all these are implementation defined behaviours, and _must be perfectly defined_ in the compiler's documentation. The wording the standar uses for that is _implementation defined behaviour_, and it explains quite thoroughly what it means. – Luis Colorado May 05 '18 at 19:07