
I am trying to understand how the SHL instruction sets the OF and CF flags. For the most part I understand how it works, but there is one particular example I can't wrap my mind around.

Consider the following case:

; RAX == 0x18F2086424981487
SHL     RAX, 0x22

After this the flag status is the following:

OF == 0, CF == 0

I'll explain my reasoning.

CF is always the last bit shifted out of the destination. In this case 0x18F2086424981487 << 0x21 has a most significant bit of 0, so the next shift pushes that 0 out and sets CF to 0. So far so good.
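As a rough check of this reasoning, here is a minimal Python sketch that models only the CF result of a 64-bit SHL. The helper name `shl64_cf` is hypothetical, and the sketch assumes the count is masked to its low 6 bits, as is done for 64-bit operands.

```python
def shl64_cf(value, count):
    """Model CF after a 64-bit SHL: the last bit shifted out of the destination."""
    count &= 0x3F  # assumption: 64-bit operand, so only the low 6 bits of the count are used
    if count == 0:
        return None  # a masked count of 0 leaves the flags unchanged
    # The last bit shifted out is bit (64 - count) of the original value.
    return (value >> (64 - count)) & 1


print(shl64_cf(0x18F2086424981487, 0x22))  # 0
```

For this input the last bit shifted out is bit 30 of the original value, which is 0, in line with the observed CF == 0.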

Regarding OF (from SHL reference):

The OF flag is affected only on 1-bit shifts. For left shifts, the OF flag is set to 0 if the most-significant bit of the result is the same as the CF flag (that is, the top two bits of the original operand were the same); otherwise, it is set to 1.

From this I understand that OF = 0 if CF and the most significant bit of the result are equal. But in this case CF == 0 and the most significant bit of the result (0x18F2086424981487 << 0x22) is 1, therefore OF should be 1.
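The 1-bit rule quoted from the reference can be sketched as follows. This is an illustrative Python model, not CPU code; `shl64_by_1` is a hypothetical helper, and it covers only the case the manual defines (a shift count of exactly 1).

```python
MASK64 = (1 << 64) - 1


def shl64_by_1(value):
    """Return (result, CF, OF) for a 64-bit SHL by exactly 1."""
    cf = (value >> 63) & 1            # the bit shifted out of the top
    result = (value << 1) & MASK64    # keep the result within 64 bits
    of = ((result >> 63) & 1) ^ cf    # OF = MSB(result) XOR CF
    return result, cf, of


# Top two bits of the operand differ (01...), so OF is set.
print(shl64_by_1(0x4000000000000000))  # (9223372036854775808, 0, 1)
# Top two bits of the operand are the same (11...), so OF is cleared.
print(shl64_by_1(0xC000000000000000))  # (9223372036854775808, 1, 0)
```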

Where have I gone wrong?

EDIT:

I have read that OF is only affected on 1-bit shifts, but how do you explain:

; RAX == 0xABECFF1297783575
SHL     RAX, 0x78

Outputting:

CF == 1, OF == 1

From my understanding, the SHL instruction is treated as a loop of value << 1 until the counter is exhausted, meaning that OF is set according to the last shift.
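To make that mental model concrete, here is a small Python sketch of the "loop of 1-bit shifts" idea applied to both examples. It models only the assumption described above (the name `shl64_loop_model` is made up for illustration) and does not describe how any CPU actually computes OF.

```python
MASK64 = (1 << 64) - 1


def shl64_loop_model(value, count):
    """Hypothetical model: repeat a 1-bit SHL (count & 0x3F) times, keeping the last step's flags."""
    cf = of = None
    for _ in range(count & 0x3F):
        cf = (value >> 63) & 1
        value = (value << 1) & MASK64
        of = ((value >> 63) & 1) ^ cf
    return value, cf, of


# (RAX, count, OF observed in the examples above)
cases = [(0x18F2086424981487, 0x22, 0), (0xABECFF1297783575, 0x78, 1)]
for rax, count, observed_of in cases:
    _, cf, of = shl64_loop_model(rax, count)
    print(f"CF={cf} model OF={of} observed OF={observed_of}")
# CF=0 model OF=1 observed OF=0
# CF=1 model OF=1 observed OF=1
```

Under this model both cases would give OF == 1, which matches the second observation but not the first.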

papadp
  • You cited it... *"The OF flag is affected only on 1-bit shifts."* :) – Ped7g Dec 28 '16 at 11:27
  • @Ped7g As is often the case in the manual, that statement alone is incomplete or ambiguous and can be incorrectly understood as if OF only changes when the shift count is 1. However, the pseudo code that follows the text explicitly says that OF is undefined when the shift count is larger than 1. – Alexey Frunze Dec 28 '16 at 11:29
  • @Ped7g Please see the edit – papadp Dec 28 '16 at 11:38
  • @AlexeyFrunze Please see the edit – papadp Dec 28 '16 at 11:38
  • The manual guarantees that x86 CPUs will set OF to either 0 or 1, but doesn't guarantee which when the shift count is > 1. You happened to find that it was 1 in this case on your CPU. AFAIK, there's nothing interesting to say beyond that. In some cases where the manual says "undefined", the actual behaviour is interesting and/or useful (e.g. [for BSR](http://stackoverflow.com/questions/41351564/vs-unexpected-optimization-behavior-with-bitscanreverse64-intrinsic)). But like I said, IDK what the actual OF behaviour is, or if it's useful for anything. – Peter Cordes Dec 28 '16 at 11:48
  • @papadp See the updated answer. – Alexey Frunze Dec 28 '16 at 11:52
  • *From my understanding ... meaning that OF will be set for the last shift.* That's contradicted by the manual in several places. The pseudocode Operation section, and the Flags Affected section, both explicitly state that it's undefined for counts > 1. – Peter Cordes Dec 28 '16 at 11:53
  • I wouldn't expect the shift to actually loop per bit for counts > 1; with 64 bits that would be far too slow an approach, so it's very likely taking some shortcuts. – Ped7g Dec 28 '16 at 11:53
  • @Ped7g: obviously it wouldn't really be implemented that way, but it could still give results that matched that logical behaviour (as-if rule and all that). That's one way to describe how CF works (the "last bit shifted out"), and I assumed that's what the OP meant. – Peter Cordes Dec 28 '16 at 11:56

1 Answer

OF is undefined if the shift count is larger than 1 (see the pseudo code). 0x22 > 1. Ditto for 0x78 (0x78 mod 0x40 > 1).
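The count arithmetic above can be sketched in Python as follows. The helper names are hypothetical; the sketch assumes the count is masked to 6 bits for 64-bit operands and to 5 bits otherwise, and treats OF as defined only when the masked count is exactly 1.

```python
def effective_count(count, operand_bits=64):
    """Shift count actually used: low 6 bits for 64-bit operands, low 5 bits otherwise."""
    mask = 0x3F if operand_bits == 64 else 0x1F
    return count & mask


def of_is_defined(count, operand_bits=64):
    """OF has a documented value only when the masked count is exactly 1."""
    return effective_count(count, operand_bits) == 1


print(hex(effective_count(0x22)), of_is_defined(0x22))  # 0x22 False
print(hex(effective_count(0x78)), of_is_defined(0x78))  # 0x38 False
print(hex(effective_count(0x41)), of_is_defined(0x41))  # 0x1 True
```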

UPD:

Please try to understand that undefined means that OF may become anything (stay unchanged or change).

If you imply that the CPU does not set OF to a random value when the shift count is larger than 1 then you're absolutely right. It does not. However, the actual formula for OF in this case is not documented. You can deduce it from experimentation, but it's not something you can rely on. As a matter of fact, I have done this experiment before and found 3 different formulas for different CPUs (AMD, Intel with hyperthreading, Intel without hyperthreading, AFAIR). What do you do with this?

Alexey Frunze