
How do I know whether the bytes in a word represent a 16-bit instruction or a 32-bit instruction?
I referred to the ARMv7-M Architecture Reference Manual, and it is not clear to me how to tell whether an encoding is a 16-bit or a 32-bit instruction.
It says:
If bits [15:11] of the halfword being decoded take any of the following values, the halfword is the first halfword of a 32-bit instruction:
  • 0b11101
  • 0b11110
  • 0b11111
Otherwise, the halfword is a 16-bit instruction.

Does this mean that the processor always fetches halfwords, examines them, and then decides whether each is a 16-bit or a 32-bit instruction?
What does "the first halfword" mean? Bits [31:16] or bits [15:0] of a word?

If I have 32 bits, can I tell whether they hold a 32-bit instruction or a 16-bit instruction?

Thanks.

artless noise
Uchia Itachi

1 Answer


In Thumb, "32-bit" instructions are still composed of two separate halfwords, so the "first halfword" is the first halfword of the encoding, which says nothing about the layout in memory. Thumb instructions are halfword-aligned, so any given word of memory could hold two 16-bit instructions, a 16-bit instruction and one half of a 32-bit instruction, two halves of two different 32-bit instructions, or one whole 32-bit instruction.
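The possible contents of a word can be enumerated in a short sketch. It assumes the word was loaded little-endian from a word-aligned address, so bits [15:0] hold the halfword at the lower address, and it uses the bits [15:11] rule quoted in the question. The names `word_contents` and `classify_word` are made up for illustration. The caller has to supply whether the low halfword continues a 32-bit instruction that started in the previous word, because the word alone cannot tell you that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    TWO_16BIT,                   /* 16-bit insn, 16-bit insn                  */
    ONE_16BIT_THEN_FIRST_HALF,   /* 16-bit insn, first half of a 32-bit insn  */
    SECOND_HALF_THEN_16BIT,      /* second half of a 32-bit insn, 16-bit insn */
    SECOND_HALF_THEN_FIRST_HALF, /* halves of two different 32-bit insns      */
    ONE_WHOLE_32BIT              /* one complete 32-bit insn                  */
} word_contents;

/* True if bits [15:11] are 0b11101, 0b11110 or 0b11111. */
static bool starts_32bit(uint16_t hw)
{
    return (hw >> 11) >= 0x1D;
}

/* Assumes a little-endian word from a word-aligned address:
 * bits [15:0] are the halfword at the lower address. */
static word_contents classify_word(uint32_t word, bool low_is_continuation)
{
    uint16_t low  = (uint16_t)(word & 0xFFFFu);
    uint16_t high = (uint16_t)(word >> 16);

    if (low_is_continuation) {
        /* The low halfword belongs to an instruction from the previous word. */
        return starts_32bit(high) ? SECOND_HALF_THEN_FIRST_HALF
                                  : SECOND_HALF_THEN_16BIT;
    }
    if (starts_32bit(low)) {
        return ONE_WHOLE_32BIT;  /* the high halfword is its second half */
    }
    return starts_32bit(high) ? ONE_16BIT_THEN_FIRST_HALF : TWO_16BIT;
}
```

Note that the second halfword of a 32-bit encoding can itself carry one of the three bit patterns, which is why the extra flag is needed rather than a check on the high halfword alone.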

Conceptually, the processor decodes one halfword at a time, so if it sees one of the above bit patterns, it knows it must also decode the next halfword before it can actually execute the instruction. Reality complicates this somewhat, since the Cortex-M3/M4 only ever actually fetch whole 32-bit words from memory, so the correlation between the number of "instruction fetches" and the number of instructions actually decoded and executed depends on the code itself. Just imagine that those fetches are to refill a 4-byte buffer that the pipeline slurps individual halfwords out of (which may not be all that far off the truth, for all I know).

So, if you have a halfword containing one of those values in its top bits, then you know it's the first half of a 32-bit encoding, and you need to interpret it in conjunction with the next halfword. Conversely, if you have a halfword with any other value in its top bits, then it's either a 16-bit encoding, or the second half of a 32-bit encoding, depending on what the previous halfword was.
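As a minimal sketch, the check on the top five bits could be written in C like this. The function name is hypothetical, and the halfword is assumed to have already been read as a 16-bit value:

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if hw is the first halfword of a 32-bit Thumb encoding,
 * i.e. bits [15:11] are 0b11101, 0b11110 or 0b11111.
 * The function name is made up for this example. */
static bool thumb_is_first_half_of_32bit(uint16_t hw)
{
    unsigned top5 = hw >> 11;   /* bits [15:11] */
    return top5 == 0x1D || top5 == 0x1E || top5 == 0x1F;
}
```

A `false` result only means "not the start of a 32-bit encoding". Whether the halfword is a 16-bit instruction or the second half of a 32-bit one still depends on the halfword before it.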

Note that instructions are always little-endian, so the actual in-memory layout of a 32-bit encoding looks like this, where address A is an even number:

          --------------------------------
address A | bits 7:0 of first halfword   |
          --------------------------------
      A+1 | bits 15:8 of first halfword  |
          --------------------------------
      A+2 | bits 7:0 of second halfword  |
          --------------------------------
      A+3 | bits 15:8 of second halfword |
          --------------------------------
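Given that layout, walking a buffer of Thumb code byte by byte might look like the sketch below. It assumes the buffer starts on an instruction boundary and contains only code (no literal pools or other inline data), and all function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble one halfword from two bytes stored little-endian. */
static uint16_t read_halfword_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Length in bytes (2 or 4) of the instruction starting at p.
 * Only the first halfword decides this: bits [15:11] of
 * 0b11101, 0b11110 or 0b11111 mean a 32-bit encoding. */
static size_t thumb_insn_length(const uint8_t *p)
{
    uint16_t first = read_halfword_le(p);
    return ((first >> 11) >= 0x1D) ? 4 : 2;
}

/* Count the complete instructions in code[0..len).
 * Assumes code[0] is the start of an instruction. */
static size_t thumb_count_insns(const uint8_t *code, size_t len)
{
    size_t off = 0, count = 0;
    while (off + 2 <= len) {
        size_t n = thumb_insn_length(code + off);
        if (off + n > len) {
            break;              /* truncated 32-bit encoding at the end */
        }
        off += n;
        count++;
    }
    return count;
}
```

The second halfword of a 32-bit encoding is never inspected for length purposes. It is simply skipped, which is why starting the walk in the middle of an instruction can give wrong results.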
Notlikethat
  • What about Thumb-2? This mode has both 16-bit and 32-bit instructions. Does the same rule apply for telling whether an instruction is 16 or 32 bits long in Thumb-2? Also, which architectures have Thumb mode and which have Thumb-2? – Sam Protsenko Mar 04 '15 at 22:08
  • @Sam Thumb-2 _is_ 32-bit Thumb encodings [(along with other stuff)](http://stackoverflow.com/q/28669905/3156750), there is no "Thumb-2 mode". Note that not all architectures since ARMv6T2 support all Thumb-2 features when in Thumb state - e.g. v6M has some 32-bit encodings, but doesn't support conditional execution. – Notlikethat Mar 04 '15 at 22:27
  • Thanks for the information. In my case I know that the 32 bits I have definitely contain either a 16-bit or a 32-bit instruction. Then I would have to examine bits [31:16] of the word to know the encoding, correct? – Uchia Itachi Mar 05 '15 at 02:41
  • @UchiaItachi It depends on the _exact_ format of your word - e.g. if it does contain a 16-bit encoding, what are the _other_ 16 bits, and what order are they all in? I would say "look at whichever bits you'd expect to find the 16-bit encoding in if it was one", but I can already imagine certain cases where that wouldn't be right. – Notlikethat Mar 05 '15 at 22:35
  • Following is my understanding. I would examine bits [31:16] and check whether it's a 32-bit instruction. If it's not a 32-bit instruction, then bits [15:0] is a 16-bit instruction, and the other halfword could be either a 16-bit instruction or the lower halfword of a 32-bit instruction. But if it's 16-bit then I am done with it and I don't need to know what the other halfword is. If it's not, then I know it's a 32-bit instruction. – Uchia Itachi Mar 06 '15 at 02:11
  • I have one more question. When the processor is decoding halfwords and sees the above pattern for a 32-bit instruction, then the previous halfword is the other half of the 32-bit instruction, not the next halfword, right? So it would have to take the previous halfword. I am not sure, but this seems to contradict your answer, because of the little-endianness. – Uchia Itachi Mar 06 '15 at 02:17
  • In true ARM fashion the numbering is a bit confusing (Thumb and Thumb-2), but if you dig you find thumbv1, thumbv2, thumbv3, etc., which are better names, since Thumb-2 covers a whole array of different instruction sets (each building upon the previous one by adding more instructions). A couple of notes: as mentioned, the first 16-bit halfword is decoded to determine whether a second 16 bits is needed. Undefined thumbv1 instructions were used for this. Also note that bigger ARMs (not the Cortex-M) fetch something like 8 words at a time, so again, trying to count fetches is a different topic. – old_timer Mar 23 '15 at 20:41