How does the value of the program counter increment?

Question

I have a block of instructions for which I want to understand how the pc register works. Wikipedia says the pc register holds the address of the next instruction to be executed. However, looking at Binary Ninja's disassembly graph for this block, that does not seem to be entirely true.

Here is part of the disassembly graph from Binary Ninja. Next to each memory load I have written the address in memory from which the load happens.

000080ec         ldr r3, [pc, #76] -> 0x813c = 0x80f0 + 0x4c -> pc = 80f0 ?? (shouldnt it be 80ee).
000080ee         cmp r3, #0
000080f0         it eq
000080f2         ldreq r3, [pc, #68] -> 0x8138 = 0x80f4 + 0x44 -> pc = 80f4 (this makes sense).
000080f4         mov sp, r3
000080f6         sub.w sl, r3, #65536

This also happens further down in the code: the pc does not always hold the address of the next instruction to be executed. Is there something I should account for?

If the structure in memory is larger than 1 byte or only the addresses of some structures are known, possibly only the starting address of the structure + an offset is displayed. Can you identify the 0x4c and 0x44 in the binary code and modify them to see how the displayed address changes? — Sebastian, Dec 30 '21 at 12:26
With your used ARM processor the value of PC depends on the ARM/Thumb state and on the specific instruction: https://developer.arm.com/documentation/dui0473/c/Cacdbfji — Sebastian, Dec 30 '21 at 12:30
`pc`, when used in addressing, always points 2 instructions ahead. This is more complicated in an `IT` block. Also note that according to documentation _"any explicit reference to R15 (the PC) in the IT block is deprecated."_ — Jester, Dec 30 '21 at 12:41
@Jester I could find other cases, outside an IT block, where pc does not point 2 instructions ahead. Case 1: `0x80ba ldr r1, [pc, #0x20] {data_80dc} 0x80bc ldr r0, [pc, #0x20] {data_80e0} 0x80be nop.w`. Case 2: `0x8112 ldr r3, [pc, #0x30] {data_8144} 0x8114 cmp r3, #0 0x8116 beq #0x811a`. If the instruction 2 ahead is a no-op or a branch, the pc points directly at the next instruction. But why? — hany erfan, Dec 30 '21 at 13:26
Have you seen that sometimes bit 1 of PC is cleared, mentioned in the posted link? — Sebastian, Dec 30 '21 at 13:32
@Sebastian I don't see how the information in the link explains the difference in the pc calculation between the two load instructions here: `0x8100 ldr r0, [pc, #0x44] {data_8148} 0x8102 ldr r2, [pc, #0x48] {data_814c} 0x8104 subs r2, r2, r0 {0x478} 0x8106 bl #memset`. For the first ldr the pc points to address 0x8104, which makes sense. However, for the second ldr the pc points to 0x8104 again, even though both instructions look exactly the same. The only difference is that the branch instruction happens to be 2 instructions ahead of the second ldr. — hany erfan, Dec 30 '21 at 14:45
Clearing bit 1 of PC: 0x8100 & ~2 = 0x8100; 0x8102 & ~2 = 0x8100. 4 bytes are added afterwards according to the link — Sebastian, Dec 30 '21 at 14:54
@Sebastian..thanks now it is clear and i have a formula for calculating the pc that works every time for my case :) — hany erfan, Dec 30 '21 at 15:20
It may be your assembler generating the `it` so you might be confused as to what the actual address of the `ldreq` is, especially if your disassembler also hides the `it`. — Jester, Dec 30 '21 at 15:29
@Jester The disassembler does not hide the `it` instruction. You can see in OPs post that it is clearly there and each instruction takes up as much space as you'd expect. — fuz, Dec 30 '21 at 15:44
*the pc register holds the value of the next instruction to be executed* - That's not how ARM works. It's 2 instructions ahead in ARM mode, or just an arbitrary +4 and word-aligned in Thumb Mode. [Understanding Cortex-M assembly LDR with pc offset](https://stackoverflow.com/a/70492270). I hope Wikipedia doesn't say that in anything specific to ARM; that would be misinformation. — Peter Cordes, Dec 30 '21 at 22:10

fuz · Accepted Answer · 2021-12-30T15:46:05.590

The key thing you are missing is that the value of PC in the Thumb ldr Rd, [Pc, #imm] instruction is aligned to 4 bytes before being used. The slightly abridged pseudo code from the ARMv7 Architecture Reference Manual is:

t = UInt(Rt);
imm32 = ZeroExtend(imm8:'00', 32);
base = Align(PC,4);
address = base + imm32;
data = MemU[address,4];
R[t] = data;

So to come back to your example:

000080ec         ldr r3, [pc, #76]

We know that PC reads as the current address plus 4 bytes in Thumb mode, so PC reads as 0x80f0. This value is already aligned to 4 bytes, so base has the same value. To this we add 76 (the immediate is always a multiple of four with the two least significant bits not stored) getting 0x813c.

For the second example:

000080f2         ldreq r3, [pc, #68]

This is the same instruction as the ldr above. The disassembler adds an eq suffix to the mnemonic as the instruction is subject to conditional execution by the preceding IT block. This does not affect the instruction encoding in any way though. PC reads as 0x80f6 which is aligned to 4 bytes as 0x80f4. To this we add 68, obtaining 0x8138 as the address to load from.
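The two calculations above can be checked with a few lines of Python. This is only a sketch of the pseudocode's address computation; the function name `thumb_ldr_literal_address` is made up for illustration and is not part of any tool or library.

```python
def thumb_ldr_literal_address(instr_addr: int, offset: int) -> int:
    """Address read by a Thumb `ldr Rt, [pc, #offset]` located at instr_addr.

    Hypothetical helper mirroring the abridged pseudocode above.
    """
    pc = instr_addr + 4      # in Thumb state PC reads as instruction address + 4
    base = pc & ~0x3         # Align(PC, 4): clear the two least significant bits
    return (base + offset) & 0xFFFFFFFF


# The two loads from the question
print(hex(thumb_ldr_literal_address(0x80EC, 76)))  # 0x813c
print(hex(thumb_ldr_literal_address(0x80F2, 68)))  # 0x8138
```

The same function also reproduces the other cases mentioned in the comments, for example the two adjacent loads at 0x8100 and 0x8102, which both use 0x8104 as the base.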

For further information, refer to the ARM Architecture Reference Manual.

old_timer · Answer 2 · 2021-12-30T14:52:57.893

The Thumb encoding of the PC-relative ldr instruction was covered recently here on SO. When you look at the instruction set documentation you will see, as we have pointed out, that the PC, from a documentation perspective, was "two ahead" in the early pre-Thumb-2 days, but for Thumb it is now 4 bytes ahead of the instruction address. The PC offset is encoded in units of words, so the address being used is

((instruction address + 4 ) & 0xFFFFFFFC) + (immed<<2)

removing all confusion about the two ahead thing.
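To see where `immed` comes from, here is a rough Python sketch that pulls it out of the 16-bit instruction and applies the formula. It assumes the 16-bit Thumb LDR (literal) layout of `01001`, a 3-bit destination register, then an 8-bit immediate. The function name and the halfword `0x4B13` are illustrative: that value is what this layout would give for `ldr r3, [pc, #76]`, and it was not read from the question's binary.

```python
def decode_ldr_literal_16(halfword: int, instr_addr: int):
    """Decode a 16-bit Thumb LDR (literal); return (rt, load_address).

    Hypothetical helper; assumes the layout 01001 | Rt(3 bits) | imm8.
    """
    if (halfword >> 11) != 0b01001:
        raise ValueError("not a 16-bit LDR (literal) encoding")
    rt = (halfword >> 8) & 0x7       # destination register r0..r7
    immed = halfword & 0xFF          # offset in words, not bytes
    address = ((instr_addr + 4) & 0xFFFFFFFC) + (immed << 2)
    return rt, address


rt, address = decode_ldr_literal_16(0x4B13, 0x80EC)
print(rt, hex(address))  # 3 0x813c
```

Changing the low byte of the halfword by one moves the load address by four bytes, which is one way to confirm that the offset is stored in words.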

The reality is that there are multiple program counters. The days of a single program counter used both to actually fetch things and to do PC-relative addressing are part of history, in older, simpler architectures.

This "two ahead" thing is part of that past but, for compatibility reasons, has carried on from Acorn to the present ARM products, just as x86 and others have legacy things that no longer are what they say they are (the branch shadow/delay slot in MIPS).

The pipe is different, and one would assume that for every different ARM product (not architecture but product: Cortex-M0, Cortex-M4, Cortex-A7, etc.) the pipe implementation and how the core keeps track of things varies. The "two ahead" value is synthesized by some form of program counter keeping track of the instructions in the pipe. Likewise, fetch, prefetch and branch prediction all use forms of a program counter, but they are not assumed to be a single program counter. r15 itself is also either real (from the register file), fake, or both (I would expect it not to be in the register file; why burn those cycles for no added value?).

Just as in software you could have a reg[15] array item plus pc_fetch, pc_current_inst, pc_execution, pc_possible_branch and pc_branch_prediction variables to keep track of a simulated processor, the logic can too. Which one is used at what time depends on what you are doing. What we as programmers think of as the PC, as described in the operation of an instruction, is an address that is "two ahead" of the address where the instruction lives. With Thumb-2, which mixes 16-bit and 32-bit instructions, "two ahead" no longer makes sense, so for Thumb mode it is 4 bytes ahead and for ARM mode 8 bytes ahead of the instruction address. You then follow the documentation to understand how that PC is used during execution of the instruction.
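As a toy illustration of that idea, a simulator might keep only the address of the instruction being executed and synthesize the architectural PC whenever an instruction reads r15. The class and field names below are invented for this sketch and do not describe how any real core or emulator is built.

```python
from dataclasses import dataclass


@dataclass
class ToyCore:
    """Hypothetical simulator state; not modelled on any real core."""
    pc_current_inst: int   # address of the instruction being executed
    thumb: bool            # True in Thumb state, False in ARM state

    def read_r15(self) -> int:
        # The PC value an instruction sees is derived, not stored:
        # +4 in Thumb state, +8 in ARM state.
        ahead = 4 if self.thumb else 8
        return (self.pc_current_inst + ahead) & 0xFFFFFFFF


print(hex(ToyCore(0x80EC, thumb=True).read_r15()))   # 0x80f0
print(hex(ToyCore(0x8000, thumb=False).read_r15()))  # 0x8008
```

A fuller model would carry separate fetch and prediction addresses as well; this sketch covers only the value visible to the executing instruction.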

For BX and other instructions capable of mode switching, the definition of the address that becomes the "program counter" is different: the lsbit selects the mode to switch into (it is stripped off by the branch and does not live in the program counter; there is a PSR bit to take care of that). These addresses are also a form of program counter that, temporarily at least, holds the actual address of the instruction to branch to and not "two ahead".
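A simplified sketch of that split, with a hypothetical function name: the low bit of the register value selects the state and the remaining bits become the branch address. It deliberately ignores the alignment rules for ARM-state targets and the fact that cores which only support Thumb reject a target with the low bit clear.

```python
def bx_target(value: int):
    """Split a BX operand into (branch_address, thumb_state).

    Simplified, hypothetical model of the lsbit handling described above.
    """
    thumb = bool(value & 1)                # lsbit selects Thumb (1) or ARM (0)
    address = value & ~1 & 0xFFFFFFFF      # the lsbit is not part of the address
    return address, thumb


print(bx_target(0x8101))  # (33024, True), i.e. address 0x8100 in Thumb state
```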

In a lot of early processor implementations, where you had one program counter (or the idea of one), you fetched, decoded and executed one instruction at a time before going on to the next. That does not mean people no longer do those designs: you can make small and efficient little controllers the old-fashioned way, people still do, and they are in products we use. In that case the pc is used to fetch the instruction, which may be more than one byte, and once the instruction is completely fetched the program counter points, at least for the moment, to the next instruction. The execution of that instruction can now begin since fetch and decode have completed. If the program counter is used as an input to that instruction, it is pointing to the next instruction; if it is used as a destination in a jump or branch, it is modified, and after completion the next fetch happens wherever it happens to be pointing.

Many of these architectures had variable-length instruction sets, so one instruction may be one, two, three... bytes long, and the pc address relative to the instruction address at execution time varied. The early ARM comes from a pipeline-type solution with fixed-size instructions. If you had a single program counter and a textbook-style pipe design, then execution is at a fixed depth in the pipe, meaning the program counter is fetching that many instructions ahead when you execute.
