I don't have Keil handy, but that doesn't really matter. You didn't provide enough information (what is your target architecture/core?), and not all of this is well documented by ARM.
So, generic Thumb, using the GNU assembler:
.thumb
LDR R5, PAaddr
MOV R6, #0x55
STR R6, [R5]
.align
PAaddr: .word 0x400043FC
Disassembly of section .text:
00000000 <PAaddr-0x8>:
0: 4d01 ldr r5, [pc, #4] ; (8 <PAaddr>)
2: 2655 movs r6, #85 ; 0x55
4: 602e str r6, [r5, #0]
6: 46c0 nop ; (mov r8, r8)
00000008 <PAaddr>:
8: 400043fc .word 0x400043fc
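The ARM documentation describes the immediate for this instruction as: "The immediate offset added to the Align(PC, 4) value of the instruction to form the address. Permitted values are multiples of four in the range 0-1020 for encoding T1."
In Thumb state the PC reads as the address of the instruction plus 4, so here it is 0x00 + 4 = 0x04, and Align(0x04, 4) = 0x04. 0x08 - 0x04 = 4 = one word. So the offset is one word, and in the encoding 0x4D01 the 01 is the immediate.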
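The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the T1 encoding of LDR (literal), 0x4800 | Rt << 8 | imm8, under the assumption that the literal sits after the instruction; the name `encode_ldr_literal_t1` is made up for this example and is not part of any assembler.

```python
def encode_ldr_literal_t1(rt, insn_addr, literal_addr):
    """Return the 16-bit Thumb T1 encoding of LDR Rt, [PC, #imm] (hypothetical helper)."""
    pc = insn_addr + 4            # in Thumb state PC reads as instruction address + 4
    base = pc & ~3                # Align(PC, 4): round down to a multiple of 4
    offset = literal_addr - base  # byte offset from the aligned PC to the literal
    if not 0 <= rt <= 7:
        raise ValueError("T1 can only target r0-r7")
    if offset % 4 or not 0 <= offset <= 1020:
        raise ValueError("offset must be a multiple of 4 in the range 0-1020")
    return 0x4800 | (rt << 8) | (offset // 4)  # imm8 is the offset in words


# ldr r5 at address 0x00, literal at 0x08
print(hex(encode_ldr_literal_t1(5, 0x00, 0x08)))  # prints 0x4d01
```

Moving the instruction to 0x02 with the literal still at 0x08 gives the same encoding, because Align(0x06, 4) is still 0x04.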
.thumb
nop
LDR R5, PAaddr
MOV R6, #0x55
STR R6, [R5]
.align
PAaddr: .word 0x400043FC
00000000 <PAaddr-0x8>:
0: 46c0 nop ; (mov r8, r8)
2: 4d01 ldr r5, [pc, #4] ; (8 <PAaddr>)
4: 2655 movs r6, #85 ; 0x55
6: 602e str r6, [r5, #0]
00000008 <PAaddr>:
8: 400043fc .word 0x400043fc
The PC reads as 0x02 + 4 = 0x06, and Align(0x06, 4) = 0x04. 0x08 - 0x04 = 0x04, one word, so it is the same 0x4D01 encoding.
.cpu cortex-m3
.thumb
LDR R5, PAaddr
MOV R6, #0x55
STR R6, [R5]
.align
PAaddr: .word 0x400043FC
Disassembly of section .text:
00000000 <PAaddr-0x8>:
0: 4d01 ldr r5, [pc, #4] ; (8 <PAaddr>)
2: 2655 movs r6, #85 ; 0x55
4: 602e str r6, [r5, #0]
6: bf00 nop
00000008 <PAaddr>:
8: 400043fc .word 0x400043fc
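No change to the ldr from just selecting the core (only the padding nop is now encoded as 0xbf00). But with unified syntax a plain MOV no longer maps to the flag-setting 16-bit movs, so the assembler picks the 32-bit mov.w encoding: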
.cpu cortex-m3
.syntax unified
.thumb
LDR R5, PAaddr
MOV R6, #0x55
STR R6, [R5]
.align
PAaddr: .word 0x400043FC
Disassembly of section .text:
00000000 <PAaddr-0x8>:
0: 4d01 ldr r5, [pc, #4] ; (8 <PAaddr>)
2: f04f 0655 mov.w r6, #85 ; 0x55
6: 602e str r6, [r5, #0]
00000008 <PAaddr>:
8: 400043fc .word 0x400043fc
And with a nop in front, which pushes the literal out to 0x0C:
.cpu cortex-m3
.syntax unified
.thumb
nop
LDR R5, PAaddr
MOV R6, #0x55
STR R6, [R5]
.align
PAaddr: .word 0x400043FC
Disassembly of section .text:
00000000 <PAaddr-0xc>:
0: bf00 nop
2: 4d02 ldr r5, [pc, #8] ; (c <PAaddr>)
4: f04f 0655 mov.w r6, #85 ; 0x55
8: 602e str r6, [r5, #0]
a: bf00 nop
0000000c <PAaddr>:
c: 400043fc .word 0x400043fc
The PC reads as 0x02 + 4 = 0x06, and Align(0x06, 4) = 0x04. 0x0C - 0x04 = 0x08, two words, so the encoding is 0x4D02.
You can run the same experiments with Keil's assembler instead of the GNU assembler shown above.
It's not your job to count unless you are writing your own assembler (or trying to create your own machine code for some other reason).
In any case simply read the ARM architecture documentation for the architecture in question. Compare that to the output of a debugged assembler for further clarification as needed.
Edit
From the early/original ARM ARM:
address = (PC[31:2] << 2) + (immed_8 * 4)
Rd = Memory[address, 4]
This one makes more sense IMO.
When in doubt, go back to the old/original-ish ARM ARM.
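As a rough sketch, the original ARM ARM formula can be run in reverse to find which address a given encoding loads from. The helper name `ldr_literal_t1_address` is hypothetical, and it assumes the 16-bit encoding with the instruction's own address supplied by the caller.

```python
def ldr_literal_t1_address(halfword, insn_addr):
    """Address read by a 16-bit Thumb LDR Rd, [PC, #immed_8 * 4] (hypothetical helper)."""
    if halfword >> 11 != 0b01001:
        raise ValueError("not a 16-bit LDR (literal) encoding")
    immed_8 = halfword & 0xFF
    pc = insn_addr + 4                     # Thumb PC is the instruction address + 4
    return ((pc >> 2) << 2) + immed_8 * 4  # (PC[31:2] << 2) + (immed_8 * 4)


# 0x4d02 at address 0x02, as in the last listing
print(hex(ldr_literal_t1_address(0x4D02, 0x02)))  # prints 0xc
```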
Most(ish) recent ARM ARM
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(15);
    base = Align(PC,4);
    address = if add then (base + imm32) else (base - imm32);
    data = MemU[address,4];
    if t == 15 then
        if address<1:0> == '00' then LoadWritePC(data); else UNPREDICTABLE;
    elsif UnalignedSupport() || address<1:0> == '00' then
        R[t] = data;
    else // Can only apply before ARMv7
        if CurrentInstrSet() == InstrSet_ARM then
            R[t] = ROR(data, 8*UInt(address<1:0>));
        else
            R[t] = bits(32) UNKNOWN;
But that covers T1, T2 and A1 encodings in one shot, making it the most confusing.
In any case, they describe what is going on with the encoding as well as overall size of each of the instructions.
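Instruction size matters here because it moves the literal, as the mov.w listings show. A minimal sketch of the size rule, assuming a Thumb-2 capable core such as the Cortex-M3 used above (the helper name `thumb_insn_size` is made up for illustration):

```python
def thumb_insn_size(first_halfword):
    """Size in bytes of the Thumb instruction starting with this halfword (hypothetical helper)."""
    # Top five bits 0b11101, 0b11110 or 0b11111 mark the first half of a 32-bit encoding.
    return 4 if (first_halfword >> 11) in (0b11101, 0b11110, 0b11111) else 2


# Halfwords taken from the listings above: only 0xf04f (mov.w) starts a 32-bit instruction.
for hw in (0x4D01, 0x2655, 0xF04F, 0x602E, 0xBF00):
    print(hex(hw), thumb_insn_size(hw))
```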