The next instruction to be executed is the one at the memory address:
16 * CS + IP
This allows 20 bits of memory to be addressed, despite registers being only 16 bits wide (and it also means that most addresses can be encoded by many distinct CS:IP pairs).
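The arithmetic can be checked with a small Python model. This is only a sketch: the helper names `physical_address` and `aliases` are made up for illustration, and the model does not cover what the hardware does when the sum needs more than 20 bits.

```python
def physical_address(cs, ip):
    """Real-mode address of the next instruction: 16 * CS + IP."""
    assert 0 <= cs <= 0xFFFF and 0 <= ip <= 0xFFFF, "registers are 16 bits wide"
    return 16 * cs + ip


def aliases(address):
    """Return every (cs, ip) pair that encodes the given address."""
    pairs = []
    for cs in range(0x10000):
        ip = address - 16 * cs
        if 0 <= ip <= 0xFFFF:
            pairs.append((cs, ip))
    return pairs


# Two different CS:IP pairs that name the same byte of memory.
print(hex(physical_address(0x07C0, 0x0000)))
print(hex(physical_address(0x0000, 0x7C00)))

# Number of distinct CS:IP pairs that reach that byte.
print(len(aliases(0x7C00)))
```

Both `physical_address` calls print the same value, which shows that one address has more than one encoding.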
The effect of CS is analogous to that of the other segment registers. E.g., DS offsets data accesses (that don't specify another segment register) by 16 * DS.
CS
The instructions that modify CS are:
- ljmp (far jump)
- lcall (far call), which pushes CS and IP to the stack, and then far jumps
- lret (far return), which reverses the far call
- int, which reads IP / CS from the Interrupt Vector Table
- iret, which reverses an int
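To make the Interrupt Vector Table lookup concrete, here is a minimal Python sketch. It assumes the table sits at address 0 (its usual real-mode location), with 4 bytes per entry: IP first, then CS, both little-endian. The function name `ivt_entry` and the memory contents are hypothetical.

```python
import struct


def ivt_entry(memory, n):
    """Return the (cs, ip) pair that `int n` would load in this model.

    Entry n occupies 4 bytes at address 4 * n: IP first, then CS.
    """
    ip, cs = struct.unpack_from("<HH", memory, 4 * n)
    return cs, ip


# Hypothetical memory image: only the entry for interrupt 0x10 is filled in.
memory = bytearray(1024)
struct.pack_into("<HH", memory, 4 * 0x10, 0x1234, 0xF000)  # IP, then CS

cs, ip = ivt_entry(memory, 0x10)
# The handler then starts at 16 * CS + IP, like any other instruction.
print(hex(cs), hex(ip), hex(16 * cs + ip))
```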
CS cannot be modified by mov like the other segment registers. GNU GAS 2.24 encodes it with the standard identifier for CS without complaining if you write:
mov %ax, %cs
but executing that instruction leads to an invalid opcode exception.
To observe the effect of CS, try adding the following to a boot sector and running it in QEMU as explained at: https://stackoverflow.com/a/32483545/895245
/* $1 is the new CS, $after1 the new IP. */
ljmp $1, $after1
after1:
/* Skip 16 bytes: with CS == 1 the jump lands at 16 * 1 + after1. */
.skip 0x10
mov %cs, %ax
/* cs == 1 */
ljmp $2, $after2
after2:
.skip 0x20
mov %cs, %ax
/* cs == 2 */
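The sizes passed to `.skip` follow from the address formula. The Python sketch below works through them under two assumptions: the labels are linked at the addresses where the code is loaded, and the label values used (`0x7C05`, `0x7C20`) are hypothetical placeholders. The helper `landing_address` is likewise made up for illustration.

```python
def landing_address(new_cs, label):
    """Address reached by `ljmp $new_cs, $label`: 16 * new_cs + label."""
    return 16 * new_cs + label


# Hypothetical label values: the real ones depend on where the code sits.
after1 = 0x7C05
after2 = 0x7C20

# With CS == 1 execution resumes 0x10 bytes past the label,
# which is the gap that `.skip 0x10` pads over.
print(hex(landing_address(1, after1) - after1))

# With CS == 2 the gap doubles, hence `.skip 0x20`.
print(hex(landing_address(2, after2) - after2))
```

The gap depends only on the new CS value, not on the label, so the same padding works wherever the snippet ends up in the boot sector.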
IP
Whenever an instruction is executed, IP automatically increases by the length of that instruction's encoding: this is why the program moves forward!
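As a rough illustration, the following Python sketch steps IP through a short straight-line program. The program, the starting value `0x7C00`, and the helper `trace_ip` are assumptions made for the example. The lengths match the usual encodings of these instructions (`8C C8`, `90`, `F4`).

```python
# (mnemonic, encoded length in bytes) for a hypothetical straight-line program.
program = [
    ("mov %cs, %ax", 2),  # 8C C8
    ("nop", 1),           # 90
    ("hlt", 1),           # F4
]


def trace_ip(start_ip, program):
    """Return the IP value at which each instruction is fetched."""
    ip = start_ip
    trace = []
    for mnemonic, length in program:
        trace.append((ip, mnemonic))
        ip += length  # IP advances by the length of the encoding
    return trace


for ip, mnemonic in trace_ip(0x7C00, program):
    print(hex(ip), mnemonic)
```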
IP is modified by the same instructions that modify CS, and by the non-far versions of those instructions as well (the more common case), e.g. jmp, call and ret.
IP cannot be read directly, so it is harder to play with. Check this question for alternatives:
Reading Program Counter directly