
I am a little confused by the use of the colon in x86 assembly. I know that in real mode %gs:0x14 refers to the address obtained by shifting %gs left by 4 bits and adding 0x14. But is it the same in protected mode? For example, in protected mode:

movl %gs:0x14, %eax

How is %gs:0x14 accessed? Is it like 0x14(%gs), or is it the same as in real mode?

Update: to make my question clearer, assume %gs = 0x1234. What is the value of %eax after the instruction movl %gs:0x14, %eax?

Further information:

I just found this document, which is useful for understanding the role of gs and fs on different systems: http://www.akkadia.org/drepper/tls.pdf

And this link provides information about segment:offset addressing:

http://thestarman.pcministry.com/asm/debug/Segments.html

AuA
  • Possible duplicate of [What does the colon : mean in x86 assembly GAS syntax as in %ds:(%bx)?](http://stackoverflow.com/questions/18736663/what-does-the-colon-mean-in-x86-assembly-gas-syntax-as-in-dsbx) – Ciro Santilli OurBigBook.com Nov 07 '15 at 09:12

2 Answers


You need to carefully read the application binary interface specification for your architecture (probably x86-64), i.e. the x86-64 ABI.

You'll find out that %gs is related to thread-local storage (TLS). See this answer.

So your machine instruction is probably loading the 32-bit word at offset 0x14 of the current thread's TLS area.

Basile Starynkevitch
  • There is no explanation of this issue in the document. I understand that gs is used as an interface for thread-local storage, but I just don't know how memory is accessed through %gs:0x14. – AuA Dec 21 '13 at 12:33

First, let's deal with the terms. It seems you're using "protected mode" in the general sense, as opposed to real mode. But, at least in the Intel manuals, this term applies only to 32-bit mode. For 64-bit mode they use the poor marketing term "IA-32e mode", which is horrible compared to AMD's "long mode", but both still hide the fact that 64-bit mode is also a protected one.

This difference is important because %gs is handled differently in 32-bit and 64-bit protected mode. In 32-bit mode it's just another segment register. The thread-switching code loads it with a segment whose base points to the current thread's area in the same virtual address space, so, unlike {CS,DS,ES,SS}, its base isn't zero in a flat memory model. In 64-bit mode, the GS base is simply an address kept in a processor MSR, which the scheduler also changes to the current thread's TLS address. (Details differ between Linux/*BSD/Windows/etc. as to which of %fs and %gs is used for which role.) But, as a common result, when you see an access like %gs:0x14 you should realize that

  • the GS base address is obtained (using, as explained above, the generic segment mechanism in a 32-bit environment and special MSR-based handling in a 64-bit environment)
  • 0x14 is added to this address

and that's all you need to know unless you develop a kernel or other deeply system-level software such as Wine.
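The two steps above, and the contrast with real mode, can be sketched as a small Python model. This is only an illustration under simplifying assumptions: the function names are invented for this example, the descriptor table contents and base addresses are made up, and limit checks, privilege checks and the hidden descriptor cache of real hardware are all ignored.

```python
# Simplified model of how a segment:offset operand such as %gs:0x14 is resolved.
# All table contents and base addresses are hypothetical values for illustration.

def real_mode_address(segment: int, offset: int) -> int:
    """Real mode: the segment register value itself is shifted left 4 bits."""
    return (segment << 4) + offset

def decode_selector(selector: int) -> dict:
    """Protected mode: the register holds a selector, not an address."""
    return {
        "index": selector >> 3,      # entry number in the descriptor table
        "ti": (selector >> 2) & 1,   # 0 = GDT, 1 = LDT
        "rpl": selector & 3,         # requested privilege level
    }

def protected_mode_address(selector: int, offset: int, gdt: dict, ldt: dict) -> int:
    """32-bit protected mode: the base comes from the selected descriptor.

    Real CPUs read the descriptor when the segment register is loaded and
    cache the base; here the table maps index -> base directly.
    """
    fields = decode_selector(selector)
    table = ldt if fields["ti"] else gdt
    return table[fields["index"]] + offset

def long_mode_address(gs_base_msr: int, offset: int) -> int:
    """64-bit mode: the GS base is taken from an MSR set by the OS."""
    return gs_base_msr + offset

if __name__ == "__main__":
    # Hypothetical LDT entry 0x246 with a made-up base address.
    ldt = {0x246: 0xB7F00000}
    print(hex(real_mode_address(0x1234, 0x14)))
    print(decode_selector(0x1234))
    print(hex(protected_mode_address(0x1234, 0x14, gdt={}, ldt=ldt)))
    print(hex(long_mode_address(0x7F0000001000, 0x14)))
```

With %gs = 0x1234 from the question, the real-mode result depends only on the register value, while in protected mode the value loaded into %eax depends on whatever base the operating system placed in the descriptor that selector 0x1234 refers to, so it cannot be computed from the register value alone.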

Netch
  • Great answer. But isn't Intel's 64-bit architecture called IA-64? And AMD64/x86-64 supports real, protected and long mode; protected mode is only used on 32-bit systems. – AuA Dec 21 '13 at 14:35
  • @AuA IA-64 is a totally different architecture, without any %gs. When Intel realized IA-64 was failing, they stole and rebranded AMD64 as IA-32e, EM64T, and finally "Intel 64". Say thanks to Intel for this mess. Long mode (IA-32e mode in Intel terms) is protected in the sense that it has page-level protection at the virtual memory level, privileged instructions, etc., but segmentation is disabled, even though it still has to be configured; and it has 32- and 64-bit submodes, but the 32-bit submode differs from the old protected mode. Yep, all this is a real headache for system programmers. – Netch Dec 21 '13 at 15:12