Can anyone explain near, far, and huge pointers to me with a suitable example ... and when are these pointers used?
- smells like homework – Spence Aug 26 '10 at 13:36
- in particular for the very teen-agey "any1" – Stefano Borini Aug 26 '10 at 13:38
- I remember those in Turbo-C, ages ago... – 0x6adb015 Aug 26 '10 at 13:40
- They're only relevant on 16-bit Intel platforms, which are obsolete. I pity you if you need to maintain code or write new code on this platform. – Philip Potter Aug 26 '10 at 13:40
- "What is near, far, and [has] huge pointers?" That's a hard riddle, I'd say ;) – Piskvor left the building Aug 26 '10 at 13:42
- http://www.amolbagde.110mb.com/cracktheinterview/pages/6_16.html – DumbCoder Aug 26 '10 at 13:50
- If this is a homework question, then that school *really* needs to get new computers for its CS classroom. – dan04 Aug 26 '10 at 16:31
- "They're only relevant on 16-bit Intel platforms" – erm, what about embedded platforms, DSPs etc.? Can't believe everybody here is saying this is old and obsolete; please do some research first. – stijn Oct 08 '10 at 09:13
6 Answers
The primary example is the Intel x86 architecture.
The Intel 8086 was, internally, a 16-bit processor: all of its registers were 16 bits wide. However, the address bus was 20 bits wide (1 MiB). This meant that you couldn't hold an entire address in a register, limiting you to the first 64 KiB.
Intel's solution was to create 16-bit "segment registers" whose contents would be shifted left four bits and added to the address. For example:
DS ("Data Segment") register: 1234 h
DX ("D eXtended") register: + 5678h
------
Actual address read: 179B8h
This created the concept of a 64 KiB segment. Thus a "near" pointer would just be the contents of the DX register (5678h), and would be invalid unless the DS register was already set correctly, while a "far" pointer was 32 bits (12345678h, DS followed by DX) and would always work (but was slower, since you had to load two registers and then restore the DS register when done).
(As supercat notes below, an offset to DX that overflowed would "roll over" before being added to DS to get the final address. This allowed 16-bit offsets to access any address in the 64 KiB segment, not just the part that was ±32 KiB from where DX pointed, as is done in other architectures with 16-bit relative offset addressing in some instructions.)
However, note that you could have two "far" pointers that are different values but point to the same address. For example, the far pointer 100079B8h points to the same place as 12345678h. Thus, pointer comparison on far pointers was an invalid operation: the pointers could differ, but still point to the same place.
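To make the aliasing concrete, here is a minimal sketch in ordinary portable C (not 16-bit code) that just performs the segment arithmetic described above, using the same numbers; phys() is an illustrative helper, not anything a real compiler provides:

#include <stdio.h>

/* Illustrative helper: real-mode physical address = segment * 16 + offset
   (a 20-bit result on the 8086). */
static unsigned long phys(unsigned int seg, unsigned int off)
{
    return ((unsigned long)seg << 4) + (unsigned long)off;
}

int main(void)
{
    /* Two different far pointer values, one physical byte. */
    printf("1234:5678 -> %05lXh\n", phys(0x1234, 0x5678));  /* 179B8h */
    printf("1000:79B8 -> %05lXh\n", phys(0x1000, 0x79B8));  /* 179B8h again */
    return 0;
}

Both lines print 179B8h, which is exactly why comparing far pointers by their raw 32-bit values is unreliable.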
This was where I decided that Macs (with Motorola 68000 processors at the time) weren't so bad after all, so I missed out on huge pointers. IIRC, they were just far pointers that guaranteed that all the overlapping bits in the segment registers were 0's, as in the second example.
Motorola didn't have this problem with their 6800 series of processors, since they were limited to 64 KiB. When they created the 68000 architecture, they went straight to 32-bit registers, and thus never had need for near, far, or huge pointers. (Instead, their problem was that only the bottom 24 bits of the address actually mattered, so some programmers (notoriously Apple) would use the high 8 bits as "pointer flags", causing problems when address buses expanded to 32 bits (4 GiB).)
Linus Torvalds just held out until the 80386, which offered a "protected mode" where the addresses were 32 bits, and the segment registers were the high half of the address, and no addition was needed, and wrote Linux from the outset to use protected mode only, no weird segment stuff, and that's why you don't have near and far pointer support in Linux (and why no company designing a new architecture will ever go back to them if they want Linux support). And they ate Robin's minstrels, and there was much rejoicing. (Yay...)

- Interestingly, many 68000-based programs ended up with 32K limitations in places where 8088 software would have had 64K limitations. Given that memory was often scarce, and that the 8088 could get by with 16-bit quantities in places where the 68000 would require either using 32-bit quantities or accepting 32K limits, the 8088 was actually a remarkably practical design. – supercat Dec 01 '17 at 23:47
- The "32k limit" for the 68k was when you used a 16-bit relative offset for branches and jumps. If you make no attempt to order the functions in the code segment to try to keep things together, you'd just give up and slap a 32k limit on everything when you could well have had a 64k limit. It's still less of a disaster than the Intel system where different-valued pointers could actually be equal. Again, it's all history now, and even embedded systems have no problem doing 32-bit addressing right these days. – Mike DeSimone Dec 05 '17 at 16:12
- On the classic Macintosh, many data structures were limited to 32K because the OS opted to use 16-bit types for many purposes rather than 32-bit types; I would expect the same would be true with a lot of memory-limited software for the 68000 platform (especially the 16-bit-bus variants). The 68000's addr+disp16 instructions all sign-extend the displacement to allow both positive and negative displacement. The 8088, however, can allow a 16-bit displacement to reach +/-65535 bytes within a single object, rather than +/-32767 bytes, if the offset fits within a segment. – supercat Dec 05 '17 at 18:25
- The classic Mac limits trace back to the insane things they did to pack things into 128K total RAM, such as use the (then-unused) upper 8 bits in system-owned pointers as flags, and the fact that `integer` in Pascal was 2 bytes. (On the flip side, Pascal strings gave you O(1) string length computation, the ability to have nulls in strings, and a near total absence of buffer overflow errors.) The x86 series let offsets "wrap around" in a segment, but at the cost of sticking you with segment registers and near and far pointers. – Mike DeSimone Dec 06 '17 at 04:24
- Memory isn't free, and even today a lot of applications will run faster when compiled for 32-bit x86 than when compiled for x64, despite the fact that 64-bit mode has a larger register set. The only significant performance downside I can see for 64-bit mode is that 64-bit object references gobble twice as much cache as 32-bit ones. The 8086 really needs a couple more segment registers, but even then I'd say it does a better job of addressing 1MiB of address space than any other 16-bit architecture before or since (the M68K is a 32-bit architecture). – supercat Dec 06 '17 at 15:26
- "where the addresses were 32 bits, and the segment registers were the high half of the address" – This is incorrect. 32-bit addresses are used with 32-bit offsets held in 32-bit registers. These are technically "near" pointers because the selector is still used; it just points to a flat 4 GiB descriptor usually. The selector registers do not hold "the high half of the address". – ecm Oct 25 '19 at 11:01
Difference between far and huge pointers:
By default, pointers are near; for example, int *p is a near pointer. The size of a near pointer is 2 bytes on a 16-bit compiler (and, as always, sizes vary from compiler to compiler). A near pointer stores only the offset of the address it references, and an address consisting of only an offset has a range of 0 - 64K bytes within the current segment.
Far and huge pointers:
Far and huge pointers have a size of 4 bytes. They store both the segment and the offset of the address the pointer is referencing. Then what is the difference between them?
Limitation of far pointers:
We cannot change or modify the segment of a given far address by applying arithmetic operations to it; that is, using arithmetic operators we cannot jump from one segment to another.
If you increment a far address beyond the maximum value of its offset, the offset simply wraps around in cyclic order instead of the segment being incremented. This is also called wrapping: if the offset is 0xffff and we add 1, it becomes 0x0000; similarly, if we decrement 0x0000 by 1, it becomes 0xffff; and remember, there is no change in the segment.
Now let us compare huge and far pointers:
1. When a far pointer is incremented or decremented, ONLY the offset of the pointer is actually incremented or decremented; in the case of a huge pointer, both the segment and the offset can change.
Consider the following example, taken from HERE:
#include <stdio.h>

int main()
{
    char far *f = (char far *)0x0000ffff;   /* segment 0000, offset ffff */
    printf("%Fp", f + 0x1);                 /* %Fp prints a far pointer (Turbo C extension) */
    return 0;
}
Then the output is:
0000:0000
There is no change in the segment value.
And in the case of huge pointers:
#include <stdio.h>

int main()
{
    char huge *h = (char huge *)0x0000000f;  /* segment 0000, offset 000f */
    printf("%Fp", h + 0x1);
    return 0;
}
The output is:
0001:0000
This is because on increment not only the offset value but also the segment value changes. That is, the segment will not change in the case of far pointers, but in the case of a huge pointer it can move from one segment to another.
2. When relational operators are used on far pointers, only the offsets are compared. In other words, relational operators only work on far pointers if the segment values of the pointers being compared are the same. With huge pointers this is not the case: the comparison is performed on absolute addresses. Let us understand this with the help of a far pointer example:
#include <stdio.h>

int main()
{
    char far *p  = (char far *)0x12340001;   /* 1234:0001 */
    char far *p1 = (char far *)0x12300041;   /* 1230:0041 */
    if (p == p1)
        printf("same");
    else
        printf("different");
    return 0;
}
Output:
different
With huge pointers:
#include <stdio.h>

int main()
{
    char huge *p  = (char huge *)0x12340001;  /* 1234:0001 */
    char huge *p1 = (char huge *)0x12300041;  /* 1230:0041 */
    if (p == p1)
        printf("same");
    else
        printf("different");
    return 0;
}
Output:
same
Explanation: As we can see, the absolute address for both p and p1 is 12341h (1234h * 10h + 1h, or 1230h * 10h + 41h), but they are not considered equal in the first case because far pointers compare only their offsets, i.e. the check is whether 0001 == 0041, which is false. With huge pointers the comparison is performed on absolute addresses, and those are equal.
3. A far pointer is never normalized, but a huge pointer is normalized. A normalized pointer is one that has as much of the address as possible in the segment, meaning that the offset is never larger than 15. For example, if we have 0x1234:1234, then its normalized form is 0x1357:0004 (the absolute address is 13574h). A huge pointer is normalized only when some arithmetic operation is performed on it; it is not normalized during assignment. (A portable sketch of this normalization arithmetic is given at the end of this answer.)
#include <stdio.h>

int main()
{
    char huge *h  = (char huge *)0x12341234;
    char huge *h1 = (char huge *)0x12341234;
    printf("h=%Fp\nh1=%Fp", h, h1 + 0x1);
    return 0;
}
Output:
h=1234:1234
h1=1357:0005
Explanation: The huge pointer is not normalized on assignment, but if an arithmetic operation is performed on it, it is normalized. So h stays 1234:1234 and h1 becomes 1357:0005, which is normalized.
4. The offset of a huge pointer is less than 16 because of normalization; this is not so in the case of far pointers.
Let us take an example to understand this:
#include <stdio.h>

int main()
{
    char far *f = (char far *)0x0000000f;
    printf("%Fp", f + 0x1);
    return 0;
}
Output:
0000:0010
In the case of a huge pointer:
#include <stdio.h>

int main()
{
    char huge *h = (char huge *)0x0000000f;
    printf("%Fp", h + 0x1);
    return 0;
}
Output:
0001:0000
Explanation: When we increment the far pointer by 1 it becomes 0000:0010, but when we increment the huge pointer by 1 it becomes 0001:0000, because its offset cannot be greater than 15; in other words, it is normalized.
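As promised above, here is a minimal sketch of the normalization arithmetic in ordinary portable C; the normalize() helper is purely illustrative, since a real 16-bit compiler does this work inside its huge-pointer arithmetic:

#include <stdio.h>

/* Illustrative helper: split a 20-bit linear address into normalized seg:off,
   i.e. keep the offset in the range 0..15. */
static void normalize(unsigned long linear, unsigned int *seg, unsigned int *off)
{
    *seg = (unsigned int)(linear >> 4);
    *off = (unsigned int)(linear & 0xF);
}

int main(void)
{
    unsigned long linear = (0x1234UL << 4) + 0x1234;  /* 1234:1234 -> 13574h */
    unsigned int seg, off;

    normalize(linear + 1, &seg, &off);                /* simulate h1 = h + 1 */
    printf("%04X:%04X\n", seg, off);                  /* prints 1357:0005 */
    return 0;
}

This reproduces the h1 = 1357:0005 result from the example above.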

- This is missing a mention of the HMA and how it interacts with the huge pointer normalisation. (NB I do not know how compilers treat this, I just know that when normalising a pointer that you want to point into the HMA, you must allow offsets larger than 15.) – ecm Oct 25 '19 at 11:06
In the old days, according to the Turbo C manual, a near pointer was merely 16 bits when your entire code and data fit in the one segment. A far pointer was composed of a segment as well as an offset but no normalisation was performed. And a huge pointer was automatically normalised. Two far pointers could conceivably point to the same location in memory but be different whereas the normalised huge pointers pointing to the same memory location would always be equal.

- @Vishwanath: No, they aren't really usable for new code. They were only for 16-bit Intel platforms, which were obsoleted a long time ago (I believe the Intel 386 was the first chip to support the 32-bit flat memory model effectively). If you're writing code that has to care about this, you are writing legacy code. – Billy ONeal Aug 26 '10 at 13:50
- @BillyONeal What if I want to know about them (how or where they were used, back in those days of 16-bit machines)? – Shubham Jul 07 '20 at 06:53
- @Lucas Best way is to get hold of a 1980s-era PC programming book --- all old-school DOS programming used these. Try looking up Turbo C or Pacific C. – David Given Nov 06 '20 at 14:30
All of the stuff in this answer is relevant only to the old 8086 and 80286 segmented memory model.
near: a 16-bit pointer that can address any byte in a 64K segment
far: a 32-bit pointer that contains a segment and an offset. Note that because segments can overlap, two different far pointers can point to the same address.
huge: a 32-bit pointer in which the segment is "normalised" so that no two huge pointers point to the same address unless they have the same value.
tee: a drink with jam and bread.
That will bring us back to doh oh oh oh
and when these pointers are used?
In the 1980s and '90s, until 32-bit Windows became ubiquitous.
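For reference, this is roughly what the declarations looked like under a 16-bit DOS compiler such as Turbo C or Borland C; the near/far/huge keywords are vendor extensions, so this sketch will not build on a modern toolchain, and the exact sizes depend on the memory model:

/* 16-bit DOS compiler only (Turbo C / Borland C); near, far and huge
   are vendor extensions, not standard C. */
#include <stdio.h>

int main(void)
{
    char near *np;   /* offset only                            -> sizeof(np) == 2 */
    char far  *fp;   /* segment:offset                         -> sizeof(fp) == 4 */
    char huge *hp;   /* segment:offset, normalized arithmetic  -> sizeof(hp) == 4 */

    printf("%u %u %u\n",
           (unsigned)sizeof np, (unsigned)sizeof fp, (unsigned)sizeof hp);
    return 0;
}

In the small memory model a plain char * is near by default, while in the large model it is far, which is why code of that era often spelled the qualifier out explicitly.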

In some architectures, a pointer which can point to every object in the system will be larger and slower to work with than one which can point to a useful subset of things. Many people have given answers related to the 16-bit x86 architecture. Various types of pointers were common on 16-bit systems, though near/far distinctions could reappear in 64-bit systems, depending upon how they're implemented (I wouldn't be surprised if many development systems go to 64-bit pointers for everything, despite the fact that in many cases that will be very wasteful).
In many programs, it's pretty easy to subdivide memory usage into two categories: small things which together total up to a fairly small amount of stuff (64K or 4GB) but will be accessed often, and larger things which may total up to a much larger quantity, but which need not be accessed so often. When an application needs to work with part of an object in the "large things" area, it copies that part to the "small things" area, works with it, and if necessary writes it back.
Some programmers gripe at having to distinguish between "near" and "far" memory, but in many cases making such distinctions can allow compilers to produce much better code.
(Note: even on many 32-bit systems, certain areas of memory can be accessed directly without extra instructions, while other areas cannot. If, for example, on a 68000 or an ARM, one keeps a register pointing at global variable storage, it will be possible to directly load any variable within the first 32K (68000) or 2K (ARM) of that register. Fetching a variable stored elsewhere will require an extra instruction to compute the address. Placing more frequently-used variables in the preferred regions and letting the compiler know would allow for more efficient code generation.)

- What does it mean that a huge pointer was automatically normalised while a far pointer isn't? What does the word **"normalize"** mean here? – Destructor Apr 04 '16 at 16:52
- @Destructor: Each pointer on the 8086 has two 16-bit parts--a segment which is awkward to manipulate, and an offset which can be manipulated much more conveniently. Hardware addresses are taken by multiplying the segment by 16 and adding the offset. Objects up to 65536 bytes which are aligned on 16-byte boundaries may be easily manipulated by setting the segment so it identifies the start of the object and then using the offset to access locations within it, but the fact that each location can be identified 4096 different ways can sometimes be problematical. – supercat Apr 04 '16 at 17:16
- @Destructor: Normalizing a pointer means replacing it with a pointer that identifies the same physical location but has an offset in the range 0-15. For some usage patterns, relational operators and pointer arithmetic that ignore the segment altogether will comply with the C Standard and will work, but would imply that one may have two different pointers whose difference is zero, and neither of which is greater than the other, but which are nonetheless unequal and which access different things. Having relational operators treat pointers as 32-bit values (with segment as the upper word)... – supercat Apr 04 '16 at 17:23
- ...will work for many usage patterns, but may still trip up in some scenarios which try to test for pointer overlap. I wish the Standard would define intrinsics to test whether two pointers "might" overlap, or whether they definitely overlap, since the former could be determined more cheaply than the latter but for many cases would be just as useful. – supercat Apr 04 '16 at 17:25
This terminology was used in 16-bit architectures.
In 16-bit systems, data was partitioned into 64 KB segments. Each loadable module (program file, dynamically loaded library, etc.) had an associated data segment, which could store up to 64 KB of data only.
A NEAR pointer had 16-bit storage and referred to data (only) in the current module's data segment.
16-bit programs that needed more than 64 KB of data could use special allocators that returned a FAR pointer, which was a data segment id in the upper 16 bits and an offset into that data segment in the lower 16 bits.
Yet larger programs would want to deal with more than 64 KB of contiguous data. A HUGE pointer looks exactly like a far pointer (it has 32-bit storage), but the allocator has taken care to arrange a range of data segments with consecutive IDs, so that by simply incrementing the data segment selector the next 64 KB chunk of data can be reached.
The underlying C and C++ language standards never really recognized these concepts officially in their memory models; all pointers in a C or C++ program are supposed to be the same size. So the NEAR, FAR and HUGE attributes were extensions provided by the various compiler vendors.
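Because they were vendor extensions, code of that era usually hid the keywords behind macros so the same source could build on both 16-bit and flat-memory compilers. A rough sketch of the pattern follows; the macro names and the __MSDOS__ test are illustrative, though the old 16-bit Windows headers used very similar NEAR/FAR macros:

/* Illustrative portability shim for 16-bit vs. flat-memory builds. */
#if defined(__MSDOS__) && !defined(FLAT_MODEL)
#  define NEAR near
#  define FAR  far
#  define HUGE huge
#else  /* 32/64-bit flat memory model: the qualifiers simply vanish */
#  define NEAR
#  define FAR
#  define HUGE
#endif

typedef char FAR *LPSTR;   /* e.g. the classic "long pointer to string" from 16-bit Windows */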
