6

Is it possible to use a structure containing a pointer to another structure inside a memory-mapped file, instead of storing the offset in some integral type and calculating the pointer from it?

E.g., given the following struct:

typedef struct _myStruct_t {
  int number;
  struct _myStruct_t *next;
} myStruct_t;
myStruct_t* first = (myStruct_t*)mapViewHandle;
myStruct_t* next = first->next;

instead of this:

typedef struct _myStruct_t {
  int number;
  int next;
} myStruct_t;
myStruct_t* first = (myStruct_t*)mappedFileHandle;
myStruct_t* next = (myStruct_t*)(mappedFileHandle+first->next);

I read about the '__based' keyword, but it is Microsoft-specific and therefore Windows-bound.

I'm looking for something that works with the GCC compiler.

RaphaelH
  • What is the problem? What have you tried? What errors do you get? The linked list node structure in the first snippet is perfectly normal. A structure in a file sounds a little iffy, so I think you should be a little bit more explicit in what it is that you're trying to achieve. – Henrik Jun 12 '14 at 10:18
  • @Henrik: There is nothing to try cause I see no way of doing this. Of course I can assign next to the memory mapped base with some offset and use it but as soon as I restart the application the pointer value stored in next is not valid anymore cause the base address of the mapped memory region changes. That's why I use the second snippet. So the question basically is if gcc does support something which could generate this in any way dynamically. It should store the offset instead of the current pointer value. – RaphaelH Jun 12 '14 at 10:32
  • I don't know of any way for GCC to help you with this, but assuming you're using `mmap` to create the mapping, maybe using the `MAP_FIXED` flag and specifying a fixed base address might help you. – Henrik Jun 12 '14 at 10:45
  • @Henrik: Nice idea but "it will cause mmap to unmap anything that may already be mapped at that address which is generally a very bad thing" see http://stackoverflow.com/questions/6446101/how-do-i-choose-a-fixed-address-for-mmap – RaphaelH Jun 12 '14 at 10:55

3 Answers

2

I'm pretty sure there's nothing akin to the __based pointer from Visual Studio in GCC. The only time I'd seen anything like that built-in was on some pretty odd hardware. The Visual Studio extension provides an address translation layer around all operations involving the pointer.

So it sounds like you're into roll-your-own territory; although I'm willing to be told otherwise.

The last time I dealt with something like this was on the Palm platform, where memory could be moved around unless you locked it down. You got memory handles from allocations, and you had to MemHandleLock a handle before using it and MemPtrUnlock it once you were finished, so that the OS could move the block around (which seemed to happen on ARM-based Palm devices).

If you're insistent on storing pointer-esque values in a memory-mapped structure, the first recommendation would be to store the value in an intptr_t, which is a signed integer type wide enough to hold a pointer value. While your offsets are unlikely to exceed 4GB, it pays to stay safe.
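A minimal sketch of the roll-your-own approach, assuming the offset is measured in bytes from the start of the mapping and that 0 is reserved to mean "no next node". The names node_t, node_from_offset and node_to_offset are made up for this illustration; they are not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout: `next` is a byte offset from the start of the
   mapping, not an address. Offset 0 is reserved as the "null" value. */
typedef struct node {
    int      number;
    intptr_t next;
} node_t;

/* Turn a stored offset into a pointer that is valid for the current mapping. */
static node_t *node_from_offset(void *base, intptr_t off)
{
    return off ? (node_t *)((char *)base + off) : NULL;
}

/* Turn a pointer into the mapping back into an offset that can be stored. */
static intptr_t node_to_offset(const void *base, const node_t *p)
{
    if (p == NULL)
        return 0;
    /* Offset 0 means "null", so the node at the very start of the mapping
       can be the head of a list but never the target of a `next` field. */
    assert((const char *)p > (const char *)base);
    return (const char *)p - (const char *)base;
}
```

With these helpers, `first->next = node_to_offset(base, other)` stores a value that stays valid across remappings, and `node_from_offset(base, first->next)` recovers a usable pointer for whatever base address the mapping currently has.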

That said, this is probably easy to implement in C++ using a template class; it's just that marking the question as C makes things a lot messier.

Anya Shenanigans
  • I don't think that `intptr_t` is a good idea for a file format: it is platform dependent. Worse, it depends on whether the program is compiled as 32 or 64 bit. If you use `intptr_t`, a file written by a 32 bit app cannot be read by a 64 bit app and vice versa. I would opt for `uint64_t`. – cmaster - reinstate monica Jun 12 '14 at 13:28
  • @cmaster all things being equal, that's the least of the problems that would exist with the file. Everything in the structure would have to be coded for size, padding, alignment and capacity if it was to be portable across 32/64bit and endianness if it was to be portable across hardware platforms – Anya Shenanigans Jun 12 '14 at 13:36
1

C++: It is very doable and portable (the code, but maybe not the data). It was a while ago, but I created a template class for self-relative pointers. I had tree structures inside blocks of memory that might move. Internally, the class held a single intptr_t, but the =, * and -> operators were overloaded so that it behaved like a regular pointer. Handling null took some attention. I also did versions using int, short and (not very useful) char as space-saving pointers that could not point far away (outside the memory block).

In C you could use macros to wrap get and set:

// typedef struct OBJ { short p; } OBJ;
#define OBJPTR(P) ((OBJ*)((P) ? (char*)&(P) + (P) : 0))
#define SETOBJPTR(P,V) ((P) = (short)((V) ? (char*)(V) - (char*)&(P) : 0))

The above C macros are for self-relative pointers, which can be slightly more efficient than based pointers. Here is a working example of a tree in a small block of relocatable memory using 2-byte (short) pointers to save space. The offsets are computed with char* arithmetic rather than by casting pointers to int, so the code does not depend on pointers being 32 bits wide:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct OBJ
{
  int val;
  short left;  // self-relative offset of left child, 0 means null
  short right; // self-relative offset of right child, 0 means null
} OBJ;

// Decode: address of the field itself plus the offset stored in it.
#define OBJPTR(P) ((OBJ*)((P) ? (char*)&(P) + (P) : 0))
// Encode: distance in bytes from the field to the target.
#define SETOBJPTR(P,V) ((P) = (short)((V) ? (char*)(V) - (char*)&(P) : 0))

typedef struct HEAD
{
  short top; // top of tree
  short available; // index of next available place in data block
  char data[0x7FFF]; // put whole tree here
} HEAD;

HEAD * blk;

OBJ * Add(int val)
{
  short * where = &blk->top; // find pointer to "pointer" to place new node
  OBJ * nd;
  while ( ( nd = OBJPTR(*where) ) != 0 )
    where = val < nd->val ? &nd->left : &nd->right;
  if ( blk->available + sizeof(OBJ) > sizeof(blk->data) )
    return 0; // data block is full
  nd = (OBJ*) ( blk->data + blk->available ); // allocate node
  blk->available += sizeof(OBJ); // finish allocation
  nd->val = val;
  nd->left = nd->right = 0;
  SETOBJPTR( *where, nd );
  return nd;
}

void Dump(OBJ*top,int indent)
{
  if ( ! top ) return;
  Dump( OBJPTR(top->left), indent + 3 );
  printf( "%*s %d\n", indent, "", top->val );
  Dump( OBJPTR(top->right), indent + 3 );
}

int main(void)
{
  blk = (HEAD*) malloc(sizeof(HEAD));
  if ( ! blk ) return 1;
  blk->available = 0; // data block starts out empty
  blk->top = 0;
  Add(23); Add(2); Add(45); Add(99); Add(0); Add(12);
  Dump( OBJPTR(blk->top), 3 );
  { // PROOF a copy at a different address still has the tree:
    HEAD * blk2 = (HEAD*) malloc(sizeof(HEAD));
    if ( ! blk2 ) return 1;
    memcpy( blk2, blk, sizeof(HEAD) );
    Dump( OBJPTR(blk2->top), 3 );
    free( blk2 );
  }
  free( blk );
  return 0;
}

A note about the based versus the self-relative "*" operator. Based can involve 2 addresses and 2 memory fetches. Self-relative involves 1 address and 1 memory fetch. Pseudo assembly:

Self-relative:

load reg1,address of pointer
load reg2,fetch reg1
add reg3,reg2+reg1

Based:

load reg1,address of pointer
load reg2,fetch reg1
load reg3,address of base
load reg4,fetch base
add reg5,reg2+reg4

Codemeister
0

The first is extremely unlikely to work.

Remember that a pointer, such as struct _myStruct_t * is a pointer to a location in memory. Suppose that this structure was located at address 1000 in memory: that would mean that the next structure, located just after it, might be located at address 1008, and that's what's stored in ->next (the numbers don't matter; what matters is that they are memory addresses). Now you save that structure to a file (or un-map it). Then you map it again, but this time, it ends up starting at address 2000, but the ->next pointer is still 1008.

You have (generally) no control over where files are mapped in memory, so no control over the actual memory locations of the elements within the mapped structure. Therefore you can only depend on relative offsets.
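As a rough illustration of this point, the following POSIX sketch stores byte offsets in a file-backed mapping, unmaps it, maps it again and walks the list by adding each offset to whatever base address the second mmap returned. The rec_t type, the helper names, the node positions (0, 512, 64), the temporary file path and the use of offset 0 as the end marker are all assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

enum { FILE_SIZE = 4096 };

typedef struct rec {
    int32_t  number;
    uint32_t next;   /* byte offset from start of file; 0 ends the list */
} rec_t;

/* Locate a record inside the current mapping, with a basic bounds check. */
static rec_t *rec_at(char *base, uint32_t off)
{
    assert(off % sizeof(int32_t) == 0);
    assert(off + sizeof(rec_t) <= (size_t)FILE_SIZE);
    return (rec_t *)(base + off);
}

/* Map the whole file read/write and shared; returns NULL on failure. */
static char *map_file(int fd)
{
    void *p = mmap(NULL, FILE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : (char *)p;
}

/* Builds a three-node list in a temporary file, remaps the file, and
   returns the sum of the numbers reached via the offsets (-1 on error). */
static long offsets_survive_remap(void)
{
    char path[] = "/tmp/offset-list-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                      /* file disappears once fd is closed */
    if (ftruncate(fd, FILE_SIZE) != 0) { close(fd); return -1; }

    char *base = map_file(fd);
    if (!base) { close(fd); return -1; }

    /* The head lives at offset 0; the other nodes sit at arbitrary offsets. */
    rec_at(base, 0)->number   = 10;  rec_at(base, 0)->next   = 512;
    rec_at(base, 512)->number = 20;  rec_at(base, 512)->next = 64;
    rec_at(base, 64)->number  = 30;  rec_at(base, 64)->next  = 0;
    munmap(base, FILE_SIZE);

    /* Second mapping: the kernel is free to choose a different address. */
    base = map_file(fd);
    close(fd);
    if (!base)
        return -1;

    long sum = 0;
    for (const rec_t *r = rec_at(base, 0); ; r = rec_at(base, r->next)) {
        sum += r->number;
        if (r->next == 0)
            break;
    }
    munmap(base, FILE_SIZE);
    return sum;
}
```

Nothing in the file depends on the address at which it was mapped, which is why the walk works after the remap. A stored struct rec * would only be valid if the second mapping happened to land at the same address as the first.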

Note that your second version may or may not work as you expect, depending on the declared type of mappedFileHandle. If it's a pointer to myStruct_t, then adding an integer n to it will produce a pointer to an address which is n*sizeof(myStruct_t) bytes higher in memory (as opposed to being n bytes higher).

If you declared mappedFileHandle as

myStruct_t* mappedFileHandle;

then you can subscript it like an array. If the mapped file is laid out as a sequence of myStruct_t blocks, and the next field refers to other blocks by index within that sequence, then (supposing myStruct_t* b is a block of interest)

mappedFileHandle[b->next].number

is the number field of the (b->next)-th block in the sequence.

(This is just a consequence of the way that arrays are defined in C: mappedFileHandle[b->next] is defined to be equivalent to *(mappedFileHandle + b->next), which is an object of type myStruct_t, whose number field you can therefore access.)
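A short sketch of the index-based layout described above, assuming that the file is a plain array of blocks and that -1 marks the end of a chain. The sentinel value, the count parameter and the sum_chain name are illustrative choices; none of them comes from the question.

```c
#include <assert.h>
#include <stddef.h>

typedef struct myStruct {
    int number;
    int next;    /* index of the next block in the array, -1 for "none" */
} myStruct_t;

/* Follow the chain that starts at block `start` and add up the numbers.
   `count` is the number of blocks in the mapped array. */
static long sum_chain(const myStruct_t *blocks, size_t count, int start)
{
    long sum = 0;
    for (int i = start; i != -1; i = blocks[i].next) {
        assert(i >= 0 && (size_t)i < count);  /* reject corrupt indices */
        sum += blocks[i].number;
    }
    return sum;
}
```

Here blocks would be the mapped base address cast to const myStruct_t *. The chain may visit the blocks in any order, but every block has to sit at a multiple of sizeof(myStruct_t) from the start of the file.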

Norman Gray
  • Your note has nothing to do with the question and I am aware that the + operation on a pointer type will add n times the size of the type. With an array I'm bound to having my structures in sequence which is not what I want to have. This is not a real answer, rather an explanation of the problem, which I thought is clear.. – RaphaelH Jun 12 '14 at 10:53
  • Then it's not completely clear what your question is. You know that `mmap` maps at an unpredictable address, so any absolute pointers stored in the mapped file will necessarily be wrong. `MAP_FIXED` may allow you to play games with the mapping address, but that's probably precarious. If you want to do this sort of thing, then managing a pool yourself and storing the offsets from the start is probably the only way. Is that what you're asking? I don't think the compiler can realistically help out here. – Norman Gray Jun 12 '14 at 19:27
  • The question is, if there's any way I can use pointers (like in first example) with a mapped file, so if the pointers are written to file, it writes the offsets instead of runtime addresses. – RaphaelH Jun 14 '14 at 11:21
  • Ah, right: No, I don't think that's possible in any sort of portable way. Looking at [the Windows docs for __based](http://msdn.microsoft.com/en-us/library/sbw639kb.aspx) I see what the goal is, but I'm not _aware_ of anything analogous in GCC, and certainly nothing portable to other compilers. I'm afraid it's DIY, and some self-managed pool of structs+offsets in an array.... – Norman Gray Jun 14 '14 at 13:06