0

To improve computation speed, I need to store a large amount of data in a tree. Some nodes should contain the mean properties of the points inside them and others should not: a typical light node contains only the addresses of its child nodes, while a typical regular node contains the two addresses plus some extra information. I am looking for a smart way to store these two kinds of nodes in the same tree, using the least memory.

I am using C. I first built my tree using structures (node.properties, node.Left, node.Right, etc.), but that requires too much memory (if you're familiar with structures, you'll probably see why), so I have to go back to some kind of "in-line" organization of my tree, storing the addresses of the child nodes in the first two slots and (possibly) the rest of the info in the following slots.

So I thought about two ways of doing it. One is to predict the total number of nodes (easy) and allocate enough memory to store all nodes as regular nodes (two slots for the child-node addresses and one slot for the information). The other is to reallocate the tree array as we go down the tree.
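To make the first approach concrete, here is a minimal sketch of the flat layout I have in mind (the names and the three-slot layout are illustrative, not final code): every node takes three `double` slots, with the child indices stored in the first two and cast back to an integer type when traversing.

```c
#include <stdlib.h>

/* Sketch of the "in-line" tree: every node is 3 slots of double.
   Slot 0: index of left child, slot 1: index of right child,
   slot 2: node payload. Indices live in doubles and are cast
   back to an integer type when walking the tree. */
#define SLOTS_PER_NODE 3

double *tree_alloc(size_t n_nodes)
{
    return malloc(n_nodes * SLOTS_PER_NODE * sizeof(double));
}

/* Return the slot offset of the left child of the node at slot i. */
size_t left_child(const double *tree, size_t i)
{
    return (size_t)tree[i] * SLOTS_PER_NODE;
}
```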

First question: is it really a good idea to use realloc() extensively, in terms of efficiency?

Second question: is there a smarter way (in terms of memory) of doing it?

Waribiki
  • 113
  • 5
  • 1
    The trick is selecting your reallocation algorithm to avoid it as much as possible. The conventional wisdom is to start at an arbitrarily small size and, each time you need to realloc(), double the amount of memory allocated. Say you start with 1K: when it fills, resize to 2K, then 4K, then 8K, etc. – Not_a_Golfer Mar 10 '12 at 23:27
  • I heard once that modern implementations of `malloc` organise memory in arenas for different sizes, and `realloc` really interferes with that by forcing areas of the wrong size into the wrong arena... – Kerrek SB Mar 10 '12 at 23:27
  • 1
    @KerrekSB I would hope that modern libraries account for this, considering that realloc is permitted to alter the location... –  Mar 10 '12 at 23:47
  • You say 'uses too much memory'; how do you determine that it is too much memory? What are the constraints on your use of memory? With machines having multiple gigabytes of main memory, you have to have vast trees to run out of virtual memory. If the problem is that pointers are too big (8 bytes each on a 64-bit machine), consider using indexes or offsets instead, maybe even 16-bit, 2-byte indexes. If you know how many nodes you're likely to need, this is pretty easy. – Jonathan Leffler Mar 11 '12 at 01:14
  • Let's say 10,000,000 objects, each with about 11 (double) quantities. That is 10,000,000 × 11 × 8 bytes, so roughly 1 GB for the data alone. The number of nodes is about twice that, and you have to add the two pointers to the child nodes plus some more information (number of points, size). This is a big tree and it's crucial to keep the use of space as low as possible... I was thinking of using indices, like in the first two slots of memory. But they have to be doubles, so I fear that converting the indices to integers might slow down the algorithm. – Waribiki Mar 11 '12 at 02:07
  • "so I fear that converting the indices to integer might slow down the algorithm" So when do integer arithmetic operations slow down the algorithm? Are you referring to promotion (promoting an int to double)? I guess I am missing something very important, a priori -- still, I would not agree. – jim mcnamara Mar 11 '12 at 02:48
  • My idea was to store everything in doubles and cast the indices: tree[(long)tree[i]], where tree[i] would contain the index of the left (or right) child node. So I was wondering whether the operation (long)variable_double_type takes time or not? – Waribiki Mar 11 '12 at 08:35
  • I understand that the tree is heterogeneous and/or the nodes can have different sizes? In any case the solution is to use an array to store the nodes, using indexes instead of pointers, and to store the "payload" in separate array(s), like Jonathan Leffler said. – wildplasser Mar 11 '12 at 12:30
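The geometric-growth strategy described in the first comment can be sketched like this (a generic growable array, not the asker's actual tree code; names are illustrative):

```c
#include <stdlib.h>

/* Growable array of doubles: capacity doubles on each realloc,
   so n appends cost O(n) amortized moves instead of O(n^2). */
typedef struct {
    double *data;
    size_t  len;
    size_t  cap;
} dvec;

int dvec_push(dvec *v, double x)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 128; /* start small, double */
        double *p = realloc(v->data, new_cap * sizeof *p);
        if (!p)
            return -1;      /* old buffer is still valid on failure */
        v->data = p;
        v->cap  = new_cap;
    }
    v->data[v->len++] = x;
    return 0;
}
```

Note that the result of realloc() is assigned to a temporary first, so a failed reallocation does not leak or clobber the original buffer.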

2 Answers

0

OK, I found a compromise, I think. The size of the tree is determined first and an array of pointers of size 2 × the number of nodes is allocated. Each node has (1) [a pointer to] an array of integers with the index of the left child, the index of the right child and the number of points, and (2) [a pointer to] an array of doubles containing all the useful information. As the routine builds up the tree, the space allocated for the "double" array is computed according to the type of the node.

So far it's the "cheapest" solution I could come up with. But if you can convince me that the conversion (long)index_in_double is fast, one could use only doubles in the array.
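One possible reading of this layout in C (field names and error handling are my guesses, not the actual code): slots 2k and 2k+1 of the pointer array hold node k's integer header and its variable-length double payload, so a light node can simply allocate a zero-length payload.

```c
#include <stdlib.h>

/* Per-node integer header: child indices and point count. */
typedef struct {
    int left;     /* index of left child, -1 if none  */
    int right;    /* index of right child, -1 if none */
    int npoints;  /* number of points under this node */
} node_hdr;

/* slots[2*k]   -> node k's node_hdr
   slots[2*k+1] -> node k's double payload; size depends on the
                   node type, so "light" nodes get no payload. */
void **tree_alloc(size_t n_nodes)
{
    return calloc(2 * n_nodes, sizeof(void *));
}

int node_init(void **slots, size_t k, int left, int right,
              int npoints, size_t n_doubles)
{
    node_hdr *h = malloc(sizeof *h);
    double *payload = n_doubles ? malloc(n_doubles * sizeof *payload) : NULL;
    if (!h || (n_doubles && !payload)) {
        free(h);
        free(payload);
        return -1;
    }
    h->left = left;
    h->right = right;
    h->npoints = npoints;
    slots[2 * k] = h;
    slots[2 * k + 1] = payload;
    return 0;
}
```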

Waribiki
  • 113
  • 5
0

To illustrate the point of using indexes instead of pointers, see this hash table implementation, which uses (short) int indexes, saving a few precious bits. The tiny_find() function compiles to this assembly (GCC -O2):

        .globl  tiny_find
        .type   tiny_find, @function
tiny_find:
.LFB57:
        .cfi_startproc
        imull   $98765, %edi, %ecx
        movl    $274877907, %edx
        movl    %ecx, %eax
        mull    %edx
        movl    $-1, %eax
        shrl    $6, %edx
        imull   $1000, %edx, %edx
        subl    %edx, %ecx
        movzwl  %cx, %ecx
        movzwl  table+4(,%rcx,8), %edx
        cmpw    $-1, %dx
        je      .L12
        movzwl  %dx, %eax
        testb   $1, table+7(,%rax,8)
        jne     .L19
        jmp     .L13
        .p2align 4,,10
        .p2align 3
.L21:
        movzwl  table+4(,%rax,8), %eax
        cmpw    $-1, %ax
        je      .L20
        movzwl  %ax, %eax
        testb   $1, table+7(,%rax,8)
        je      .L13
.L19:
        cmpl    table(,%rax,8), %edi
        jne     .L21
.L13:
        movzbl  table+6(,%rax,8), %eax
.L12:
        rep
        ret
        .p2align 4,,10
        .p2align 3
.L20:
        movl    $-1, %eax
        ret
        .cfi_endproc
.LFE57:
        .size   tiny_find, .-tiny_find

The inner loop (walking the linked list) is the part between .L21 and .L13. I wouldn't say the code is inefficient.

Community
  • 1
  • 1
wildplasser
  • 43,142
  • 8
  • 66
  • 109