How does linking a node to one another works in assembly language when it comes to binary tree?

Question

I've already done BST in C language so I do know the implementation. But when it comes to asembly language (more specifically MIPS) how are the nodes mapped with respect to the memory location ? Because unlike C you do need to specify the pointer location and the NULL pointer right ?

If we were given a list of 4 numbers {1,3,2,4} and the starting memory location is 2000h will the tree mapping be

    1 (2000h)
      /    \
Null(mem?)  3 (2001h)
            /     \
       2 (2002h)   4 (2003h)

Will deletion and insertion be the same procedure as C ? Also extending to AVL tree is tree rotation the same procedure as given in C language too?

Same way it does in C: storing a pointer into a memory next to other data and pointers. [How do objects work in x86 at the assembly level?](https://stackoverflow.com/q/33556511). If you're curious, try compiling some C code and looking at the asm output: [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) — Peter Cordes, Jan 23 '21 at 19:51

score 1 · Accepted Answer · answered Jan 23 '21 at 14:24

First, let's differentiate between what the processor does aka machine code, and assembly language, since assembly languages can vary dramatically in what they can do, whereas the machine code is pretty primitive.

A node would be a struct in C. In machine code, the processor see structs, it sees constants like offsets and sizes. Let's take the following struct:

typedef struct Node *NodePtr;
typedef struct Node {
    int value;
    NodePtr leftChild, rightChild;
} Node

In the C language, a type determines a variables size. Assuming a 32-bit machine, an int will take 4 bytes, and pointers will also take 4 bytes. Thus, the size of one instance of Node is 12 bytes, and, the value field is at offset 0 from the beginning of such an instance, while leftChild is a offset 4, and rightChild at offset 8.

(Some assembly languages allow computing offsets of structs, giving access to named constants for the 0, 4, and 8 here, but the processor won't see names, only numbers.)

how are the nodes mapped with respect to the memory location ?

Memory locations are choses as either global storage, local storage, or heap storage, in both C and assembly language. The mechanism to take the address of global storage or local storage is processor specific, but heap allocation is done using malloc or other just like in C. In machine code, taking the address of a local variable involves computing that variables offset from the stack reference, where in C we just put & in front of a variables name.

Because unlike C you do need to specify the pointer location and the NULL pointer right ?

Just like in C, to initialize a Node we need to initialize each field. The value field with an element of {1,2,3,4}, and, the leftChild with perhaps NULL, while the rightChild with the address of some other Node. The machine code to do this will reference offset 0, offset 4, and offset 8 of the chosen storage.

We need to know the address value of NULL, but it is almost always the value 0 of the appropriate pointer size.

Will deletion and insertion be the same procedure as C ? Also extending to AVL tree is tree rotation the same procedure as given in C language too?

Yes, these operations are conceptually the same. It is the field accesses / pointer dereferencing, the if-then and whiles that requires translation to their assembly equivalents.

Ah thanks, so your saying the pointers have to jump to 12 bytes everytime if we wanna access the nodes ? — shane dawson, Jan 23 '21 at 19:13
No, you should do the same operations as in C: `malloc`, write to pointer variable/field, read pointer variable/field, dereference variable/field. If you are skipping 12 bytes at a time in C then do that in assembly, but if you're using pointer dereferencing instead of pointer arithmetic then do that in assembly, too. — Erik Eidt, Jan 23 '21 at 19:17
If you `malloc` 12 bytes, there will potentially be some overhead, so 2 `malloc(12)`'s in a row won't necessarily produce objects whose pointers are 12 bytes apart. We cannot count on any relationship between the return pointer of two different malloc's (except inequality). — Erik Eidt, Jan 23 '21 at 19:19
However, if you `malloc ( 100*12 )` then you can fit 100 nodes into that space, using pointer arithmetic to get to successive object. This in C and assembly as well. — Erik Eidt, Jan 23 '21 at 19:20
C & assembly are just not that different, there is a fairly direct mapping of C constructs to assembly constructs. One of the main differences in assembly is that (because there is no real type information) pointer arithmetic is not automatically scaled, so scaling has to explicit. Otherwise, most operations, assignment to variable or field, dereference of variable or field map directly into equivalent assembly sequences. — Erik Eidt, Jan 23 '21 at 19:22

How does linking a node to one another works in assembly language when it comes to binary tree?

1 Answers1