4

In 64-bit Java, each object instance tends to include a 192-bit header, containing

  • the class pointer,
  • flags and
  • locks (64 bits each).

This can cause a large memory overhead for small objects.

Is the situation similar for Nim? Would a large application (where the size of the runtime is negligible), written similarly in the two languages, use about the same amount of memory?


Update:

I ran some experiments in which I build a naive singly-linked list with 100M float64 elements and iterate through it many times.

Java is actually using 25% less memory than Nim, according to htop.

Complete Nim code:

type Node = ref object
    data : float64
    next : Node

echo "Running"

proc create(n : int): Node =
    var 
        tmp = Node(data: 0, next: nil)

    for i in 1..<n:
        tmp = Node(data: i.float64, next: tmp)

    return tmp


proc sum(x: Node): float64 = 
    var
        tmp: float64 = 0
        y = x

    while true:
        tmp += y.data
        y = y.next

        if y.isNil:
            return tmp


proc sums(x: Node, n: int): float64 = 
    var tmp: float64 = 0

    for i in 0..<n:
        tmp += sum(x) / n.float64

    return tmp

let x = create(1000 * 1000 * 100)

echo "Created"

echo sums(x, 100)

echo "Finished"

This uses 3.1GB, which comes out to 269 bits per Node, whereas Java uses 203 bits per Node in very similar code. This is less than 192bit header + 128bit structure. I'm guessing that some kind of JIT optimization makes Java run using less memory.

Complete Java code:

Node.java

public class Node {
    double data = 0;
    Node next = null;
}

SListTest.java

public class SListTest {

    static Node create(int n) {
        Node tmp = new Node();

        for(int i = 1; i < n; ++i) {
            Node p = new Node();
            p.data = i;
            p.next = tmp;
            tmp = p;
        }

        return tmp;
    }

    static double sum(Node x) {
        double tmp = 0;

        while(x != null) {
            tmp += x.data;
            x = x.next;
        }

        return tmp;
    }

    static double sums(Node x, int n) {
        double tmp = 0;

        for(int i = 0; i < n; ++i)
            tmp += sum(x);

        return tmp / n;
    }

    public static void echo(String s) {
        System.out.println(s);
        System.out.flush();
    }

    public static void main(String[] args) {
        echo("Started");
        Node p = create(1000 * 1000 * 100);
        echo("Created");
        double tmp = sums(p, 100);
        System.out.printf("%f\n", tmp);
        echo("Finished");
    }

}
MWB
  • 11,740
  • 6
  • 46
  • 91

2 Answers2

4

For both Nim and HotSpot (note that not all Java implementations need to use the same approach), basic allocations require one word for GC information and one word for type information (Nim, HotSpot). In HotSpot, you can reduce the type information to a half word on 64-bit machines if you don't need more than 32GB of heap space using -XX:+UseCompressedOops.

Modern locking implementations in Java do not incur additional overhead; the GC word is also used for a thin locking scheme and is expanded into a pointer to a full monitor if needed. Thus, by default, we have two words of overhead per object.

In your example, the per object memory consumption is four words in Nim on a 64-bit machine:

  • Two words of header data.
  • One word for the float value.
  • One word for the next pointer.

1e8 allocations of this size require a raw amount 32*1e8 = 3.2e9 bytes, i.e. about 3 GB.

I'll add that having lots of small allocations tends to be bad for memory locality not even counting the costs of that many allocations and should generally be avoided if possible. A (dynamic) array is almost always preferable over a linked list.

Reimer Behrends
  • 8,600
  • 15
  • 19
  • As I mentioned, in my test, Java actually uses only 3 words per `Node` (`java --version` says "openjdk 10.0.2") – MWB Jan 06 '19 at 03:19
  • I run the Java version with `taskset -c 0 java SListTest` – MWB Jan 06 '19 at 03:20
  • It takes >3GB for me with Java on both macOS and Linux, though I'm not using OpenJDK 10 yet. Either they changed something more recently (note that I linked to the OpenJDK 9 source code) or you are running the 32-bit version of the JVM (which would take 5 32-bit words, but might also round up to 6 32-bit words to keep data 64-bit aligned). – Reimer Behrends Jan 06 '19 at 06:34
  • "OpenJDK 64-Bit Server VM (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.4, mixed mode)" – MWB Jan 06 '19 at 08:06
  • Come to think of it, all `Node`s are the same. They can share the header and just need a pointer to it. I would guess that this is what's happening here. – MWB Jan 06 '19 at 08:07
3

In Nim you can also put objects onto the stack, thus not requiring garbage collection and only taking as much space as the members in the object. When putting an automatically allocated object on the heap some garbage collection memory overhead exists, but the object itself still stays at the size of its members (plus padding of course).

def-
  • 5,275
  • 20
  • 18