7

I will have literally tens of millions of instances of some class MyClass and want to minimize its memory size. The question of measuring how much space an object takes in the memory was discussed in Find out the size of a .net object I decided to follow Jon Skeet's suggestion, and this is my code:

   // Edit: This line is "dangerous and foolish" :-) 
   // (However, commenting it does not change the result)
   // [StructLayout(LayoutKind.Sequential, Pack = 1)]
   public class MyClass       
   {
      public bool isit;
      public MyClass nextRight;
      public MyClass nextDown;
   }

   class Program
   {
      static void Main(string[] args)
      {
         var a1 = new MyClass(); //to prevent JIT code mangling the result (Skeet)
         var before = GC.GetTotalMemory(true);   
         MyClass[] arr = new MyClass[10000];
         for (int i = 0; i < 10000; i++)
            arr[i] = new MyClass(); 

         var after = GC.GetTotalMemory(true);

         var per = (after - before) / 10000.0;
         Console.WriteLine("Before: {0} After: {1} Per: {2}", before, after, per);
         Console.ReadLine();
      }
   }

I run the program on 64 bit Windows, Choose "release", platform target: "any cpu", and choose "optimize code" (The options only matter if I explicitly target x86) The result is, sadly, 48 bytes per instance.

My calculation would be 8 bytes per reference, plus 1 byte for bool plus some ~8byte overhead. What is going on? Is this a conspiracy to keep RAM prices high and/or let non-Microsoft code bloat? Well, ok, I guess my real question is: what am I doing wrong, or how can I minimize the size of MyClass?

Edit: I apologize for being sloppy in my question, I edited a couple of identifier names. My concrete and immediate concern was to build a "2-dim linked-list" as a sparse boolean matrice implementation, where I can get an enumeration of set values in a given row/column easily. [Of course that means I have to also store the x,y coordinates on the class, which makes my idea even less feasible]

Community
  • 1
  • 1
Ali Ferhat
  • 2,511
  • 17
  • 24
  • 6
    Wait, you are packing on one-byte boundaries and you have a one-byte struct *before a bunch of references*? **Do you have any idea how dangerous and foolish that is?** The performance and correctness of the runtime on many architectures is predicated on pointers being aligned correctly! Volatility and atomicity semantics in particular are *completely* dependent upon alignment being correct. If the runtime is actually following your instructions and misaligning those pointers by a byte then you are going to have big problems. – Eric Lippert Jan 17 '12 at 16:03
  • I think we'd have to see your `MyClass` code to help on that question. – AllenG Jan 17 '12 at 16:03
  • Mr.Lippert, no I'don't have any idea, please correct me, I don't care about the layout, I was only trying to reduce the size. – Ali Ferhat Jan 17 '12 at 16:08
  • @AllenG : this is pretty much the implementation except there will be properties not fields. I'm experimenting with the idea of using a "2-dim linked-list" for a sparse boolean matrice implementation. – Ali Ferhat Jan 17 '12 at 16:11
  • It would only have saved space if you had them in an array and they were value types. They'd still have the issues @EricLippert pointed out, and if perhaps more. And of course, it wouldn't work having a reference to their own type as fields any more. – Jon Hanna Jan 17 '12 at 16:29

3 Answers3

27

Approach the problem from the other end. Rather than asking yourself "how can I make this data structure smaller and still have tens of millions of them allocated?" ask yourself "how can I represent this data using a completely different data structure that is far more compact?"

It looks like you are building a doubly-linked list of bools, which, as you note, uses thirty to fifty times more memory than it needs to. Is there some reason why you're not simply using a BitArray to store your list of bools?

UPDATE:

in fact I was trying to implement a sparse boolean two-dimensional matrix

Well why didn't you say so in the first place?

When I want to make a sparse Boolean two-d matrix of enormous size, I build an immutable persistent boolean quadtree with a memoized factory. If the array is sparse, or even if it is dense but self-similar in some way, you can achieve enormous compressions. Square arrays of 264 x 264 Booleans are easily representable even though obviously as a real array, that would be more memory than exists in the world.

I have been toying with the idea of doing a series of blog articles on this technique; I will likely do so in late March. (UPDATE: I did not write that article in March 2012; I wrote it in August 2020. https://ericlippert.com/2020/08/17/life-part-32/)

Briefly, the idea is to make an abstract class Quad that has two subclasses: Single, and Multi. "Single" is a doubleton -- like a singleton, but with exactly two instances, called True and False. A Multi is a Quad that has four sub-quads, called NorthEast, SouthEast, SouthWest and NorthWest.

Each Quad has an integer "level"; the level of a Single is zero, and a multi of level n is required to have all of its children be Quads of level n-1.

The Multi factory is memoized; when you ask it to make a new Multi with four children, it consults a cache to see if it has made it before. If it has, it does not construct a new one; it hands out the old one. Since Quads are immutable, you do not have to worry about someone changing the Quad on you after it is in the cache.

Consider now how many memory words (a word is 4 or 8 bytes depending on architecture) an "all false" Multi of level n consumes. A level 1 "all false" multi consumes four words for the links to its children, a word for the level count (if necessary; you are not required to keep the level in the multi, though it helps for debugging) and a couple words for the sync block and so on. Let's call it eight words. (Plus the memory for the False Single quad, which we can assume is a constant two or three words, and thereby may be ignored.)

A level 2 "all false" multi consumes the same eight words, but each of its four children is the same level 1 multi. Therefore the total consumption of the level 2 "all false" multi is let's say 16 words.

The same for the level 3, 4,... and so on. The total memory consumption for a level 64 multi that is logically a 264 x 264 square array of Booleans is only 64 x 16 memory words!

Make sense? Hopefully that is enough of a sketch to get you going. If not, see my blog link above.

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
  • I edited my question, in fact I was trying to implement a sparse boolean matrice. I can use a list of BitArray's I guess, but I my implementation would make it easier to get all set values in a given column for instance. [And of course then I have to keep int x and y coordinates on the class as well, now I really feel bad] – Ali Ferhat Jan 17 '12 at 16:27
  • And "how can I get the effect of tens of millions, without tens of millions", as per Charles' answer. – Jon Hanna Jan 17 '12 at 16:31
  • @Ali: Consider using an immutable persistent Boolean quadtree for a large sparse Boolean matrix. I've updated my answer to give a sketch. – Eric Lippert Jan 17 '12 at 16:46
  • Will your blog be about hashlife? – kvb Jan 17 '12 at 23:16
  • Excellent! What an incredible algorithm. – kvb Jan 17 '12 at 23:58
  • @kvb: Gosper's Algorithm for Life is one of my favourites, indeed. One of those algorithms that made me re-evaluate how I approach problems, actually. – Eric Lippert Jan 18 '12 at 00:00
  • Very clever, thank you. The only tricky part is memoization: how to check whether given [NE, NW, SE, SW], there's already such a Multi. Best guess: 1. Hash of a Multi depends only on its immediate children, store the value within when created, 2. Use a seperate Dictionary List> for each level, compute the uncreated Multi's hash beforehand, check if the corresponding list contains the candidate, act accordingly, 3. Choose hash func. carefully, 4. If it is the case that HashCode==0 iff every child is all false, it would be great too. This should keep me happy until March :) – Ali Ferhat Jan 18 '12 at 11:37
  • 2
    Only 8 years later and finally the blog posts are here! :-) – kvb Aug 27 '20 at 15:20
5

8 (object reference) + 8 (object reference) + 1 (bool) + 16 (header) + 8 (reference in array itself) = 41

Even if it's misaligned internally, each will be aligned on the heap. So we're looking at least 48bytes.

I can't for the life of me see why you'd want a linked list of bools though. A list of them would take 48times less space, and that's before you get to optimisations of storing a bool per bit which would make it 384 times smaller. And easier to manipulate.

Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
1

If these hundreds of millions of instances of the class are mostly copies of the class with minor variations in class property values, then your system is a prime candidate to use what is called the Flyweight pattern. This pattern minimizes memory use by using the same instanes over and over, and just changing the properties as needed...

Charles Bretana
  • 143,358
  • 22
  • 150
  • 216
  • Flyweight won't work if they've them all in the same doubly-linked list, which they seem to. Flyweight is good advice in general though, so +1. – Jon Hanna Jan 17 '12 at 16:19