Simple algorithm to determine when to free some memory .Net

Question

Our system keeps hold of lots of large objects for performance. However, when running low on memory, we want to drop some of the objects. The objects are prioritized, so I know which ones to drop. Is there a simple way of determining when to free memory? Also, dropping one object may not be enough, so I guess I need a loop: drop, check, drop again if necessary, and so on. But in C#, I won't necessarily see the effect of dropping an object immediately, so how do I avoid kicking too much stuff out?

I guess it's just a simple function of used vs total physical & virtual memory. But what function?

Edit: Some clarifications

"Large objects" was misleading. I meant a logical "package" of objects (each object should be small enough to stay out of the LOH, that's the intention certainly) that together are large (~100 MB?).
A request can come in which requires the use of one such package. If it is in memory, the response is rapid. If not, it needs to be reconstructed, which is very slow. So I want to keep stuff in memory as long as possible, but can ditch the least requested ones when necessary.
We have no sensible way to serialize these packages. We should probably do that, but it's a lot of work and there's a lot of resistance to doing so.

Our original simple approach is to periodically compare the following to a configurable threshold.

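// ComputerInfo lives in Microsoft.VisualBasic.Devices (reference Microsoft.VisualBasic.dll).
var c = new ComputerInfo();
// Cast first: both properties are ulong, so plain division would truncate to 0.
return (double)c.AvailablePhysicalMemory / c.TotalPhysicalMemory;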

I think in .NET, garbage is collected automatically by the garbage collector, right? — Monika, Nov 15 '13 at 13:02
@Monika not if you want to be able to reach them while memory is plentiful — Marc Gravell, Nov 15 '13 at 13:06

Jorge Córdoba · Accepted Answer · 2013-11-15T14:25:20.477

There are a lot of different topics in this question, and I think it's best to clarify them before actually answering.

First off, you say your app holds on to a lot of "large objects". Define large object. Anything of about 85,000 bytes or more goes onto the Large Object Heap (LOH), which only gets collected as part of a generation 2 collection (the most expensive kind). Anything smaller than that, even if you think of it as a "big" object, is not large as far as the runtime is concerned and is treated like any other object.
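A quick way to see which side of that threshold an allocation lands on is to ask the GC which generation it reports for a fresh array. This is only a minimal sketch: `LohDemo` and `GenerationOfNewArray` are hypothetical names, and the exact threshold is a runtime implementation detail rather than something to rely on.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class LohDemo
{
    // Allocates a byte array of the given length and returns the
    // GC generation the runtime reports for it right after allocation.
    public static int GenerationOfNewArray(int length)
    {
        var buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }
}
```

A small array such as `LohDemo.GenerationOfNewArray(1000)` starts out in generation 0, while `LohDemo.GenerationOfNewArray(100000)` is reported as generation 2 because objects on the LOH are logically part of that generation.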

Secondly, there are two problems in terms of "managing memory":

One is managing the amount of space you use inside your virtual address space. That is, on 32-bit systems, making sure you can address all the memory you're asking for, which on 32-bit Windows tends to be around 1.5 GB in practice.
The other is releasing that memory when it's needed, which is part of the garbage collector's job: it triggers when memory runs short (although that doesn't mean you can't get an OutOfMemoryException if you don't give the GC enough time to do its job).

With that said, I think you should forget about taking the place of the GC. Just let it do its job and, if you're worried, find the critical paths that may fail (on a memory request) and protect yourself against OutOfMemoryException there.

There are a lot of different patterns for handling the case you describe, and most of them really depend on your business scenario. One example is having a state machine that can actually go to an "OutOfMemory" state, in which case the system switches to freeing memory before doing anything else (that includes disposing of old objects and invoking the GC to clean everything up, all while you patiently wait for it to happen).

Other techniques involve saving the data to disk and then manually swapping objects in and out based on some algorithm when you reach certain levels. That means stopping all your threads (or some, depending on the business case) and moving the data back and forth.

If the creation of your large objects is all controlled from one place, you can also put a facade over their creation, so that the facade can check whether it needs to free objects based on the amount of (virtual) memory your process is using. By the way, use the GetPerformanceInfo API call mentioned in the other answer, as this will include the memory used by unmanaged code, which nonetheless lives inside your process's virtual address space.
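As a rough illustration of the facade idea, the sketch below checks memory only on the slow path, just before a package has to be built, and drops the least requested entry if usage is over a limit. Everything here is an assumption made for the example: the names `PackageFacade`, `GetOrCreate` and `Count`, the string keys, and the injected `usedBytes` probe (in a real process it might read something like `Process.GetCurrentProcess().PrivateMemorySize64`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical facade: all names are illustrative, not from the original post.
public sealed class PackageFacade<TPackage> where TPackage : class
{
    private sealed class Entry { public TPackage Package; public long Hits; }

    private readonly Dictionary<string, Entry> _entries = new Dictionary<string, Entry>();
    private readonly Func<long> _usedBytes;   // memory probe, injected so it can be swapped or tested
    private readonly long _limitBytes;        // configurable threshold
    private readonly object _gate = new object();

    public PackageFacade(Func<long> usedBytes, long limitBytes)
    {
        _usedBytes = usedBytes;
        _limitBytes = limitBytes;
    }

    public int Count { get { lock (_gate) return _entries.Count; } }

    public TPackage GetOrCreate(string key, Func<TPackage> build)
    {
        lock (_gate)
        {
            Entry entry;
            if (_entries.TryGetValue(key, out entry))
            {
                entry.Hits++;             // fast path: already in memory
                return entry.Package;
            }

            // Slow path: check memory just before building a new package.
            if (_usedBytes() > _limitBytes && _entries.Count > 0)
            {
                // Drop the least requested package; the GC reclaims it later.
                string victim = _entries.OrderBy(e => e.Value.Hits).First().Key;
                _entries.Remove(victim);
            }

            entry = new Entry { Package = build(), Hits = 1 };
            _entries[key] = entry;
            return entry.Package;
        }
    }
}
```

Because the probe is just a `Func<long>`, the threshold logic can be exercised without actually exhausting memory, and the probe can be replaced later without touching the callers.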

Don't worry too much about "real" memory, as the operating system will make sure the most appropriate pages are located in memory.

Then there are hundreds of other optimizations that depend entirely on your business scenario. For example, databases "know" how to bring data into memory based on the query, predicting the data you're going to use so that it's ready in advance, and they remove objects that are no longer used. But that's another topic.

Edit: Based on your edits to the question.

Checking memory in the facade will not add a significant overhead in terms of performance.
If you start getting low on memory, you should decide how many objects (or how much space) you are going to free. Don't do it one at a time; take a bunch of them and free enough memory that you don't have to collect again soon.
If you go with the previous approach, you can service the request once you've freed enough space and continue cleaning up in the background.
One of the fastest ways of handling memory / disk swapping is by using memory mapped files.
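One way to read the "free a bunch at once" advice is as a pair of watermarks: do nothing until usage crosses a high mark, then pick enough of the least requested packages to get back down to a low mark. The sketch below decides the whole batch from per-package size estimates instead of re-measuring after each drop, which sidesteps the problem that freed memory does not show up immediately. `PackageInfo`, `BatchEviction`, `SelectVictims` and the size estimates are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of a cached package; EstimatedBytes is a rough, caller-supplied guess.
public sealed class PackageInfo
{
    public string Key;
    public long Hits;
    public long EstimatedBytes;
}

public static class BatchEviction
{
    // Returns the keys to drop so that estimated usage falls from usedBytes to lowWatermark.
    // Returns an empty list while usage is at or below highWatermark (hysteresis).
    public static List<string> SelectVictims(
        IEnumerable<PackageInfo> packages, long usedBytes, long highWatermark, long lowWatermark)
    {
        var victims = new List<string>();
        if (usedBytes <= highWatermark) return victims;

        long toFree = usedBytes - lowWatermark;
        foreach (var p in packages.OrderBy(x => x.Hits))   // least requested first
        {
            if (toFree <= 0) break;
            victims.Add(p.Key);
            toFree -= p.EstimatedBytes;   // trust the estimate instead of re-measuring
        }
        return victims;
    }
}
```

The gap between the two watermarks is the tuning knob: a wider gap means fewer, larger cleanups, at the cost of rebuilding more packages later.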

I added some clarification to the question. Checking memory on creation (in the "facade") seems like a good idea. If I check only virtual memory, will I be at risk of excessive paging? — Rob, Nov 15 '13 at 14:18
Never used memory mapped files, but aren't they only useful if I can serialize my objects? — Rob, Nov 15 '13 at 14:38
No, memory mapped files is a different thing altogether. Serializing is sort of a transformation between class <--> stream. Mapping the file to memory in that case won't help you as after you've mapped the file you'll still have to deserialize it to memory (hence using twice the required memory). I was talking more of using memory mapped files to hold the big data (i.e. the byte array or the long string or whatever you're using). If that's not your case then memory mapped files are not a good idea. — Jorge Córdoba, Nov 15 '13 at 14:42
Just so I understand. If I have a big string or other simple type that I would like to keep in memory for fast access, but may need to push to disk under memory pressure, I can use a memory mapped file and have the CLR/OS take care of it all for me? — Rob, Nov 15 '13 at 14:47
Yes, that's more or less the idea. You have a persistent memory mapped file and, when you're running out of space, you just dispose of it and set the reference to null in your object. Then the GC will take care of getting rid of it when it has the time. When you next access that particular object, if the reference is null, you recreate the memory mapped file from the file on disk (after making sure you have the needed space). — Jorge Córdoba, Nov 15 '13 at 14:57
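The dispose-and-recreate pattern described in that last comment might look roughly like the sketch below. It assumes the data already exists as a file on disk; `MappedBlob`, `ReadByte`, `Release` and `IsMapped` are hypothetical names, and creating a view per read is done only to keep the example short.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical wrapper around a blob that already exists as a file on disk.
public sealed class MappedBlob : IDisposable
{
    private readonly string _path;
    private MemoryMappedFile _map;   // null while the mapping is released

    public MappedBlob(string path) { _path = path; }

    public bool IsMapped { get { return _map != null; } }

    // Reads one byte, recreating the mapping first if it was released.
    public byte ReadByte(long offset)
    {
        if (_map == null)
            _map = MemoryMappedFile.CreateFromFile(_path, FileMode.Open);

        using (var view = _map.CreateViewAccessor(offset, 1, MemoryMappedFileAccess.Read))
            return view.ReadByte(0);
    }

    // Call under memory pressure: dispose of the mapping and drop the reference.
    public void Release()
    {
        if (_map != null) { _map.Dispose(); _map = null; }
    }

    public void Dispose() { Release(); }
}
```

Callers never see whether the mapping is currently live; they pay the cost of remapping only on the first access after a `Release()`.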

Sameer · Answer 2 · 2013-11-15T13:19:05.143

0

Use GC.GetTotalMemory, and if the result exceeds the budget you expect, null out the references to the objects you want to release and call GC.Collect.

edited Nov 15 '13 at 13:19

answered Nov 15 '13 at 13:09


The question isn't "how" to do it, but "when" to do it. And by "function", Rob means an algorithm, not the `GC.Collect` method. – dcastro Nov 15 '13 at 13:13
Still you didn't answer "When to do it"? – Sriram Sakthivel Nov 15 '13 at 13:19

score 0 · Answer 3 · edited May 23 '17 at 11:50

Have a look at the accepted answer to this question. It uses the GetPerformanceInfo Windows API to determine memory consumption of all sorts. Task Manager uses the same information. This should help you write a class that periodically observes memory consumption.
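For reference, calling GetPerformanceInfo from C# requires a P/Invoke declaration. The sketch below is one possible wrapper, not the code from the linked answer: the `SystemMemory` class and `AvailablePhysicalFraction` method are hypothetical names, the struct mirrors the native PERFORMANCE_INFORMATION layout, and it only works on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class SystemMemory
{
    // Mirrors the native PERFORMANCE_INFORMATION struct; SIZE_T fields map to UIntPtr.
    [StructLayout(LayoutKind.Sequential)]
    private struct PerformanceInformation
    {
        public uint cb;
        public UIntPtr CommitTotal, CommitLimit, CommitPeak;
        public UIntPtr PhysicalTotal, PhysicalAvailable, SystemCache;
        public UIntPtr KernelTotal, KernelPaged, KernelNonpaged;
        public UIntPtr PageSize;
        public uint HandleCount, ProcessCount, ThreadCount;
    }

    [DllImport("psapi.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool GetPerformanceInfo(out PerformanceInformation info, uint size);

    // Fraction of physical memory currently available, between 0 and 1 (Windows only).
    public static double AvailablePhysicalFraction()
    {
        PerformanceInformation info;
        uint size = (uint)Marshal.SizeOf(typeof(PerformanceInformation));
        if (!GetPerformanceInfo(out info, size))
            throw new Win32Exception(Marshal.GetLastWin32Error());

        // Both counts are in pages, so the page size cancels out of the ratio.
        return (double)info.PhysicalAvailable.ToUInt64() / info.PhysicalTotal.ToUInt64();
    }
}
```

The same struct also exposes the commit totals (`CommitTotal`, `CommitLimit`), which may be a better signal than physical memory if paging is the main worry.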

Once memory runs low, you can fill a FIFO queue with the objects that are soon to be deleted. The observer will delete the first object in the queue and maybe call GC.Collect manually (I'm not too sure about this). Give the collection some time before you recheck the memory consumption of your application. If there is still not enough free memory, delete the next object from the queue, and so on.
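A minimal sketch of that drop, wait, recheck loop follows. It assumes the caller supplies a `memoryIsLow` check and enqueues one "drop this object" callback per candidate; `MemoryObserver`, `Enqueue` and `RunOnce` are hypothetical names, the explicit `GC.Collect()` is optional, and the class is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical observer: drops one queued item at a time and waits before rechecking.
public sealed class MemoryObserver
{
    private readonly Queue<Action> _dropQueue = new Queue<Action>();  // FIFO of drop callbacks
    private readonly Func<bool> _memoryIsLow;
    private readonly TimeSpan _settleTime;

    public MemoryObserver(Func<bool> memoryIsLow, TimeSpan settleTime)
    {
        _memoryIsLow = memoryIsLow;
        _settleTime = settleTime;
    }

    public void Enqueue(Action drop) { _dropQueue.Enqueue(drop); }

    // One observation pass; returns the number of objects dropped.
    public int RunOnce()
    {
        int dropped = 0;
        while (_memoryIsLow() && _dropQueue.Count > 0)
        {
            _dropQueue.Dequeue()();      // release the oldest candidate first
            dropped++;
            GC.Collect();                // optional: ask for a collection right away
            Thread.Sleep(_settleTime);   // give the collection time before rechecking
        }
        return dropped;
    }
}
```

In a real system `RunOnce` would typically be called from a timer, with `settleTime` tuned so the memory reading has a chance to reflect the previous drop.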
