
I have a .NET 4.0 program developed in Visual Studio 2012 that migrates data from a legacy system into a new system using a dataset for the legacy system data and Entity Framework for the new system. I have been having some serious memory consumption/memory leak issues with the program, particularly with large data sets, so I am using windbg to try to get to the bottom of the issue.

The program works roughly as follows. It reads a large volume of DataRows from multiple legacy Access databases and stores rows of the same type in a List(Of T). Various procedures then iterate through those lists and create new entities that get saved into the new system. After reading all DataRows into a given list using the GetData() method of the TableAdapter, I close the connection to the Access database. After parsing all items in the List(Of T), I clear the list and set it to Nothing. When saving to the new database via Entity Framework, I batch records so that I submit changes for 100 records at a time, and after each submit I dispose, nullify and reinstantiate the EF ObjectContext.

After all records had been parsed and imported, I called GC.Collect(), GC.WaitFullGCCollect() and GC.Collect() again. With Task Manager showing upwards of 700 MB of memory in use by the process, I then took a memory dump through Visual Studio while debugging and loaded that dump into WinDbg.

When I issue the !dumpheap -stat command, I get a long list of object types with their instance counts and total memory consumption. Of note is the bottom of this list, which shows the types using the most memory. Here is a snippet of it:

67ac8034       28      1233489 System.Boolean[]
025b0fd4    24829      1787688 Importer.dsIBET+tblTKNumberRow
67ad3a70     2840     16078424 System.Int32[]
67ac0f78        4     18874416 System.Double[]
67adfcd8       18     19136728 System.Decimal[]
67add67c       33     19657100 System.DateTime[]
101a04cc      508     25441232 System.Data.RBTree`1+Node[[System.Int32, mscorlib]][]
101a0b70      509     25442268 System.Data.RBTree`1+Node[[System.Data.DataRow, System.Data]][]
025b19b4   761228     54808416 Importer.dsIBET+tblLogsDataRow
005244a8      390     85776048      Free
67abfe8c    16421     90050676 System.Object[]
67ad224c  9211552    294563252 System.String

As you can see there are a TON of objects still resident in memory that I would expect to not be present. In particular the tblLogsDataRow objects, and the System.String objects.

Here is where my knowledge of what to do next starts to get a little hazy. I have run !dumpheap -mt on some of these types, for example System.String, then picked one of the more than 9 million string instances at random and run !gcroot on it. From that I can tell that the strings seem to be related to my entities. There should be none of those left, since at this point in the call stack I have disposed of and nullified my object context. So I am unclear why all of these strings are resident in memory, or what exactly is holding them.

As far as the datarow objects, I also don't understand how there are any of these present because I store them in a list, then later clear that list and nullify it.

Clearly I'm not doing something right, but I'm a little lost as to how to figure that out.

So my question is two part:

1) Can you provide any pointers on how to use windbg to get any further information that might help me diagnose the source of the memory leaks?

2) Can you provide any insights into what I'm doing wrong generally as far as the dataset and entity framework go? I.e. something about the way I am storing the datarows in lists and then clearing those lists is not freeing the memory used by those lists, and disposing of my objectcontext does not appear to be releasing any of the resources that it utilizes.

Thanks, Josh


1 Answer


Task Manager

Measuring memory with Task Manager is a mistake. By default, Task Manager shows the private working set, which counts only the memory currently in RAM and ignores the parts that have been swapped to disk. What Task Manager displays therefore depends heavily on how much memory other applications request, and it has no column that would indicate a memory leak.

Use Process Explorer and let it show the "Private Bytes" column. But anyway, you were lucky and found out that your application used too much memory.

Garbage Collection

As you may know, .NET uses garbage collection. When you release objects, they are not collected immediately; they merely become eligible for garbage collection. The garbage collector still needs to run in order to collect them, and that may happen at any later time.

You might try calling GC.Collect() to force a garbage collection. Your case might be one of the rare scenarios where GC.Collect() is useful and acceptable: you know that you have allocated a lot of memory, you have removed the references, and you want the memory freed now, e.g. before performing another such task.

Be aware of the consequences, though. Make sure you have read and understood the great "Proper use of IDisposable" answer.
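The collect, wait, collect sequence discussed above might look like the following minimal sketch. It assumes the wait call meant in the question is GC.WaitForPendingFinalizers() (the question's "GC.WaitFullGCCollect" is not a method I know of), and the list of strings is only a hypothetical stand-in for the lists of DataRows.

```vbnet
Imports System
Imports System.Collections.Generic

Module GcDemo
    Sub Main()
        ' Hypothetical stand-in for the lists of DataRows described in the question.
        Dim rows As New List(Of String)
        For i As Integer = 1 To 1000000
            rows.Add("row " & i.ToString())
        Next

        Console.WriteLine("Before release: {0:N0} bytes", GC.GetTotalMemory(False))

        ' Drop the only reference so the list and its strings become eligible for collection.
        rows.Clear()
        rows = Nothing

        ' Assumption: WaitForPendingFinalizers is the wait call intended in the question.
        ' Collect, let finalizers run, then collect whatever the finalizers released.
        GC.Collect()
        GC.WaitForPendingFinalizers()
        GC.Collect()

        Console.WriteLine("After release:  {0:N0} bytes", GC.GetTotalMemory(True))
    End Sub
End Module
```

If the second number does not drop in the real program, the objects are most likely still referenced from somewhere, which is what !gcroot can help confirm.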

!GCRoot

!gcroot is probably correct. The strings may belong to an entity. Don't worry about that: the garbage collector can collect not only single objects but also whole islands of objects that reference each other yet are unreachable from any root. So once the entity objects get collected, the strings will disappear as well.

The more important question is: are the Entity objects rooted? If so, they won't be garbage collected. You may have disposed them, but if there's still a reference which you forgot, they are still alive.
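As a rough illustration of the point about rooted objects, the sketch below shows that calling Dispose does not by itself make an object collectable while some forgotten reference still holds it. FakeEntity and the Forgotten list are hypothetical names invented for this example; they are not taken from the question's code or from Entity Framework.

```vbnet
Imports System
Imports System.Collections.Generic

Module RootedDemo
    ' Hypothetical entity type; IDisposable only to show that Dispose does not free memory.
    Class FakeEntity
        Implements IDisposable
        Public Payload As String = New String("x"c, 1000)

        Public Sub Dispose() Implements IDisposable.Dispose
            ' Nothing to release here; the object stays alive while it is referenced.
        End Sub
    End Class

    ' A forgotten module-level collection keeps everything it contains reachable.
    Private ReadOnly Forgotten As New List(Of FakeEntity)

    Function CreateAndDispose(keep As Boolean) As WeakReference
        Dim e As New FakeEntity()
        If keep Then Forgotten.Add(e)
        e.Dispose()
        ' A weak reference lets us observe the object without keeping it alive.
        Return New WeakReference(e)
    End Function

    Sub Main()
        Dim rooted As WeakReference = CreateAndDispose(True)
        Dim unrooted As WeakReference = CreateAndDispose(False)

        GC.Collect()
        GC.WaitForPendingFinalizers()
        GC.Collect()

        Console.WriteLine("Rooted entity alive:   {0}", rooted.IsAlive)
        Console.WriteLine("Unrooted entity alive: {0}", unrooted.IsAlive)
    End Sub
End Module
```

The entity held by the list stays alive for as long as the list references it, while the other one would typically be reported as collected. In a dump, !gcroot on the surviving object should show the path from the root (here the module-level list) down to it.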

WinDbg

At the moment you only check if the Entity objects are still rooted (!gcroot). Other than that I don't see much in WinDbg that would help.

Thomas Weller
  • Thank you for the information. Unfortunately, I left out a key piece of information: as an attempt to force memory to be freed, after all of the data migration has completed, I dispose of and nullify everything and then execute GC.Collect, GC.WaitFullGCCollect and GC.Collect again, and it was after all of that that I took the memory dump. – Josh Feb 12 '15 at 18:43
  • As far as the entities being rooted, I suppose that is the case, but I don't know why or how they are rooted. I create new entities, add them to the ObjectContext, and submit changes, and then I dispose of the ObjectContext and nullify it before repeating that process again. I don't add any error handlers or anything else that I know of that would cause the entities to be rooted. Any thoughts on that? – Josh Feb 12 '15 at 18:46
  • @Josh: please add the output of `!gcroot` of an entity object to your question. – Thomas Weller Feb 13 '15 at 07:17