
I have a hierarchical structure with millions of records. I'm doing a recursive scan on the DB in order to update some of the connections and some of the data. The problem is that I get an OutOfMemoryException, since the entire DB is eventually loaded into the context (lazy loading). Data that I no longer need stays in the context without any way of removing it. I also can't use using(context...), since I need the context alive because I'm doing a recursive scan.

Please take the recursion as a fact. Thanks

Avi Harush

2 Answers


This sort of operation is really not handled well, nor does it scale well, using entities. I tend to resort to stored procedures for batch ops.

If you do want to remove/dump objects from context, I believe this post has some info (solution to your problem at the bottom).
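For the stored-procedure/batch route, a minimal sketch of a set-based update run directly on the server (the Nodes table, its columns, and the ReparentChildren helper are hypothetical; the executor delegate stands in for EF 4's context.ExecuteStoreCommand so the sketch compiles without an EF reference):

```csharp
using System;

public static class BatchOps
{
    // Parameterized command text in ExecuteStoreCommand's {0}-style
    // placeholder syntax; the table/column names are made up for
    // illustration — substitute your own schema or a stored procedure.
    public const string ReparentSql =
        "UPDATE Nodes SET ParentId = {0} WHERE ParentId = {1}";

    // `execute` stands in for context.ExecuteStoreCommand (string, object[])
    // so no entities are ever materialized or tracked by the context.
    public static int ReparentChildren(Func<string, object[], int> execute,
                                       int oldParentId, int newParentId)
    {
        return execute(ReparentSql, new object[] { newParentId, oldParentId });
    }
}
```

In practice you would pass context.ExecuteStoreCommand (or call a stored procedure through a function import); either way the update runs on the server and nothing enters the state manager.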

Gats
  • Thanks, I'm familiar with that solution, but it's too slow for my purpose. I've implemented a "bypass" DAL for my program. Thanks – Avi Harush Mar 27 '11 at 13:12
  • Me too. Looking forward to what Microsoft comes up with for this sort of thing, as I see it as a gaping hole in the EF functionality. Even deleting a bunch of records in a short-term storage DB table is a relatively common scenario. – Gats Mar 27 '11 at 13:48

I just ran into the same problem. I used NHibernate as an ORM tool before I used EF, and it had exactly the same problem. These frameworks just keep the objects in memory as long as the context is alive, which has two consequences:

  • a serious performance slowdown: the framework does comparisons between the objects in memory (e.g. to see if an object already exists or not). You will notice a gradual degradation of performance when processing many records

  • you will eventually run out of memory.

If possible I always try to do large batch operations on the database using pure SQL (as the post above states clearly), but in this case that wasn't an option. So, to solve this: NHibernate has a 'Clear' method on the session, which throws away all objects in memory that refer to database records (new ones, added ones, corrupt ones...). I tried to mimic this method in Entity Framework as follows (using the post described above):

    using System.Collections.Generic;  // IEnumerable<T>
    using System.Data;                 // EntityState
    using System.Data.Objects;         // ObjectStateEntry

    public partial class MyEntities
    {
        public IEnumerable<ObjectStateEntry> GetAllObjectStateEntries()
        {
            // Combine the state flags so every tracked entry is returned at once.
            return ObjectStateManager.GetObjectStateEntries(EntityState.Added |
                                                        EntityState.Deleted |
                                                        EntityState.Modified |
                                                        EntityState.Unchanged);
        }

        public void ClearEntities()
        {
            foreach (var objectStateEntry in GetAllObjectStateEntries())
            {
                // Relationship entries have no entity instance to detach; skip them.
                if (objectStateEntry.Entity != null)
                {
                    Detach(objectStateEntry.Entity);
                }
            }
        }
    }

The GetAllObjectStateEntries() method is kept separate because it's useful for other things. This goes into a partial class with the same name as your entities class (the one EF generates, MyEntities in this example), so it is available on your entities instance.

I now call this Clear method every 1,000 records I process, and my application that used to run for about 70 minutes (with only about 400k entities to process, not even millions) now does it in 25 minutes. Memory used to peak at 300 MB; now it stays around 50 MB.
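The every-1,000-records pattern can be sketched as a small generic loop. This is only an illustration: ProcessInBatches and the process delegate are hypothetical names, and clearEntities stands in for the ClearEntities() method shown above:

```csharp
using System;
using System.Collections.Generic;

public static class BatchProcessor
{
    // Calls `clearEntities` every `batchSize` items so the context's
    // state manager never holds more than one batch of tracked objects.
    public static int ProcessInBatches<T>(IEnumerable<T> items,
                                          Action<T> process,
                                          Action clearEntities,
                                          int batchSize = 1000)
    {
        int count = 0;
        foreach (var item in items)
        {
            process(item); // per-record work against the live context
            if (++count % batchSize == 0)
            {
                clearEntities(); // detach everything accumulated so far
            }
        }
        return count;
    }
}
```

With an EF context you would pass context.ClearEntities for the clearEntities argument; remember to call SaveChanges() before clearing if the batch made changes you want to keep, since detached entities are no longer persisted.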

pvolders