2

I'm using EF5 to migrate some data from one database to another. I would generally use SQL for something like this however I need other functionality (like creating users in the MembershipProvider) and was hoping to do it all in EF. I'm migrating about 100k rows and using this to do so:

        using (var connection = new SqlConnection(connectionString))
        { 
            using(var command = new SqlCommand(commandText, connection))
            {
                connection.Open();
                using (SqlDataReader reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        var employer = new Employer();

                        employer.EAN = reader["EAN"].ToString();

                        employer.Name = GetString(reader["EmpName"]);

                        employer.TaxMailingAddress = new Address
                        {
                            StreetAddress = GetString(reader["Street"]),
                            City = GetString(reader["City"]),
                            State = GetString(reader["USState"]),
                            ZipCode = GetString(reader["Zip"])
                        };

                        employer.TaxMailingAddress.SaveOrUpdate();
                        employer.SaveOrUpdate(); // This is where the timeout happens

                        string dba = GetString(reader["DBA"]);
                        if (!string.IsNullOrWhiteSpace(dba))
                        {
                            employer.AddDBA(new EmployerDba 
                            {
                                Name = dba
                            });
                        }

                        string email = GetString(reader["Email"]);
                        if (!string.IsNullOrWhiteSpace(email))
                        {
                            var user = CreateNewUser(email);

                            if (user != null)
                            {
                                user.AddAuthorizedEmployer(employer);

                                user.AddRole(EmployerRole.Admin, employer, true);
                            }
                        }
                    }
                }
            }
        }

My SaveOrUpdate method is pretty straight forward:

    public void SaveOrUpdate()
    {
        using (var db = new MyContext())
        {
            if (db.Employers.FirstOrDefault(x => x.EAN == EAN && x.Id != Id) != null)
                throw new Exception("An employer with this EAN has already been registered.");

            var employer = new Employer();

            if (Id == 0)
            {
                db.Employers.Add(employer);

                employer.CreatedBy = Statics.GetCurrentUserName();
                employer.DateCreated = DateTime.Now;
            }
            else
            {
                employer = db.Employers.FirstOrDefault(x => x.Id == Id);

                employer.ModifiedBy = Statics.GetCurrentUserName();
                employer.DateModified = DateTime.Now;
            }

            employer.EAN = EAN;
            employer.Name = Name;
            if (TaxMailingAddress != null) employer.TaxMailingAddress = db.Addresses.FirstOrDefault(x => x.Id == TaxMailingAddress.Id);
            if (SingleSeparationStatementAddress != null) employer.SingleSeparationStatementAddress = db.Addresses
                                    .FirstOrDefault(x => x.Id == SingleSeparationStatementAddress.Id);

            db.SaveChanges();

            Id = employer.Id;
        }
    }

The task should take about 2.5 hours to complete. However, after running many thousands of rows, sometime 80k, sometimes as few as 7k, I get this "The wait operation timed out" exception, always on the employer.SaveOrUpdate();. Could it have anything to do with how close it is to the employer.TaxMailingAddress.SaveOrUpdate();? Is there a "wait for the transaction to complete" deal? Perhaps make sure the connection is valid and if not try recreating it or something? Thanks for any help.

Serj Sagan
  • 28,927
  • 17
  • 154
  • 183
  • 1
    By default there's a 30/60 second timeout for any commands run against the DB. You can override this setting in your DbContext. See: http://stackoverflow.com/questions/10549640/how-to-set-commandtimeout-for-dbcontext. That said, doing this kind of thing in pure EF isn't the greatest idea. – Kittoes0124 Oct 07 '13 at 16:09
  • @Kittoes +1 EF adds a HUGE overhead to your process. Every piece of data being sent through is going to use up memory as it is converted into an object. – Aron Oct 07 '13 at 17:37
  • Aye, even though EF is one of my favorite tools ever created it still has a time and place. There are times where I load data like this but they are always one off single use scenarios. Processes like the above should NEVER make it into production. – Kittoes0124 Oct 07 '13 at 21:15
  • Right, this would be a single use data import deal...but I can't even get the single use to complete successfully – Serj Sagan Oct 07 '13 at 21:51

1 Answers1

3

The problem ended up being that my initial database connection that was being used to connect to the other database, from which I was getting the data to import, was timing out. So I ended up looping through all of the reader.Read() statements and putting that into an array. I then looped through that array to actually process and save the data into my new database.

Serj Sagan
  • 28,927
  • 17
  • 154
  • 183