Got a kind of edge case issue here. I've been tasked with pulling all data from one database to another, where the destination database has a different schema.
I've chosen to write a WinForms utility to do the data mapping and transfer with Entity Framework/ADO.NET when necessary.
This has worked great so far, except for this one specific table that has 2.5 million records. The transfer is about 10 minutes total when I disregard all foreign keys, however when I start mapping foreign keys with FirstOrDefault()
calls against in memory lists of data that have been already moved to the destination database, quite literally 4 days are added to the amount of time that it takes.
I'm going to need to run this tool a lot over the coming days so this isn't really acceptable for me.
Here's my current approach (Not my first approach, this is the result of much trial and error for efficiencies sake):
private OldModelContext _oldModelContext { get; } //instantiated in controller
using (var newModelContext = new NewModelContext())
{
//Takes no time at all to load these into memory, collections are small, 3 - 20 records each
var alreadyMigratedTable1 = newModelContext.alreadyMigratedTable1.ToList();
var alreadyMigratedTable2 = newModelContext.alreadyMigratedTable2.ToList();
var alreadyMigratedTable3 = newModelContext.alreadyMigratedTable3.ToList();
var alreadyMigratedTable4 = newModelContext.alreadyMigratedTable4.ToList();
var alreadyMigratedTable5 = newModelContext.alreadyMigratedTable5.ToList();
var oldDatasetInMemory = _oldModelContext.MasterData.AsNoTracking().ToList();//2.5 Million records, takes about 6 minutes
var table = new DataTable("MasterData");
table.Columns.Add("Column1");
table.Columns.Add("Column2");
table.Columns.Add("Column3");
table.Columns.Add("ForeignKeyColumn1");
table.Columns.Add("ForeignKeyColumn2");
table.Columns.Add("ForeignKeyColumn3");
table.Columns.Add("ForeignKeyColumn4");
table.Columns.Add("ForeignKeyColumn5");
foreach(var masterData in oldDatasetInMemory){
DataRow row = table.NewRow();
//With just these properties mapped, this takes about 2 minutes for all 2.5 Million
row["Column1"] = masterData.Property1;
row["Column2"] = masterData.Property2;
row["Column3"] = masterData.Property3;
//With this mapping, we add about 4 days to the overall process.
row["ForeignKeyColumn1"] = alreadyMigratedTable1.FirstOrDefault(s => s.uniquePropertyOnNewDataset == masterData.uniquePropertyOnOldDataset);
row["ForeignKeyColumn2"] = alreadyMigratedTable2.FirstOrDefault(s => s.uniquePropertyOnNewDataset == masterData.uniquePropertyOnOldDataset);
row["ForeignKeyColumn3"] = alreadyMigratedTable3.FirstOrDefault(s => s.uniquePropertyOnNewDataset == masterData.uniquePropertyOnOldDataset);
row["ForeignKeyColumn4"] = alreadyMigratedTable4.FirstOrDefault(s => s.uniquePropertyOnNewDataset == masterData.uniquePropertyOnOldDataset);
row["ForeignKeyColumn5"] = alreadyMigratedTable5.FirstOrDefault(s => s.uniquePropertyOnNewDataset == masterData.uniquePropertyOnOldDataset);
table.Rows.Add(row);
}
//Save table with SQLBulkCopy is very fast, takes about a minute and a half.
}
}
Note: uniquePropertyOn(New/Old)Dataset
is most often a unique description string shared among the datasets, can't match Ids as they won't be the same across databases.
I have tried:
- Instead of using a foreach, cast with a linq
select
statement, not much improvement was had. - Use
.Where(predicate).FirstOrDefault()
, didn't see any considerable improvement - Running
FirstOrDefault()
against iqueryable instead of lists of migrated data, didn't see any improvement. - Mapping to a List instead of a datatable, but that makes no difference in the mapping speed, and also makes bulk saves slower.
I've been messing around with the idea of turning the foreach
into a parallel foreach loop and locking the calls to the datatable, but I keep running into
Entity Framework connection closed issues
when querying the in memory lists while using the parallel foreach.... not really sure what that's about but initially the speed results were promising.
I'd be happy to post that code/errors if anyone thinks it's the right road to go down, but i'm not sure anymore..