I'm trying to import around 900K rows of data and map it to my new data model. My problem is that the console application I have build for this import functionality slows down over time.
I have monitored the SQL queries and they all perform well (<5ms). I tried to import smaller chunks, fx 1K rows, at a time. The stopwatch timings looks like this:
- Count: 100 | Avg ms: 36
- Count: 200 | Avg ms: 67
- Count: 300 | Avg ms: 106
- Count: 400 | Avg ms: 145
- Count: 500 | Avg ms: 183
- Count: 600 | Avg ms: 222
- Count: 700 | Avg ms: 258
- Count: 800 | Avg ms: 299
- Count: 900 | Avg ms: 344
- Count: 1000 | Avg ms: 376
When restarting the application with a new chunk of 1K rows, the timings are similar.
The import data has the following format:
public class ImportData
{
public int Id { get; set; }
public int TaxpayerId { get; set; }
public string CustomerName { get; set; }
public string Email { get; set; }
public string PhoneNumber { get;set; }
}
A simplified example of my data model looks like this:
public class Channel
{
public int Id { get; set; }
public string Name { get; set; }
}
public class Permission
{
public Guid Id { get; set; }
public Channel Channel { get; set; }
public string Recipient { get; set; }
}
public class Taxpayer
{
public Guid Id { get; set; }
public int TaxpayerId { get; set; }
public string Name { get; set; }
public List<Permission> Permissions { get; set; }
}
My import method looks like this:
public void Import()
{
Stopwatch stopwatch = new Stopwatch();
//Get import data
List<ImportData> importDataList = _dal.GetImportData();
stopwatch.Start();
for (int i = 0; i < importDataList.Count; i++)
{
ImportData importData = importDataList[i];
Taxpayer taxpayer = new Taxpayer()
{
Name = importData.CustomerName,
TaxpayerId = importData.TaxpayerId,
Permissions = new List<Permission>()
};
//Does not call SaveChanges on the context
CreateTaxpayer(taxpayer, false);
//Create permissions
if (!string.IsNullOrWhiteSpace(importData.Email))
{
//Does not call SaveChanges on the context
CreatePermission(_channelIdEmail, importData.Email, taxpayer, PermissionLogType.PermissionRequestAccepted);
}
if (!string.IsNullOrWhiteSpace(importData.PhoneNumber))
{
//Does not call SaveChanges on the context
CreatePermission(_channelIdPhoneCall, importData.PhoneNumber, taxpayer, PermissionLogType.PermissionRequestAccepted);
//Does not call SaveChanges on the context
CreatePermission(_channelIdSms, importData.PhoneNumber, taxpayer, PermissionLogType.PermissionRequestAccepted);
}
if ((i + 1) % 100 == 0)
{
Console.WriteLine("Count: " + (i + 1) + " | Avg ms: " + stopwatch.ElapsedMilliseconds / 100);
stopwatch.Restart();
}
}
_dal.SaveChanges();
}
I tried the following:
- reduce the number of calls to SaveChanges (only called at once at the end)
- implement multithreading (without luck) - it does not seem to go hand in hand with entity framework
I am running out of ideas here. Do you guys have any suggestions to solve this performance issue?