I'm developing a system that, when working, will update a given number of rows in DB with an new ID.
Currently I have ~9 million rows and after selecting a small subset of that data (~10k items) I then want to iterate through each of these items and update one column with a new ID.
What I have currently does work but is too slow to be usable so I need to improve it somehow. Here's the code, this was originally a ForEach loop but I read that 'For' loops on Arrays were quicker so I tried that without much improvement.
public string UpdateWithNewWidgetId(List<string> allTheWidgetIds, int newWidgetId)
{
var repoPerson = _repositoryFactory.CreateRepository<Person>();
try
{
var person = repoPerson.GetAll(x => fieldToUpdate.Any(val => x.WidgetId == val)).ToArray;
for (int i = 0; i < person.Length; i++)
{
person[i].WidgetId= newWidgetId;
repoPerson.Update(person[i]);
}
repoPerson.SaveChanges();
}
catch (Exception ex)
{
//log error
}
So, the repoPerson returns a list of about 10K items pretty quickly but the foreach loop takes ages and then the SaveChanges (this just does a Context.SaveChanges()) takes even longer.
Initially I had the SaveChanges within the foreach and thought that was my issue, but removing it from the loop hasn't improved things.