I have a list of Foo
coming into a process (not entities) and I need to make a 1-1 equivalence on a specific property using Entity Framework Core, having a Conversion
table. The Conversion
table is quite stable, and both properties are indexed.
I am wondering if there is a recommended way of doing it, as a rule of thumb, to get the best performances. I have around 10k Equivalence in the table.
Here are the three ways I am thinking of:
- Query the
Conversion
table for every record, the key property being indexed. But it makes a lot of queries to the database, which is usually not recommended. - Load the whole
Conversion
table in memory anytime I process a batch ofFoo
. But it loads a lot of data, most of them being useless. - Load only the
Conversion
table part I need for a specific batch, but it requires to send a List of properties to check against the table to the database, which I've usually found to be highly inefficient as well. There is extensive discussions about using the Contains operator with EntityFramework, like this one but most of them are quite old.
Is there any way to guess depending on (datatypes / number of records / frequency of batches / db performance / etc. ) or do I have to test extensively?
public class Foo
{
[MaxLength(64)]
public string PreferedAnimal { get; set; }
/// <summary>
/// Not populated at the entry point of the process
/// </summary>
[NotMapped]
public string AssignedPetShop { get; set; }
}
public class Conversion
{
[Key]
public int Id { get; set; }
[MaxLength(64)]
public string PreferedAnimal { get; set; }
[MaxLength(64)]
public string AssignedPetShop { get; set; }
}
public class FooPetShopMatcher
{
private DbContext1 _context;
public FooPetShopMatcher(DbContext1 context)
{
_context = context;
}
public void MatchFoos1(ref List<Foo> foos)
{
foreach (Foo foo in foos)
{
//Sends N query to the database
foo.AssignedPetShop = _context.Conversion.Single(x => x.PreferedAnimal == foo.PreferedAnimal).AssignedPetShop;
}
return;
}
public void MatchFoos2(ref List<Foo> foos)
{
//Sends 1 query to the database but loads a lot of useless data
var dic = _context.Conversion.ToDictionary(x => x.PreferedAnimal, x => x.AssignedPetShop);
foreach (Foo foo in foos)
{
foo.AssignedPetShop = dic[foo.PreferedAnimal];
}
return;
}
public void MatchFoos3(ref List<Foo> foos)
{
var preferedAnimals = foos.Select(x => x.PreferedAnimal).Distinct().ToArray();
//Sends 1 query to the database but send a big list, which seems to be highly inefficient
var dic = _context.Conversion
.Where(x=> preferedAnimals.Contains(x.PreferedAnimal))
.ToDictionary(x => x.PreferedAnimal, x => x.AssignedPetShop);
foreach (Foo foo in foos) {
foo.AssignedPetShop = dic[foo.PreferedAnimal];
}
return;
}
}
EDIT
Following @Maxim Zabolotskikh comment, here are some insights :
- There are 10k
Conversion
items, and once the table is filled, it never changes under normal operations Foos
are coming by bath of 50-100, once every minute- We can assume that
Foos
are random