1

I have a huge list of strings and I want to compare a database table's records with it. What is the best solution?

You can suppose specified table's name is ATable and its structure is like:

public class ATable
{
    [Key]
    public long Id{get;set;}
    [Key]
    public long Name{get;set;}
}

I wrote the following code

using(var context = new MyDbContext())
{
    context.Log = (log) => {Console.WriteLine(log)};
    var result = context.ATables.Where(item => hugeList.Contains(item.Name)).ToList();
}

I checked generated logs and I saw that above code translated to SQL IN(...) statement and because of hugeness of list application crash.

I'm sure there is a good way to solve this problem, then you professionals can show me right one.

Thanks

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Saeed Afshari
  • 939
  • 1
  • 9
  • 17

1 Answers1

2

Since EF 6 Alpha 1, EF includes an improvement that accelerates the translation of Enumerable.Contains.

If you are using an earlier version of EF or the size of your list is too big, as propose @Dan-o you can break the huge list in smaller chunks. To do this you can use the solution that @divega wrote in this post. Below I adapt that solution to your problem context:

public partial class MyDbContext
{
    public IEnumerable<ATable> GetElements(IEnumerable<long> hugeList, int chunkSize = 100)
    {
        foreach (var chunk in hugeList.Chunk(chunkSize))
        {
            var q = ATables.Where(a => chunk.Contains(a.Name));
            foreach (var item in q)
            {
                yield return item;
            }
        }
    }
}

Extension methods for slicing enumerable sequences:

public static class EnumerableSlicing
{

    private class Status
    {
        public bool EndOfSequence;
    }

    private static IEnumerable<T> TakeOnEnumerator<T>(IEnumerator<T> enumerator, int count,
        Status status)
    {
        while (--count > 0 && (enumerator.MoveNext() || !(status.EndOfSequence = true)))
        {
            yield return enumerator.Current;
        }
    }

    public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> items, int chunkSize)
    {
        if (chunkSize < 1)
        {
            throw new ArgumentException("Chunks should not be smaller than 1 element");
        }
        var status = new Status { EndOfSequence = false };
        using (var enumerator = items.GetEnumerator())
        {
            while (!status.EndOfSequence)
            {
                yield return TakeOnEnumerator(enumerator, chunkSize, status);
            }
        }
    }
}

Then you can do something like this:

var result= context.GetElements(hugelist).ToList();
Community
  • 1
  • 1
ocuenca
  • 38,548
  • 11
  • 89
  • 102