3

Wee load large object graph from DB. The query has many Includes and Where()uses Contains() to filter the final result. Contains is called for the collection containing about thousand entries.

The profiler shows monstrous human-unreadable SQL. The query cannot be precompiled because of Contains().

Is there any ways for optimization of such queries?

Update

public List<Vulner> GetVulnersBySecurityObjectIds(int[] softwareIds, int[] productIds)
        {
            var sw = new Stopwatch();

            var query = from vulner in _businessModel.DataModel.VulnerSet
                        join vt in _businessModel.DataModel.ObjectVulnerTieSet.Where(ovt => softwareIds.Contains(ovt.SecurityObjectId))
                        on vulner.Id equals vt.VulnerId
                        select vulner;

            var result = ((ObjectQuery<Vulner>)query.OrderBy(v => v.Id).Distinct())
                .Include("Descriptions")
                .Include("Data")
                .Include("VulnerStatuses")
                .Include("GlobalIdentifiers")
                .Include("ObjectVulnerTies")
                .Include("Object.ProductObjectTies.Product")
                .Include("VulnerComment");

            //Если переданы конкретные продукты, добавляем фильтрацию
            if (productIds.HasValues())
                result = (ObjectQuery<Vulner>)result.Where(v => v.Object.ProductObjectTies.Any(p => productIds.Contains(p.ProductId)));

            sw.Start();
            var str = result.ToTraceString();
            sw.Stop();
            Debug.WriteLine("Сборка запроса заняла {0} секунд.", sw.Elapsed.TotalSeconds);
            sw.Restart();
            var list = result.ToList();
            sw.Stop();
            Debug.WriteLine("Получение уязвимостей заняло {0} секунд.", sw.Elapsed.TotalSeconds);

            return list;
        }
Pavel Voronin
  • 13,503
  • 7
  • 71
  • 137
  • Perhaps, but not without seeing an example of what you are trying to do... – PinnyM Nov 01 '12 at 20:01
  • Tomorrow only. Code is left at the office =) – Pavel Voronin Nov 01 '12 at 20:06
  • I think contains uses `like '%%'` which should be able to be tuned by fulltext indexing the column in sql – undefined Nov 02 '12 at 00:15
  • 1
    A solution to improve the performance of `Contains` is to split the `softwareIds` collection into smaller chunks and perform multiple queries: http://stackoverflow.com/a/7936350/270591 – Slauma Nov 02 '12 at 21:32

1 Answers1

2

It's almost certain that splitting the query in pieces performs better, in spite of more db round trips. It is always advised to limit the number of includes, because they not only blow up the size and complexity of the query (as you noticed) but also blow up the result set both in length and in width. Moreover, they often get translated into outer joins.

Apart from that, using Contains the way you do is OK.

Sorry, it is hard to be more specific without knowing your data model and the size of the tables involved.

Gert Arnold
  • 105,341
  • 31
  • 202
  • 291
  • 2
    Good article concerning EF performance http://msdn.microsoft.com/en-us/data/hh949853.aspx – Pavel Voronin Nov 03 '12 at 17:42
  • 1
    Great artice! Significant quotes concerning our question: "Once you include three or more entities into your query, consider switching to Lazy loading." And "It takes a relatively long time for a query with multiple Include statements in it to go through our internal plan compiler". And "Queries like this will bring in large connected graphs from your database in a single payload, which will acerbate any bandwidth issues". – Gert Arnold Nov 03 '12 at 17:53