1

I have the following entities (I'll show the properties I'm working with because I don't want to make it larger than needed):

PROPERTY: Where a property can be child of another one and has a 1-1 relationship with GeoLocation and can have multiple Multimedia and Operation

public partial class Property
{
    public Property()
    {
        InverseParent = new HashSet<Property>();
        Multimedia = new HashSet<Multimedia>();
        Operation = new HashSet<Operation>();
    }

    public long Id { get; set; }
    public string GeneratedTitle { get; set; }
    public string Url { get; set; }
    public DateTime? DatePublished { get; set; }
    public byte StatusCode { get; set; }
    public byte Domain { get; set; }
    public long? ParentId { get; set; }

    public virtual Property Parent { get; set; }
    public virtual GeoLocation GeoLocation { get; set; }
    public virtual ICollection<Property> InverseParent { get; set; }
    public virtual ICollection<Multimedia> Multimedia { get; set; }
    public virtual ICollection<Operation> Operation { get; set; }
}

GEOLOCATION: As mentioned, it has a 1-1 relationship with Property

public partial class GeoLocation
{
    public int Id { get; set; }
    public double? Latitude { get; set; }
    public double? Longitude { get; set; }
    public long? PropertyId { get; set; }

    public virtual Property Property { get; set; }
}

MULTIMEDIA: it can hold multiple Images, with different sizes, for a single Property. The detail here is that Order specifies the order of the images to be shown in the client application, but it doesn't start always with 1. There're some cases where a Property has Multimedia files that starts with 3 or x.

public partial class Multimedia
{
    public long Id { get; set; }
    public long? Order { get; set; }
    public string Resize360x266 { get; set; }
    public long? PropertyId { get; set; }

    public virtual Property Property { get; set; }
}

OPERATIONS: defines all the operations a Property can have, using OperationType to name this operation. (rent, sell, etc.)

public partial class Operation
{
    public Operation()
    {
        Price = new HashSet<Price>();
    }

    public long Id { get; set; }
    public long? OperationTypeId { get; set; }
    public long? PropertyId { get; set; }

    public virtual OperationType OperationType { get; set; }
    public virtual Property Property { get; set; }
    public virtual ICollection<Price> Price { get; set; }
}

public partial class OperationType
{
    public OperationType()
    {
        Operation = new HashSet<Operation>();
    }

    public long Id { get; set; }
    public string Name { get; set; }

    public virtual ICollection<Operation> Operation { get; set; }
}

PRICE: defines the price for each operation and the currency type. (i.e.: A property can have the rent option - Operation - for X amount in USD currency, but another price registered for the same Operation in case of use another currency type )

public partial class Price
{
    public long Id { get; set; }
    public float? Amount { get; set; }
    public string CurrencyCode { get; set; }
    public long? OperationId { get; set; }

    public virtual Operation Operation { get; set; }
}

Said that, I want to get all the records (actually are about 40K-50K), but only for a few properties. As I mentioned before, the Multimedia table can have a lot of records for each Property, but I only need the first one with the smaller Order value and sorted by DatePublished. After that, I need to convert the result into MapMarker object, which is as follows:

public class MapMarker : EstateBase
{
    public long Price { get; set; }
    public int Category { get; set; }
    public List<Tuple<string, string, string>> Prices { get; set; }
}

In order to achieve this, I made the following:

public async Task<IEnumerable<MapMarker>> GetGeolocatedPropertiesAsync(int quantity)
{
    var properties = await GetAllProperties().AsNoTracking()
        .Include(g => g.GeoLocation)
        .Include(m => m.Multimedia)
        .Include(p => p.Operation).ThenInclude(o => o.Price)
        .Include(p => p.Operation).ThenInclude(o => o.OperationType)
        .Where(p => p.GeoLocation != null 
            && !string.IsNullOrEmpty(p.GeoLocation.Address) 
            && p.GeoLocation.Longitude != null 
            && p.GeoLocation.Latitude != null 
            && p.StatusCode == (byte)StatusCode.Online 
            && p.Operation.Count > 0)
        .OrderByDescending(p => p.ModificationDate)
        .Take(quantity)
        .Select(p => new {
            p.Id,
            p.Url,
            p.GeneratedTitle,
            p.GeoLocation.Address,
            p.GeoLocation.Latitude,
            p.GeoLocation.Longitude,
            p.Domain,
            p.Operation,
            p.Multimedia.OrderBy(m => m.Order).FirstOrDefault().Resize360x266
        })
        .ToListAsync();

    var mapMarkers = new List<MapMarker>();

    try
    {
        foreach (var property in properties)
        {
            var mapMarker = new MapMarker();
            mapMarker.Id = property.Id.ToString();
            mapMarker.Url = property.Url;
            mapMarker.Title = property.GeneratedTitle ?? string.Empty;
            mapMarker.Address = property.Address ?? string.Empty;
            mapMarker.Latitude = property.Latitude.ToString() ?? string.Empty;
            mapMarker.Longitude = property.Longitude.ToString() ?? string.Empty;
            mapMarker.Domain = ((Domain)Enum.ToObject(typeof(Domain), property.Domain)).ToString();
            mapMarker.Image = property.Resize360x266 ?? string.Empty;
            mapMarker.Prices = new List<Tuple<string, string, string>>();
            foreach (var operation in property.Operation)
            {
                foreach (var price in operation.Price)
                {
                    var singlePrice = new Tuple<string, string, string>(operation.OperationType.Name, price.CurrencyCode, price.Amount.ToString());
                    mapMarker.Prices.Add(singlePrice);
                }
            }
            mapMarkers.Add(mapMarker);
        }
    }
    catch (Exception ex)
    {

        throw;
    }

    return mapMarkers;
}

but the results take more than 14 minutes and this method could be called multiple times in a minute. I want to optimize it to return the results in the less time possible. I alreay tried removing ToListAsync(), but in the foreach loop it takes a lot of time too, and that makes all the sense.

So, what do you think I can do here? Thanks in advance.

UPDATE: Here is GetAllProperties() method, I forgot to include this one.

private IQueryable<Property> GetAllProperties()
{
    return _dbContext.Property.AsQueryable();
}

And the SQL query that Entity Framework is making against SQL Server:

SELECT [p].[Id], [p].[Url], [p].[GeneratedTitle], [g].[Address], [g].[Latitude], [g].[Longitude], [p].[Domain], (
    SELECT TOP(1) [m].[Resize360x266]
    FROM [Multimedia] AS [m]
    WHERE [p].[Id] = [m].[PropertyId]
    ORDER BY [m].[Order]), [t].[Id], [t].[CreationDate], [t].[ModificationDate], [t].[OperationTypeId], [t].[PropertyId], [t].[Id0], [t].[CreationDate0], [t].[ModificationDate0], [t].[Name], [t].[Id1], [t].[Amount], [t].[CreationDate1], [t].[CurrencyCode], [t].[ModificationDate1], [t].[OperationId]
FROM [Property] AS [p]
LEFT JOIN [GeoLocation] AS [g] ON [p].[Id] = [g].[PropertyId]
LEFT JOIN (
    SELECT [o].[Id], [o].[CreationDate], [o].[ModificationDate], [o].[OperationTypeId], [o].[PropertyId], [o0].[Id] AS [Id0], [o0].[CreationDate] AS [CreationDate0], [o0].[ModificationDate] AS [ModificationDate0], [o0].[Name], [p0].[Id] AS [Id1], [p0].[Amount], [p0].[CreationDate] AS [CreationDate1], [p0].[CurrencyCode], [p0].[ModificationDate] AS [ModificationDate1], [p0].[OperationId]
    FROM [Operation] AS [o]
    LEFT JOIN [OperationType] AS [o0] ON [o].[OperationTypeId] = [o0].[Id]
    LEFT JOIN [Price] AS [p0] ON [o].[Id] = [p0].[OperationId]
) AS [t] ON [p].[Id] = [t].[PropertyId]
WHERE (((([g].[Id] IS NOT NULL AND ([g].[Address] IS NOT NULL AND (([g].[Address] <> N'') OR [g].[Address] IS NULL))) AND [g].[Longitude] IS NOT NULL) AND [g].[Latitude] IS NOT NULL) AND ([p].[StatusCode] = CAST(1 AS tinyint))) AND ((
    SELECT COUNT(*)
    FROM [Operation] AS [o1]
    WHERE [p].[Id] = [o1].[PropertyId]) > 0)
ORDER BY [p].[ModificationDate] DESC, [p].[Id], [t].[Id], [t].[Id1]

UPDATE 2: As @Igor mentioned, this is the link of the Execution Plan Result: https://www.brentozar.com/pastetheplan/?id=BJNz9KdQI

  • What is `GetAllProperties`()? – MSpiller Feb 17 '20 at 20:36
  • @M.Spiller I just included that one at the end. I forgot that. – Adolfo F. Ibarra Landeo Feb 17 '20 at 20:38
  • 1
    Is it the database query/queries or is it the code? Look at what Sql Server is doing. There are many ways to do this but one easy one is using Sql Profiler. Analyze the resulting query and it's query plan. You can also analyze the .net performance using various tools (some built into VS or a 3rd party tool). – Igor Feb 17 '20 at 20:44
  • @Igor I just added the SQL consult that EF makes at the end of the question. – Adolfo F. Ibarra Landeo Feb 17 '20 at 20:49
  • So it's the query. Now get the plan and analyze that. [Include the actual Execution Plan](https://stackoverflow.com/a/7359705/1260204), you could use [Paste the Plan](https://www.brentozar.com/pastetheplan/) and share the link in your question. Also [try to read it yourself](https://stackoverflow.com/a/759097/1260204), maybe you can figure out the performance issue(s) with your query. – Igor Feb 17 '20 at 20:53
  • And you can remove all the `.Include` clauses. They don't do anything if you project a custom shape in your `.Select()` – David Browne - Microsoft Feb 17 '20 at 21:02
  • try to "play" with query using the SQL Management Studio by removing certain parts of query. For example, try to delete the subquery `SELECT TOP(1) [m].[Resize360x266] FROM [Multimedia] ` (and below). I suppose that images can weight a lot and it takes additional time just to download the image (if you have results shown in grid). Also I'd like to mention that getting 40K of records multiple times per minute will not be too fast in any case. Just to download all records and image will take some time for sure. Can you add some filtering (like `top 100` or with `where`) ? – oleksa Feb 17 '20 at 21:05
  • @oleksa, I'm not downloading the images, just getting the Url in string format. Removing the multimedia part you mentioned works faster, but I need that property, so it's not an option. – Adolfo F. Ibarra Landeo Feb 17 '20 at 21:09
  • I can't stress this enough: Get the [actual Execution Plan](https://stackoverflow.com/a/7359705/1260204) for your query. Study it. That is where you will see what is actually happening and what you can do to make it faster. – Igor Feb 17 '20 at 21:10
  • @DavidBrowne-Microsoft thanks, I already did that. I'll include the results asked for Igor to see if that helps. Also, I already updated the SQL statement that EF makes to SQL Server – Adolfo F. Ibarra Landeo Feb 17 '20 at 21:10
  • 1
    @Igor, I'm on it right now. I'll publish the results when finished. – Adolfo F. Ibarra Landeo Feb 17 '20 at 21:11
  • the `Property` class has primary index defined as `public long Id` in the source code. But `Geolocation` has `public int Id`. Do you have the same in the databse? I've seen performance degradation in such cases. SQL does not use the index and does fullscans instead if joined columns have different types. Please check that `Geolocation` data table has the same `Id` data type as `Property` – oleksa Feb 17 '20 at 21:24
  • You should / could use the SQL Server Profiler to sniff the executed query. I had some issues with varchar columns, because the entity framework casts all strings to nvarchar so the SQL server had to cast all the values back to varchar. Have a look at the executed query and analyse it to get further information of the query itself and the used execution plan. – Sebastian S. Feb 17 '20 at 21:38
  • @Igor I just updated the question with the plan asked before. I'm trying to figure out what's could be happening here. – Adolfo F. Ibarra Landeo Feb 17 '20 at 21:51
  • Did Sql Server have any index hints in the plan (green and at the top)? – Igor Feb 17 '20 at 21:52
  • Off the bat I see 1 key lookup that is 25% of the cost (bottom). You can change the index being used to include the other columns, that will get rid of that. – Igor Feb 17 '20 at 21:53
  • It also looks like you probably also have some parameter sniffing issues. What happens when you reexecute the query in SSMS and add `OPTION (RECOMPILE)` to the end of the statement? – Igor Feb 17 '20 at 21:58

1 Answers1

1

Ok, a few things that should help. #1. .Include() and .Select() should in general be treated mutually exclusive.

You are selecting:

p.Id,
p.Url,
p.GeneratedTitle,
p.GeoLocation.Address,
p.GeoLocation.Latitude,
p.GeoLocation.Longitude,
p.Domain,
p.Operation,
p.Multimedia.OrderBy(m => m.Order).FirstOrDefault().Resize360x266

but then in your foreach loop accessing Price and OperationType entities off it.

Edit Updated the example for the collection of operation. (Whups)

Instead I would recommend:

p.Id,
p.Url,
p.GeneratedTitle,
p.GeoLocation.Address,
p.GeoLocation.Latitude,
p.GeoLocation.Longitude,
p.Domain,
Operations = p.Operation.Select( o => new 
{
   OperationTypeName = o.OperationType.Name,
   o.Price.Amount,
   o.Price.CurrencyCode
}).ToList(),
p.Multimedia.OrderBy(m => m.Order).FirstOrDefault().Resize360x266

Then adjust your foreach logic to use the returned properties rather than a returned entity and related entity values.

Loading 40-50k records with something like that image field (MultiMedia) is potentially always going to be problematic. Why do you need to load all 50k in one go?

This looks like something that would put markers on a map. Solutions like this should consider applying a radius filter at the very least to get markers within a reasonable radius of a given center point on a map, or if loading a larger area (zoomed out map) calculating regions and filtering data by region or getting a count falling in that region and loading/rendering the locations in batches of 100 or so rather than potentially waiting for all locations to load. Something to consider.

Steve Py
  • 26,149
  • 3
  • 25
  • 43
  • Thanks for that. I think you're right about what could be the best approach in this scenario. I really appreciate that information. But, talking about the foreach loop, I can't just get the Name, Amount and CurrencyCode properties, because they are in collections and is not possible to access that way – Adolfo F. Ibarra Landeo Feb 17 '20 at 21:56
  • Ah yes, I'd seen that initially but slipped my memory when I was writing the example. I've updated the example to retrieve that collection of operation values using the `Select` which should help. – Steve Py Feb 17 '20 at 22:05
  • Well, after trying and trying for days. I finally applied what you mentioned here, I think it's a better approach and it's better to use good strategy instead of trying to "fix" and make work something that is using a bad design in principle. Many thanks... my applications works like a charm right now. – Adolfo F. Ibarra Landeo Feb 21 '20 at 20:29