0

Unfortunately I haven't found a good answer for this problem yet. The answers and questions I have seen so far in here are about big tables with a lot of records.

I'm trying to query a table called Tickets with the following code:

var Status = ticketStatusService.GetByName("New");
string StatusID = Status.Id;
var tickets = db.Tickets.Where(e => 
              !e.Deleted && 
              e.Project == null && 
              e.Status != null && 
              e.Status.Id == StatusID);
var list = tickets.ToList();

The table currently has less than 100 records, this query takes an average of 22 seconds to execute.

The code first model for it is as follows:

public class Ticket : Base
{
    [Key]
    [Required]
    public Guid Id { get; set; }

    [Display(Name = "Date")]
    public DateTime RowDate { get; set; } = DateTime.Now;

    public bool Deleted { get; set; } = false;

    [Index(IsUnique = true)]
    public int? Number { get; set; }

    [Display(Name = "Ticket Subject")]
    public string Subject { get; set; }

    [Display(Name = "Notes (Employees Only)")]
    public string Notes { get; set; }

    [Display(Name = "E-Mail")]
    public string From { get; set; }

    [Display(Name = "Phone Number")]
    public string Phone { get; set; }

    [Display(Name = "Secondary Phone Number")]
    public string PhoneAlt { get; set; }

    [Display(Name = "Client Name")]
    public string Name { get; set; }

    [Display(Name = "Message")]
    public string Messages { get; set; }

    [DataType(DataType.DateTime)]
    public DateTime? OpenDate { get; set; }

    [DataType(DataType.DateTime)]
    public DateTime? CloseDate { get; set; }

    [DataType(DataType.DateTime)]
    public DateTime? AssignedDate { get; set; }

    public bool? Origin { get; set; }

    public virtual User AssignedUser { get; set; }

    public virtual List<TicketFile> TicketFiles { get; set; }

    public virtual List<Task> Tasks { get; set; }

    public virtual Project Project { get; set; }

    public virtual TicketStatus Status { get; set; }

    public virtual TicketClosingCategory TicketClosingCategory { get; set; }
    public virtual TicketGroup TicketGroup { get; set; }

    public virtual TicketPriority TicketPriority { get; set; }
}

Any insight into this issue would be appreciated. Thank you very much!

Edit: Running the same query directly on SQL Server Management Studio also takes very long, about 9 to 11 seconds. So there might be an issue with the table itself.

Ricardo Melo
  • 459
  • 1
  • 4
  • 14
  • 2
    Did you view the SQL statement in the profiler? If you grab that SQL and run it directly (such as via SSMS) does it take 22 seconds? – mason Feb 13 '19 at 21:55
  • If it does be sure to check the query execution plan. It should provide insight in what's going wrong. Personally, I suspect your usage of the navigation properties might have something to do with it. Maybe it generates some joins that don't behave in the way expected? – Sangman Feb 13 '19 at 22:24
  • You could consider using LINQPad to help analyze as well. – NetMage Feb 13 '19 at 22:42
  • Thank you for your replies, I have checked it in SSMS and it actually takes less time, but still a very large time. I'm not sure if this is related, but I have checked and apparently the fragmentation percentage of my tables is very high. That specific table is over 90%. I have been reading and it could be something related to how Guids are inserted and since the rows are randomly placed in the table that could be causing issues. I'm not sure if this is the case though. Any ideas are very welcome. – Ricardo Melo Feb 13 '19 at 23:17
  • As @NetMage suggested I tried LINQPad, it's faster than the actual code, but still takes a considerable amount of time. I'm not well versed with this tool so I don't know what else I could do on it to get more insight. – Ricardo Melo Feb 13 '19 at 23:40
  • Fragmentation could have an impact on performance, but I'm not certain it would be to such a degree as you're experiencing on such a small table. Have you checked the execution plan? It'll tell you how the query engine interprets your query and which percentage of the time is spent on what sort of action. – Sangman Feb 14 '19 at 00:12
  • Also when you check the query execution plan with SSMS, be on the lookout for missing indices. I'm guessing most of the filtering columns already have an index (albeit fragmented), but the Deleted column might benefit from an extra index, if it doesn't already have one. But in the end if all else fails and you only have a small amount of records and you don't expect it to grow over time (but only then!), you can even try to read everything to the memory and do the query there. – Akos Nagy Feb 14 '19 at 07:22
  • Related: [What are the best practices for using a GUID as a primary key ...](https://stackoverflow.com/questions/11938044/what-are-the-best-practices-for-using-a-guid-as-a-primary-key-specifically-rega) – Hans Kesting Feb 14 '19 at 08:17
  • LINQPad has a SQL results tab for viewing the SQL from a LINQ query you run. Also, if you used SSMS perhaps you could post the SQL you used there that takes too long? – NetMage Feb 14 '19 at 19:37
  • Hello NetMage, I figured out the issue actually. The entries in the database were randomly generated from email messages, and the Message property has a ton of text in each entry that I wasn't aware of, even though the number of records was very small, the size of each record was about 30kb each, so the response was big enough to generate those unexpected delays. Rookie mistake I know, but now it's resolved. – Ricardo Melo Feb 15 '19 at 02:56

1 Answers1

2

I see several possible improvements.

For some reason you chose to deviate from the entity framework code fist conventions. One of them is the use of a List instead of an ICollection, another it that you omit to mention the foreign keys.

Use ICollection istead of List

Are you sure that Ticket.TicketFiles[4] has a defined meaning? And what would Ticket.TicketFiles.Insert(4, new TicketFile()) mean?

Better stick to an interface that prohibits usage of functions that have no defined meaning. Use ICollection<TicketFile>. This way you'll have only functions that have a proper meaning in the context of a database. Besides it gives entity framework the freedom to chose the most efficient collection type to execute its queries.

Let your classes represent the tables

Let your classes just be POCOs. Don't add any functionality that is not in your tables.

In entity framework the columns of a table are represented by non-virtual properties. The virtual properties represent the relations between the tables (one-to-many, many-to-many, ...)

Let entity framework decide what's the most efficient to initialize the data in your sequences. Don't use a constructor where you create a List, which will be immediately thrown away by entity framework to replace it with its own ICollection. Don't automatically initialize property Deleted, if entity framework immediately replaces it with its own value.

You will probably have only one procedure where you will add a Ticket to the database. Use this function to properly initialize the field of any "newly added Ticket"

Don't forget the foreign keys

You defined several relations between your tables (one-to-many, or many-to-many?) but you forgot to define the foreign keys. Because of your use of virtual entity framework can understand that it needs foreign keys and will add them, but in your query you need to write e.Status != null && e.Status.Id == statusId, while obviously you could just use the foreign key e.StatusId == statusId. For this you don't have to join with the Statuses table

Another reason to specify the foreign keys: they are real columns in your tables. If you define that these classes represent your tables, they should be in these classes!

Only select the properties you actually plan to use

One of the slower parts of a database query is the transport of the selected data from the database management system to your local process. Hence it is wise to select only the data you actually plan to use.

Example. There seems to be a one-to-many between a User and a Ticket: every User has zero or more Tickets, every Ticket belongs to exactly one User. Suppose User 4 has 20 Tickets. Every Ticket will have a UserId with a value 4. If you fetch these 20 Tickets without a proper Select you will fetch all properties of the same User 4 once per Ticket, and you will transport the data of this same User 20 times (with all his properties, and maybe all his relations). What a waste of processing power!

Always use Select to query your data and Select only the properties you actually plan to use. Only use Include if you plan to updated the Included data.

var tickets = dbContext.Tickets.Where(ticket => !ticket.Deleted

    // improvement: use foreign keys
    && ticket.ProjectId == 0 (or == null, if ProjectId nullable)
    && ticket.StatusId == statusId)       // no Join with Statuses needed

    .Select(ticket => new
    {
        ... 
    }
Harald Coppoolse
  • 28,834
  • 7
  • 67
  • 116
  • thank you very much for all that insight Harald. This is very useful for me as a beginner. I found out the issue was actually the fact that I was fetching all the properties, and the Messages property was quite big. So even with a simple query, less than 100 records, the total size was over 1MB. I will mark your answer and the correct one though, as all of your tips will be very useful. – Ricardo Melo Feb 14 '19 at 17:20