0

I'm trying to get data from some tables in one SQL query and fill in a model. Things that are important:

  • Count of all blogs
  • List of all blogs created by the user
  • Count of all interviews
  • List of all interviews created by the user
  • Count of all candidates assigned to the Candidate role
  • List of all candidates created by the user
public ApplicationUserActivitiesVM GetUserActivities(string userId)
{
    var result = from dummyRow in new List<string> { "X" }
                    join blog in _context.Blog.Where(w => w.Activity == true) on 1 equals 1 into bg
                    join ublog in _context.Blog.Where(w => w.Activity == true && w.AuthorId == userId) on 1 equals 1 into ubg
                    join interview in _context.CandidateInterview.Where(w => w.Activity == true) on 1 equals 1 into iw
                    join uinterview in _context.CandidateInterview.Where(w => w.Activity == true && (w.CreatorId == userId || w.InterviewerId == userId)).Include(i => i.Candidate) on 1 equals 1 into uiw
                    join candidate in _context.UserRoles.Where(w => w.RoleId == "candidate role id") on 1 equals 1 into ce
                    join ucandidate in _context.Users.Where(w => w.UserCreatorId == userId && w.Activity == true).Include(i => i.CandidateInterviewCandidates).ThenInclude(t => t.Candidate) on 1 equals 1 into uce
                    select new ApplicationUserActivitiesVM()
                    {
                        BlogsCount = bg.Count(),
                        CandidatesCount = ce.Count(),
                        InterviewsCount = iw.Count(),
                        UserBlogsCount = ubg.Count(),
                        UserInterviewsCount = uiw.Count(),
                        UserCandidatesCount = uce.Where(w => ce.Any(c => c.UserId == w.Id)).Count(),
                        UserInterviews = uiw.Select(s => new UserActivityInterview(s)).ToList(),
                        UserBlogs = ubg.Select(s => new UserActivityBlog(s)).ToList(),
                        UserCandidates = uce.Where(w => ce.Any(c => c.UserId == w.Id)).Select(s => new UserActivityCandidate(s)).ToList()
                    };

    return result.FirstOrDefault();
}

the last join prepares all users created by the user I pass to the method but I would like to get all users in candidate role that were created by the given userId.

Is it any way to improve my query?

To reach that goal I used Where(w => ce.Any(c => c.UserId == w.Id)) twice in select segment. I think it is not an optimise way.

Any help in optimizing the code will be appreciated.

After reading the comments, I edited the code as follows:

ViewModel that I use on a page:

public class ApplicationUserActivitiesVM
{
    public int BlogsCount { get; set; }
    public int CandidatesCount { get; set; }
    public int InterviewsCount { get; set; }
        
    public IList<UserActivityBlog> UserBlogs { get; set; }
    public IList<UserActivityCandidate> UserCandidates { get; set; }
    public IList<UserActivityInterview> UserInterviews { get; set; }

    public int UserBlogsCount => UserBlogs.Count;
    public int UserCandidatesCount => UserCandidates.Count;
    public int UserInterviewsCount => UserInterviews.Count;
}

Method that I edited:

public ApplicationUserActivitiesVM GetUserActivities(string userId)
{
    var blogsCount = _context.Blog.Where(w => w.Activity == true).Count();
    var userBlogs = _context.Blog.Where(w => w.Activity == true && w.AuthorId == userId).Select(s => new UserActivityBlog(s)).ToList();
    var interviewsCount = _context.CandidateInterview.Where(w => w.Activity == true).Count();
    var userInterviews = _context.CandidateInterview.Where(w => w.Activity == true && (w.CreatorId == userId || w.InterviewerId == userId)).Include(i => i.Candidate).Select(s => new UserActivityInterview(s)).ToList();
    var candidates = _context.UserRoles.Where(w => w.RoleId == "candidate role id");
    var candidatesCount = candidates.Count();
    var userCandidates = _context.Users.Where(w => w.UserCreatorId == userId && w.Activity == true).Where(w => candidates.Any(a => a.UserId == w.Id)).Include(i => i.CandidateInterviewCandidates).ThenInclude(t => t.Candidate).Select(s => new UserActivityCandidate(s)).ToList();

    var result = new ApplicationUserActivitiesVM
    {
        BlogsCount = blogsCount,
        UserBlogs = userBlogs,
        InterviewsCount = interviewsCount,
        UserInterviews = userInterviews,
        CandidatesCount = candidatesCount,
        UserCandidates = userCandidates
    };

    return result;
}

In terms of performance, I compared the two methods and discovered that they are nearly identical; however, given the database currently has little data, it may need to be enhanced in the future and compared again.

Please let me know if there is a better way.

Mahmood
  • 120
  • 2
  • 9
  • You can have `UserCandidatesCount` as a readonly property of `ApplicationUserActivitiesVM` class returnning `UserCandidates.Count()` – Serg Jan 05 '22 at 10:45
  • Wire your nav props up properly, Include the related data and you're sorted; you don't need the DB to count records you download into a collection - you can just ask the collection. probably easier on EFC5 with filtered include but I don't think it truly matters here – Caius Jard Jan 05 '22 at 10:45
  • Thanks, @Serg I will do that. But my issue is in the last join that returns **all users instead of all candidates** created by the given userId. – Mahmood Jan 05 '22 at 10:50

1 Answers1

2

Many of these are completely separate details from one another. They should be separate, simple queries not joins. With databases it isn't the number of queries that matters, it is the complexity and touch points of each query that matters.

The first thing is that where you want to return blogs, interviews, etc. for a particular user, you will want to ensure that those entities have their relationship to User mapped with a navigation property. EF is a mapper between your .Net domain and the database. It isn't intended as a library to litter C# code with SQL-like Linq expressions. Join should only ever be needed in extremely rare cases where you have a very loosely associated relationship between entities, for example any situation where you cannot have a FK set up between the two underlying tables.

Most, if not all relationships should be managed via navigation properties. This informs EF about how tables are related to one another and you can then let EF build the queries necessary to draw out the data you actually need.

var blogCount = _context.Blog
    .Where(x => x.Activity == true)
    .Count();
var userBlogs = _context.Blog
    .Where(x => x.Activity == true
        && x.Creator.UserId == userId)
    .Select(x => new UserActivityBlog
    { // TODO: Populate the container.
    }).ToList();
var interviewCount = _context.CandidateInterview
    .Where(x => x.Activity == true)
    .Count();
var userInterviews = _context.CandidateInterview
    .Where(x => x.Activity == true
        && (x.Creator.UserId == userId 
            || x.Interviewer.UserId == userId))
    .Select(x => new UserActivityInterview
    { // TODO: Populate the container
    }).ToList();
var candidateCount = context.CandidateInterview
    .Where(x => x.Activity == true)
    .SelectMany(x => x.Candidates)
       .Where(x => x.User.RoleId == roleId)
   .Count();
var userCandidates = _context.CandidateInterview
    .Where(x => x.Activity == true)
    .SelectMany(x => x.Candidates)
        .Where(x => x.User.RoleId == roleId
            && x.Creator.UserId == userId)
    .Select(x => new UserActivityCandidate
    { // TODO: Populate the container
    }).ToList();

var result = new ApplicationUserActivitiesVM
{
    BlogCount = blogCount,
    UserBlogs = userBlogs,
    InterviewCount = interviewCount,
    UserInterviews = userInterviews,
    CandidateCount = candidateCount,
    UserCandidates = userCandidates

};

Then in your ApplicationUserActivitiesVM view model, you can use property expressions to reveal the User*Count values:

public virtual IList<UserActivityBlog> UserBlogs { get; set; } = new List<UserActivityBlog>();
public int UserBlogCount => UserBlogs.Count;
// etc.

I had to take a guess at the relationship between Candidates, InterviewCandidates, and the User/Role relationships, but hopefully the underlying idea of doing several simpler queries rather than joining all of these tables together for unrelated counts etc. makes sense.

Joining all of these tables together where you want total counts alongside data specific to a user would produce a huge Cartesian product between rows, and likely result in a lot of table-scan operations rather than using indexes. The Counts about can be done with index scans while not polluting the user-specific query results.

I was a bit suspicious of the use of constructors within the original Select and whether that would trigger client side evaluation, but it appears not /w EF Core 5 i.e:

.Select(s => new UserActivityInterview(s))

So-long as the projected view model / DTO only touches the associated entity properties and not things like child/related entities it seems to work. (Learned something new :) Normally I would use Automapper and ProjectTo to handle this, and it can be set up to project / flatten child / related entities if you have a need for that.

Steve Py
  • 26,149
  • 3
  • 25
  • 43
  • Thank you so much @StevePy. The reason I tried to write the whole query in one, was the performance and using all data from the query on one page. [I learned here](https://stackoverflow.com/questions/21182116/select-records-count-from-multiple-tables-in-a-single-query). Is it true that the database connection is opened and closed numerous times in the way you described it? Do you know a way I can compare query speeds? – Mahmood Jan 06 '22 at 08:44
  • The `DbContext` instance manages the database connection, so running several smaller, fast queries is often much better than trying to get everything at once. Attempts to minimize code/calls can lead to returning far too much data, resulting in higher memory use and data over the wire which kills performance. When it comes to measuring and tuning query performance I use a profiler on the database to inspect what queries are running. From there I can look at suspect queries that are slow or doing a lot of reads, then run them manually to get an execution plan to inspect. – Steve Py Jan 06 '22 at 21:25