Background
So, I am using a React frontend and a .net core 3.1 backend for a webapp where I display a view with a list of data. The list is often times several thousands long. In this case its around 7500. We virtualize it to prevent sluggishness. Along with the display of data, every row has a column with the latest logchange someone did on that datarow. The logs and the rest of the data for every row comes from two different applications with their own databases. The log data consists of the name, and date of when the log was made, is also supposed to be rendering for every row.
The problem
When you route to the page, a useEffect fires that fetches the rows from one of the databases. When I get the response, I filter out all of the ids from the data and then I post that list to the other endpoint to request the latest log from every id. This endpoint queries the logging database. The number of ids I am passing to the endpoint is about 7200+. It wont always be this much, but sometimes.
Troubleshooting
This is the query that is giving me trouble in the log endpoint
public async Task<IActionResult> GetLatestLog(ODataActionParameters parameters)
{
var LogIds= (LogIds)parameters["LogIds"];
var results = await context.Set<LogEvent>()
.Where(x => LogIds.Ids.Contains(x.Id)).ToListAsync(); //55 600 entities
results = results
.GroupBy(x => x.ContextId)
.Select(x => x.OrderByDescending(p => p.CreationDate).First()).ToList(); //7 500 entities
var transformed = results.Select(MapEntityToLogEvent).ToList();
return Ok(transformed);
}
The first db query takes around 25 seconds (!) and returns around 56000 entities.
The second linq takes about 2 seconds, and returns around 7500 entites, and the mapping takes around 1 second.
The database is SQL server, and there are three indexes, one of which is Id, the other two are irrelevant for this assignment.
I have tried different queries, AsNoTracking, but to no avail.
Obviously this is horrible. Do you know of a way to optimize this query?