2

I have the following LINQ query:

//two different contexts, databases, tables...
NoteSet = lmCtx.LMNotes.AsEnumerable();
EmpSet = tessCtx.Employees.AsEnumerable();

var lmAccountNotes = (from lmnote in NoteSet
                       join createdby in EmpSet on lmnote.lnt_createdById equals createdby.EmployeeID
                       join modifiedby in EmpSet on lmnote.lnt_modifiedById equals modifiedby.EmployeeID
                       where lmnote.lnt_recordId == 5566 && lmnote.lnt_tableId == 1
                       select new NoteInfo { Note = lmnote, CreatedBy = createdby, ModifiedBy = modifiedby }).ToList();

This works for queries on small tables, but NoteSet is a pretty big table and I'm reaching well over 1.5GB of used memory by the process before the framework just explodes and throw an OutOfMemory exception.

Is there any way to keep the lazy loading feature while executing something like this ?

Francis Ducharme
  • 4,848
  • 6
  • 43
  • 81
  • 1
    At first glance, you're materializing everything with `.AsEnumerable()` then querying that, which might be the cause of the memory issue. By removing that and using it as a standard `IQueryable` interface, your memory usage should noticably decrease. – Scott Aug 25 '16 at 20:52
  • Yes, but then I get the error: `The specified LINQ expression contains references to queries that are associated with different contexts`. This is why I used `AsEnumerable()` – Francis Ducharme Aug 25 '16 at 20:57
  • By using `AsEnumerable()` you are essentially running your whole “query” in memory. So you are fetching all data from the database, and then join them within your application. If you want to run the query efficiently on the database, then you need to use `IQueryable` and thus you need to have a single database context that has access to both entities. There’s no way around it. – poke Aug 25 '16 at 20:59
  • @poke Does EF support handling tables from different tables right in the EDMX model ? Like an EDMX model with more than 1 connection string ? – Francis Ducharme Aug 25 '16 at 21:00
  • EF translates LINQ queries into SQL to be run on the database - not sure standard querying is supported across multiple databases unless you set up [some weird stuff like this](http://stackoverflow.com/questions/20611310/query-across-multiple-databases-on-same-server). – Scott Aug 25 '16 at 21:03
  • Not sure what you mean with *“handling tables from different tables”*, and I actually never used EDMX models, so I can’t answer that. But what do you mean, more than one connection string? Are these tables in different databases? Then you won’t be able to do this. A database context can reference multiple tables but they share the same connection to the same database; you can’t make cross-database queries. In general database design, if you have related data, that’s a clear sign that they should be in the same database. – poke Aug 25 '16 at 21:03
  • @poke yes, I meant tables from different databases. Typo, sorry. – Francis Ducharme Aug 25 '16 at 21:03

2 Answers2

0

So as to keep having a query that returns a NoteInfo object, I changed it to this:

//LMNotes is the actual huge database...
var m = lmCtx.LMNotes.Where(x => x.lnt_recordId == 5566).ToList();

var lmAccountNotes = (from lmnote in m
             join createdby in EmpSet on lmnote.lnt_createdById equals createdby.EmployeeID
             join modifiedby in EmpSet on lmnote.lnt_modifiedById equals modifiedby.EmployeeID
             where lmnote.lnt_recordId == 566 && lmnote.lnt_tableId == 1
             select new NoteInfo { Note = lmnote, CreatedBy = createdby, ModifiedBy = modifiedby }).ToList();

This is better

Francis Ducharme
  • 4,848
  • 6
  • 43
  • 81
0

As explained in the comments, you cannot really run a single query across two different database, at least not without setting up some help construct (which would then live on either database, and actually who knows if that would improve the performance at all).

However, that does not mean that we cannot improve your query at all. If we can’t rely on the database engine to execute the query, we can do that ourselves. In this case, what you are doing is essentially just a query on the LMNotes entity and then you join employees from the Employees set.

So a naive solution could look like this:

var notes = lmCtx.LMNotes
    .Where(lmnote => lmnote.lnt_recordId == 5566 && lmnote.lnt_tableId == 1)
    .Select(lmnote =>
    {
        return new NoteInfo
        {
            Note = lmnote,
            CreatedBy = tessCtx.Employees.FirstOrDefault(e => e.EmployeeId == lmnote.lnt_createdById),
            ModifiedBy = tessCtx.Employees.FirstOrDefault(e => e.EmployeeId == lmnote.lnt_modifiedById)
        };
    })
    .ToList();

Of course, while this runs a single query on LMNotes, this still runs two separate queries for each note in the result. So it’s not really better than what EF would have done up there.

What we can do however is add some lookup. I suspect that the set of employees is somewhat limited, so it would make sense to only fetch each employee once. Something like this:

private Dictionary<int, Employee> employees = new Dictionary<int, Employee>();

private Employee GetEmployee(int employeeId)
{
    Employee employee;
    if (!employees.TryGetValue(employeeId, out employee))
    {
        employee = tessCtx.Employees.FirstOrDefault(e => e.EmployeeId == employeeId);
        employees[employeeId] = employee;
    }
    return employee;
}

public List<NoteInfo> GetNotes()
{
    return lmCtx.LMNotes
        .Where(lmnote => lmnote.lnt_recordId == 5566 && lmnote.lnt_tableId == 1)
        .Select(lmnote =>
        {
            return new NoteInfo
            {
                Note = lmnote,
                CreatedBy = GetEmployee(lmnote.lnt_createdById),
                ModifiedBy = GetEmployee(lmnote.lnt_modifiedById)
            };
        })
        .ToList();
}

This would only look up each employee once and then cache the employee object.

Alternatively, you could also make a second pass here and fetch all employees at once after reading the notes for the first time. Something like this:

public List<NoteInfo> GetNotes()
{
    var employeeIds = new HashSet<int>();

    var notes = lmCtx.LMNotes
        .Where(lmnote => lmnote.lnt_recordId == 5566 && lmnote.lnt_tableId == 1)
        .Select(lmnote =>
        {
            // remember the ids for later
            employeeIds.Add(lmnote.lnt_createdById);
            employeeIds.Add(lmnote.lnt_modifiedById);

            return new NoteInfo
            {
                Note = lmnote,
                CreatedBy = null,
                ModifiedBy = null
            };
        })
        .ToList();

    var employees = tessCtx.Employees
        .Where(e => employeeIds.Contains(e.EmployeeId))
        .ToList()
        .ToDictionary(e => e.EmployeeId);

    foreach (var noteInfo in notes)
    {
        noteInfo.CreatedBy = employees[noteInfo.Note.lnt_createdById];
        noteInfo.ModifiedBy = employees[noteInfo.Note.lnt_modifiedById];
    }

    return notes;
}

This would only run a single query against each database.

poke
  • 369,085
  • 72
  • 557
  • 602