Entity framework and local cache of database tables

Question

I am developing an application that uses Entity Framework, but I have some performance issues. Imagine a user interface (C#, WPF) with some combo boxes and a data grid. Each time a value is selected in a combo box it changes the conditions for the data to show in the grid. It looks like the entity framework is not as flexible as I thought when it comes to caching. Because of the changed conditions the underlying sql will always be sligthly different (= no EF caching) and each cell update will result in a request to the database.

Is there any way I can cache the tables locally (with working navigation properties) and still use linq for selection etc. without generating any requests for the database?

The tables are small so I don't expect any performance problems. (I guess local sorting etc. otherwise could be a problem since the database indexes are not used.)
I don't need write access to the tables so if there is an easy way to make a deep copy of the tables and detach from the database connection that would perhaps be ok.
I am not allowed to install any third party tools.

Perhaps entity framework was a bad choice from the beginning, but it's really convenient to use those generated classes and linq instead of manually writing a lot of classes and sql. (That I still would have to implement some cache for.)

I think you should show a bit of your code. Sounds like a simple ToList() on your Queryable before feeding it to the comboboxes and data grid (effectively switching from EF linq to in-memory linq) might go a long way towards solving your problem. — Paul-Jan, Jan 24 '17 at 18:19
EntityFramework(EF) does not have to be used once you get the objects. It is merely an object model layer for getting data. LINQ is seperate from EF as EF is objectifying a database for data retrieval. You can simple set up a repository pattern where you do a retrieve method on a set duration and cache those objects. I can easily do var ls = new List{ "a", "b", "c", "d", "e"}. ls.Distinct().Skip(2).Take(3).ToList().ForEach(x => Console.WriteLine($"I have {x}")); EF has no bearing on Linq and it's use, LINQ is a framework of extension methods for querying, EF is for data retreival — djangojazz, Jan 24 '17 at 19:29
Also you hit the nail on the head here: 'but it's really convenient to use those generated classes'. The T4 template(what generates those classes) is where EF really shines IMHO. You can overwrite that code that writes code too make generated data too if you want. EG: Say I have a table colors: and it only stores an id and values 'red', 'blue', 'yellow'. Retrieving that would be kind of dumb so I could even hack the T4 to generate enum generated objects that are what was in the database. Probably not great for production because a database change would require a rerun of the T4 template. — djangojazz, Jan 24 '17 at 19:35
Paul-Jan, That's ok for the combo boxes but it will not be easy for the grid. The entries in the cells combines data from many tables in different ways based on values from other tables, so it will be hard to get the data for all columns in a single query. (But probably not impossible.) — AppleMoose, Jan 24 '17 at 19:40
Djangojazz, The problem is that even if I have retreived all entries it seems that it triggers a database request when I follow the navigation properties, e.g. "tableA.ToList().Where(someCondition)" will not make a call to the database if I already have tableA entries in cache but something like "tableA.ToList().navigationPropertyRefTableB.Where(someCondition).something" seems to always result in a call to the database even if all requested data has already been rereived. (I have tried to use "Include" and "ToList" everywhere where it should matter.) — AppleMoose, Jan 24 '17 at 19:57
By the way, I can't post my code here (coding in secured environment, no internet access), I might make some example code later. — AppleMoose, Jan 24 '17 at 20:00
I think you are confusing opening a context with storing the data from the context itself. Not sure, but that is what it seems like. — djangojazz, Jan 24 '17 at 21:38

score 15 · Accepted Answer · edited May 23 '17 at 10:30

Is there any way I can cache the tables locally?

This is what a DbContextdoes by default and there is an easy way for you to use that feature. Here is the basic pattern to follow:

context.Products.Where(p => <some intial condion>).Load();
var dataSource = context.Product.Local.Where(p => <some flexible condition>);

Note that in line 2 the Localcollection is used. This is a DbSetproperty that returns entities from the context's cache.

with working navigation properties

Any related entities that are loaded by the Load()statement will be automatically connected to one another by relationship fixup. So if a Product has a collection Components, you can do:

context.Components.Where(c => <some intial condion>).Load();

If this loads all components of the products that were loaded above, you'll see that their Components collection are now populated.

An alternative combining both steps is:

context.Products.Where(p => <some intial condion>)
       .Include(p => p.Components)
       .Load();

If there are many related tables you have to find a balance between individual Load statements and Load statements with Include, because many Includes in one statement may hit performance.

and still use linq for selection

As shown above: the flexible condition.

without generating any requests for the database

If you will always address Local collections only, these statements will never query the database. However, addressing navigation properties may still cause lazy loading. If you do ...

context.Products.Where().Load();
context.Components.Where().Load();

... this does populate product.Components collections, but doesn't mark them as loaded, whereas context.Products.Include(p => p.Components) does. So in the first case, addressing product.Components will trigger lazy loading. Similarly, addressing navigation properties for which the entities aren't loaded at all will also trigger lazy loading. So to be absolutely sure that no database interaction is triggered, you should disable lazy loading, either by ...

context.Configuration.LazyLoadingEnabled = false;

... or ...

context.Configuration.ProxyCreationEnabled = false;

The latter option forces EF to create simple POCO objects that are not capable of lazy loading.

So using these techniques, you can use your context as a local cache of connected entities. Which is an exception to the rule that a context should be short-lived.

One caution

Relationship fixup doesn't work for many-to-many associations. Suppose there is a m:n relationship between Product and Manufacturer, then ...

context.Products.Where().Load();
context.Manufacturers.Where().Load();

... won't populate product.Manufacturers and manufacturer.Products. Many-to-many associations should be loaded by Include:

context.Products.Where()
       .Include(p => p.Manufacturers)
       .Load();

How can I mark the entities as loaded? I want to my model to support lazy-loading, but want to be able to eager-load the data and it don't trigger lazy-load anymore. When I do this, when I access "product.Components" (as you said), it triggers lazy loading. So how can I fix that? Also... is there anything to fix the same problem but in many-to-many relationships? I'm using EF Core 3.x — Iúri dos Anjos, Apr 14 '20 at 20:36
You can't mark entities as loaded. They're marked as loaded by EF when EF loads navigation properties, either (and exclusively) by `Include`, `Load` (of a navigation property), or lazy loading. The only thing you can do is turn off lazy loading. — Gert Arnold, Apr 14 '20 at 20:47

score 0 · Answer 2 · answered Jan 24 '17 at 21:49

Let me take a crack at this a little bit as I also have production apps I work on in WPF and follow an MVVM pattern. You may not, I suggest it if you don't know what I am talking about. Say I have a database table that has a person table and it only has three columns: PersonId, FirstName, LastName. I only have two rows currently, my name and my wife's name. I want to retrieve the data ONLY once but then I may want to alter it later. This is a simplified example of course:

XAML:

<StackPanel>
    <DataGrid ItemsSource="{Binding People}" AutoGenerateColumns="False">
      <DataGrid.Columns>
        <DataGridTextColumn Header="PersonId" Binding="{Binding PersonId}" />
        <DataGridTextColumn Header="First Name" Binding="{Binding FirstName}" />
        <DataGridTextColumn Header="Last Name" Binding="{Binding LastName}" />
      </DataGrid.Columns>
    </DataGrid>
    <TextBox Text="{Binding Text}" />
    <Button Command="{Binding CommandGetFirstName}" Height="30" Content="Get By First Name Above" />
</StackPanel>

This is bound using MVVM so my MainViewModel would be this:

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.ComponentModel;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Input;

namespace WPFCSharpTesting
{
  public class MainWindowViewModel : INotifyPropertyChanged
  {
    private string _text;

    public string Text
    {
      get { return _text; }
      set
      {
        _text = value;
        OnPropertyChanged(nameof(Text));
      }
    }

    private ObservableCollection<tePerson> _people;

    public ObservableCollection<tePerson> People
    {
      get { return _people; }
      set
      {
        _people = value;
        OnPropertyChanged(nameof(People));
      }
    }

    private readonly List<tePerson> _allPeople;


    public MainWindowViewModel()
    {
      Text = "Brett";       
      using (var context = new TesterEntities())
      {
        _allPeople = context.tePerson.ToList();
      }

      People = new ObservableCollection<tePerson>(_allPeople);
    }

    public event PropertyChangedEventHandler PropertyChanged;

    private void OnPropertyChanged(String info)
    {
      PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(info));
    }

    DelegateCommand _CommandGetFirstName;

    public ICommand CommandGetFirstName
    {
      get
      {
        if (_CommandGetFirstName == null)
        {
          _CommandGetFirstName = new DelegateCommand(param => this.CommandGetByFirstNameExecute());
        }
        return _CommandGetFirstName;
      }
    }

    private void CommandGetByFirstNameExecute()
    {
      var newitems = _allPeople.Exists(x => x.FirstName == Text) ? _allPeople.Where(x => x.FirstName == Text)?.ToList() : _allPeople;
      People = new ObservableCollection<tePerson>(newitems);
    }

The key piece here is what is happening in my constructor. I am taking a readonly variable, _allPeople, that is private and storing the info there that I want to manipulate later. Once _allPeople has the data, I do not need to touch the 'context' again to hit the database. I just can now monkey with _allPeople as it's own collection detached for what I need. When I want to expose to my front end WPF what the user sees they will see an observable collection I can update as needed from my cached set up. This is a super simple over simplification of it. Typically many developers end up doing a whole repository pattern where they have a project or projects related to ONLY being for storing data and performing CRUD operations. This is generally a preferred method IMHO as you can piece together other things as needed.

Entity framework and local cache of database tables

2 Answers2