How can I reduce the amount of SQL-querying in my code?

Question

I'm writing a big C# application that communicates with a MS-SQL Server database.

As the app grows bigger, I find myself writing more and more "boilerplate" code containing various SQL queries in various classes and forms like this:

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Windows.Forms;

public class SomeForm : Form
{
    public void LoadData(int ticketId)
    {
        // some multi-table SQL select and join query
        string sqlQuery = @"
            SELECT TOP(1) [Foo].Id AS FooId, [Foo].Name AS FooName, [Foo].Address AS FooAddress
                        [Bar].Name AS BarName, [Bar].UnitPrice AS BarPrice,
                        [Bif].Plop
            FROM        [dbo].[Foo]
            INNER JOIN  [dbo].[Bar]
            ON          [Bar].Id = [Foo].BarId
            INNER JOIN  [dbo].[Bif]
            ON          [Bar].BifId = [Bif].Id
            WHERE       [Foo].TicketId = @ticketId";

        SqlCommand sqlCmd = new SqlCommand();
        sqlCmd.CommandText = sqlQuery;
        sqlCmd.Parameters.AddWithValue("@ticketId", ticketId);

        // connection string params etc and connection open/close handled by this call below
        DataTable resultsDataTable = SqlQueryHelper.ExecuteSqlReadCommand(sqlCmd);

        if (resultsDataTable.Rows.Count > 0)
        {
            var row = resultsDataTable.Rows[0];

            // read-out the fields
            int fooId = 0;
            if (!row.IsNull("FooId"))
                fooId = row.Field<int>("FooId");

            string fooName = "";
            if (!row.IsNull("FooName"))
                fooName = row.Field<string>("FooName");

            // read out further fields...

            // display in form
            this.fooNameTextBox.Text = fooName;
            // etc.
        }
    }
}

There are dozens of forms in this project all doing conceptually the same thing, just with different SQL queries (different columns selected, etc.) And each time the forms are opened, the database is being continually queried.

For a local DB server the speed is OK but using the app over a slow VPN is painful.

Are there better ways of cutting down the amount of querying the database? Some sort of caching the database in memory and performing the queries on the in-memory data?

I've added some data tables to a data source in my project but can't understand how I can do complex queries like the one stated above.

Is there a better way of doing this?

Thanks for all your suggestions folks!

You could use [Entity Framework](https://learn.microsoft.com/en-us/ef/) — Sandris B, Apr 15 '20 at 12:14
Every solution to this is going to be hitting the database when it needs to access data. I think you should look a refactoring data retrieval to try and get as much data in a little amount of queries as you can. — KingOfArrows, Apr 15 '20 at 12:24

BionicCode · Accepted Answer · 2020-04-16T00:47:07.047

I think non of the recommendations like using a framework to access a datase will solve your problem. The problem is not writing SQL or LINQ queries. You always have to dispatch a query to the database or set of data at some point.

The problem is from where you query the database.
Your statement about writing "code containing various SQL queries in various classes and forms" gives me chills. I recently worked for a company and they did exactly the same. As a result they can't maintain their database anymore. Maintenance is still possible, but only rudimentary and very expensive/time consuming and very frustrating - so nobody likes to do it and therefore nobody does it and as a result it gets worse and worse. Queries are getting slower and slower and the only possible quick fix is to buy more bandwidth for the database server.

The actual queries are scattered all over the projects making it impossible to improve/refactor queries (and identify them) or improve the table design. But the worst is, they are not able to switch to a faster or cheaper database model e.g. a graph structured database, although there is a burning desire and an urgent need to do so. It makes you look sloppy in front of customers. Absolutely no fun at all to work in such an environment (so I left). Sounds bad?

You really should decouple the database and the SQL from your business code. All this should be hidden behind an interface:

IRepository repository = Factory.GetRepository();

// IRepository exposes query methods, which hide 
// the actual query and the database related code details from the repository client or business logic
var customers = repository.GetCustomers();

Spreading this code through your project doesn't hurt. It will improve maintainability and readability. It separates the data persistence from the actual business logic, since you are hiding/encapsulating all the details like the actual database, the queries and the query language. If you want to switch to another database you just have to implement a new IRepository to modify the existing queries. Changing the database won't break the application.

And all queries are implemented in one location/layer. And when talking about queries, this includes LINQ queries as well (that's why using a framework like Entity Framework doesn't solve your problem). You can use dependency injection or Factory pattern to distribute the implementation of IRepository. This even allows to switch between different databases during runtime without recompiling the application.

Using this pattern (Repository Pattern) also allows to decouple frameworks like Entitiy Framework from your business logic.

Take a look at the Repository Pattern. If it's not too late you should start to refactor your existing code. The price is too high if keep it like it is.

Regarding caching I know that the database or the DBMS already handles the caching of data very very efficiently. What you can do is to

cache data locally e.g. if the data won't change remotely (because some other client will modify it) e.g. local client settings. This way you can reduce network traffic significantly.
cache data locally e.g. if you access this data set frequently and remote changes are not likely to occur or will have impact. You can update the local data store periodically or invalidate it e.g. after a period of time to force the client to dispatch a new query in order to update the cache.
You also can filter data locally using LINQ. This may also reduce network traffic although you may end up reading more data the necessary. Also filtering is generally done more efficient by the DBMS.
You can consider to upgrade the database server or VPN to increase bandwidth.
Indexing will also improve lookup times significantly.
You should consider to refactor all your SQL queries. There are many articles on how to improve performance of SQL queries. The way you build up a query will have significant impact on performance or execution times especially with big data.
You can use data virtualization. It doesn't make sense to pull thousands of records from the database when you can only show 20 of them to the user. Pull more data when the user scrolls the view. Or even better, display a pre-selected list e.g. of most recent items and the allow the user to search for the data of interest. This way you read only the data that the user explicitly asked for. This will improve overall performance drastically, since usually the user is only interested in very few records.

Before introducing an interface (Dependency Inversion)

The following examples are meant to show, that this is a question of architecture or design, instead a question of frameworks or libraries. Libraries or frameworks can help on a different level, but won't solve the problem, which is introduced by spreading environment specific queries all over the business code. Queries should always be neutral. When looking at the business code, you shouldn't be able to tell if the data is fetched from a file or a database. This details must be hidden or encapsulated.

When you spread the actual database access code (whether directly using plain SQL or with help of a framework) throughout your business code, you are not able to write unit tests without a database attached. This is not desired. It makes testing too complicated and the tests will execute unnecessarily slow. You want to test your business logic and not the database. This are separate tests. You usually want to mock the database away.

The problem:
you need data from the database in multiple places across the application's model or business logic.

The most intuitive approach is to dispatch a database query whenever and where ever you need the data. This means, when the database is a Postgre database, all the code would of course use PostgreSQL or some framework like Entity Framework or ORM in general. If you decide to change the database or DBMS e.g. to some Oracle or want to use a different framework to manage your entities, you would be forced to touch and rewrite every code that uses PostgreSQL or Entity Framework.

In a big business application, this will be the reason that forces your company to stay with what you have and leaves your team dreaming of a better world. Frustration level will rise. Maintaining database related code is nearly impossible, error prone and time consuming. Since the actual database access is not centralized, rewriting the database related code means to crawl through the complete application. Worst case is the spreading of meaningless SQL query strings, nobody understands or remembers. Impossible to move to a new database or to refactor queries to improve performance, without scarifying valuable and expensive time and team resources.

Imaging the following simplified symbolic method is repeated in some form across the application's business logic, maybe accessing different entities and using different filters, but using the same query language, framework or library. Let's say we find similar code a thousand times:

private IEnumerable GetCustomers()
{
  // Use Entity Framework to manage the database directly
  return DbContext.Customers;
}

We have introduced a tight coupling to the framework as it is woven deep into our business code. The code "knows" how the database is manged. It knows about Entity Framework as it has to use its classes or API everywhere.
The proof is, that if you would want to replace Entity Framework with some other framework or just want to drop it, you would have to refactor the code in thousand places - everywhere you used this framework in your application.

After introducing an interface (Dependency Inversion) and encapsulating all the database access

Dependency Inversion will help to remove a dependency on concrete classes by introducing interfaces. Since we prefer loose coupling between components and classes to enhance flexibility, testability and maintainability when using a helper framework or plain SQL dialect, we have to wrap this specific code and hide it behind an interface (Repository Pattern).

Instead of having a thousand places, which explicitly use the database framework or SQL or LINQ queries to read, write or filter data, we now introduce interface methods e.g GetHighPriorityCustomers and GetAllCustomers. How the data is supplied or from which kind of database it is fetched are details, that are only known to the implementation of this interface.

Now the application no longer uses any framework or database specific languages directly:

interface IRepository
{
  IEnumerable<Customer> GetHighPriorityCustomers();
  IEnumerable<Customer> GetAllCustomers();
}

The previous thousand places now look something like:

private IRepository Repository { get; } // Initialized e.g. from constructor
private IEnumerable GetCustomers()
{
  // Use a repository hidden behind an interface.
  // We don't know in this place (business logic) how the interface is implemented
  // and what classes it uses. When the implementation changes from Entity Framework to something else,, no changes have to be made here (loose coupling).    
  return this.Repository.GetAllCustomers();
}

The implementation of IRepository:

class EntityFrameworkRepository : IRepository
{

  IEnumerable<Customer> GetAllCustomers()
  {
    // If we want to drop Entity Framework, we just have to provide a new implementation of IRepository
    return DbContext.Customers;
  }

  ...
}

Now, you decide to use plain SQL. The only change to make is to implement a new IRepository, instead of changing thousand places to remove the specialized code:

class MySqlRepository : IRepository
{
  // The caller still accesses this method via the interface IRepository.GetAllCustomers()
  IEnumerable<Customer> GetAllCustomers()
  {
    return this.Connection.ExecuteQuery("SELECT * FROM ...");
  }

  ...
}

Now you decide to replace MySQL with Microsoft SQL. All you have to do is to implement a new IRepository.
You can swap in and out any database and change the query language or introduce helper frameworks without affecting your original business logic. Once written, never touched again (at least for the changes regarding the database).
If you move the implementation to a separate assembly, you can even swap them at runtime.

2 Things: 1 EF can NOT be decoupled from the BL.... Just try to run it without a reference to EF in the BL.... Does it work? The answer is no even if you use interfaces. 2 The Sharp Factory framework creates a Repository, Repository Interfaces, Entities without any third party references in the code. So all your recommendations can be achieved with The Sharp Factory without writing a single line of code. — Jonathan Alfaro, Apr 15 '20 at 16:25
@JonathanAlfaro I think you misunderstood the point. 1) Decoupling doesn't mean removing references on assembly level. It means the dependency is moved or encapsulated into an implementation of the repository. Your code now depends on the repository interface and no longer on the specific library e.g. EF. When your code directly references classes of this library you would have to rewrite the code to replace each reference with a reference to the new library classes, in case you want to replace it. — BionicCode, Apr 15 '20 at 16:40
@JonathanAlfaro When depending on an interface, changing the library ("behind" this interface) won't break code. You just have to provide a new implementation of the interface which uses the new library. It has absolutely nothing to do with referencing an assembly. By the way you can move the implementation of the repository into it's own assembly. This is what I would do anyway. — BionicCode, Apr 15 '20 at 16:40
@JonathanAlfaro 2) I didn't say that he has to implement the Repository Pattern by himself. I recommended to use it in order to decouple the database handling from the rest of the code. It's purpose is to solve his exact problem, with scattering database access throughout the application. When he finds a framework that implements this simple pattern then he can use it if he feels like. My recommendation was on architecture level and not on framework or library level. You choose the framework based on your architecture. — BionicCode, Apr 15 '20 at 16:40
you are confused.... Decoupling means NOT having a dependency..... Which means NOT having a reference.... Adding interfaces without removing the reference to the component that holds the classes is nothing but a cosmetic change. — Jonathan Alfaro, Apr 15 '20 at 17:00
So basically you CANNOT decouple your repository from your BL if you use EF because you have to reference EF in your BL.... This is a well known issue. Decoupling means not having a dependency.... otherwise is called coupled or dependent. — Jonathan Alfaro, Apr 15 '20 at 17:03
@JonathanAlfaro Direct access to `DbContext` would introduce a tight coupling to this class. If I want to get rid of EF tomorrow, I would've to modify all this thousand places in my code. But instead my code depends on the interface implementation of `MyInterface`, which today uses EF, but tomorrow uses something else. I can swap out EF without modifying my code. My code doesn't know about EF. It only knows `MyInterface`. I still have a dependency on EF but not my business code. But only the implementation of `MyInterface` depends on EF. — BionicCode, Apr 15 '20 at 17:33
@JonathanAlfaro If decoupling would mean to not have a dependency that it would be equivalent to not using the dependency. This doesn't make sense. You invert the dependency to decouple a system (or introduce loose coupling). When ever you depend on a class, you can't change this class without changing the depending code. When depending on interfaces you don't need to change the depending code when changing the implementation of the interface. — BionicCode, Apr 15 '20 at 17:33
@JonathanAlfaro You can't remove the dependency completely because you are actually using it. But you can decouple this dependency to make the coupling become loose. One way to achieve this is Dependency Inversion. If done so your business logic doesn't depend on the concrete implementation, framework or library but on interfaces or abstractions. — BionicCode, Apr 15 '20 at 17:33
Dependency inversion also implies removing the dependency... That is why you do it... SO that you can create shims and mocks for testing. And when you say "you cant remove completely" that is exactly the problem with EF... if you other solutions then you could completely remove the dependency and reference ONLY the component with the interfaces.... That is the architecturally correct way of doing it. That is what The Sharp Factory does for you. — Jonathan Alfaro, Apr 15 '20 at 18:57
Ok so if your BL " doesn't depend on the concrete implementation, framework or library" why dont you remove the EF reference from the BL..... Simple you cannot because EF is an anti-pattern and it is poorly architected by Microsoft. — Jonathan Alfaro, Apr 15 '20 at 18:58
@JonathanAlfaro My last attempt. Since you are still replying I feel like you want to discuss it. _" Dependency inversion also implies removing the dependency"_: it's Dependency _Inversion_ and not Dependency _Removal_. It's not possible to _remove_ a dependency when you _use_ it. It's about coupling. When you take a look at the UML dependency graph you can see an arrow pointing from _caller_ class to the _called_ class indicating the dependency. — BionicCode, Apr 15 '20 at 21:03
@JonathanAlfaro Now when introducing an interface, which is implemented by the _called_ class, the arrow point from the _caller_ to the interface (no longer to the _called_ class) and from the _called_ class to the interface too. When you draw both versions you see the arrow is inverted, pointing from called to caller (or the interface to be more precise). The dependency was inverted. Which introduces a loose coupling because the _caller_, who previously _knew_ the _called_ class (because the arrow directly connected them) now only knows the interface (that the _called_ has to implement. — BionicCode, Apr 15 '20 at 21:03
@JonathanAlfaro The interface is a contract guaranteeing that there is an implementation that has e.g. a `GetCustomers` method. Please take a look at my answer. I've updated it to provide an example (very simple), which shows how you turn tight coupling into loose coupling, which allows e.g. to drop Entity Framework without modifying your business logic. Because you decoupled the business logic from the actual database code (which uses EF) you can now mock the database and test the pure business logic. I think the internet is full of articles targeting this topic. — BionicCode, Apr 15 '20 at 21:04
@JonathanAlfaro I don't know why you propagate or advertise your framework here. It is a framework. It suffers the same negative side effects when using EF: you are tightly coupled to it. This is how frameworks work: they are always implementing Inversion of Control (IoC). _"you could completely remove the dependency and reference ONLY the component with the interfaces.... That is the architecturally correct way of doing it. That is what The Sharp Factory does for you"_: I wonder how I can remove any references to your beloved framework without breaking my code. — BionicCode, Apr 15 '20 at 21:04
@JonathanAlfaro I am interested on how this is your framework doing it for me. You seem to mix up assembly references with dependencies on class level. _"Simple you cannot because EF is an anti-pattern and it is poorly architected by Microsoft"_: a framework is not a pattern and therefore can't be an anti-pattern. It can force you to implement an anti-pattern. Frameworks usually implement patterns to implement an architecture. — BionicCode, Apr 15 '20 at 21:05
@JonathanAlfaro Some times frameworks also forces the client to implement a certain pattern: e.g. ASP.Net forces you to use the MVC pattern. WPF forces you to use the MVVM pattern. EF hides all the details from the client. You can compare it with plain ADO.Net code. I don't think it is suffers from a poor architecture or forces the client to implement an anti-pattern. If so, then please name it. EF hides a lot complexity from the user, allowing him to write code without deeper database related knowledge. Everything comes with trade-offs. — BionicCode, Apr 15 '20 at 21:05
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/211757/discussion-between-bioniccode-and-jonathan-alfaro). — BionicCode, Apr 15 '20 at 21:05
@JonathanAlfaro Please take a look at my answer. I've updated it to provide an example (very simple), which shows how you turn tight coupling into loose coupling, which allows e.g. to drop Entity Framework without modifying your business logic. — BionicCode, Apr 15 '20 at 21:07
Have you tried it? Try getting those entities from a BL project that has NO reference to EF.... just try it..... I bet you will get a nice exception. Even worst... If your entities have "navigation properties" you could potentially call the database from other layers. Furthermore what are you going to do if you have 900 tables? Write all those "Code First" entities and repositories? — Jonathan Alfaro, Apr 15 '20 at 22:01
@JonathanAlfaro You generally can make calls to a database or any other service from any component or layer, where ever you allow references to your persistence or service API. Layers are not physical boundaries, but abstractions on an architectural level. Layer definitions don't define implementation details - architectural design level never does. You mix everything up. I showed you how to decouple EF or any other helper library from your business logic. Please search the internet to read more about the Repository pattern and Dependency Inversion. — BionicCode, Apr 15 '20 at 23:46
@BionicCode Many thanks for this very insightful answer, which I have accepted as the best solution for my problem. Not only have you answered my question, you've shown me what the real problem is. Thanks for bringing my attention to the Repository pattern; I was regrettably unaware of it. This pattern also helps me to implement another planned feature in the program, so this solution helps me to "hit two birds with the same stone". Thanks again for your time! — Alex Barber, Apr 29 '20 at 08:03

Arsen Khachaturyan · Answer 2 · 2020-04-17T11:48:17.573

1

I can suggest several things:

Instead of using ADO.NET switch to Entity Framework, where you can use LINQ to SQL/LINQ to EF(newer version) and write down queries with simply C# language and not worry about querying SQL.
Use Stored procedures, SQL Functions, Views - written down in SQL Server database, which calls are cached by the SQL Server, which provides more efficient executing, security and more maintainability.
For making your Queries more efficient against database tables, consider using Full-Text Index over the tables which data you use more often in the Filtering operations, like Search.
Use Repository and Unit of Work patterns in your C#(including integration with Entity Framework) code which actually will do exactly the thing you want, i.e. collection several amounts of SQL Queries and sending them to execute by SQL Server at once, instead of sending Queries one by one. This will not only drastically improve performance but will keep your coding as simple as it can.

Note: One of the problems with your queries is related not only to their executions but also on Opening and closing SQL database connections each time you need to execute the particular Query. This problem is solved with the Repository and Unit of Work design patterns approach.
Based on your business needs, use In Memory or Database Caching for the data, which repeats for a lot of users.

edited Apr 17 '20 at 11:48

answered Apr 15 '20 at 12:20

Arsen Khachaturyan

7,904
4
42
42

2

2. this is not how it works. SQL calls are compiled only on first call if you explicitly do not specify OPTION (RECOMPILE). And you do not have execution plan for one view, but for the whole call, which includes things like environment parameters state. – Antonín Lejsek Apr 15 '20 at 12:43
@AntonínLejsek, thanks for correcting me. I've just updated the post. – Arsen Khachaturyan Apr 15 '20 at 13:02
2

+1. I would also mention that all calls should be via asynchronous Tasks, unless you wish for the UI to freeze. – ChrisBD Apr 15 '20 at 14:35
1. Stored Procedures, SQL Functions and Views *do not* provide better efficiency for routine data retrieval operations. I would argue that SQL code in your application is actually *more* maintainable. – Robert Harvey Apr 15 '20 at 14:49
1

2. Your nomenclature is a bit off. Linq to SQL is an old, outdated technology from .NET Framework 3.5. The proper term is "Linq to EF." – Robert Harvey Apr 15 '20 at 14:50
3. Entity Framework *already provides* a Repository and Unit of Work implementation. – Robert Harvey Apr 15 '20 at 14:51
@RobertHarvey, thanks a lot for your suggestions and improvements I've updated the post accordingly. – Arsen Khachaturyan Apr 15 '20 at 15:00
Note that SQL Server caches queries even if the SQL originates from the client. The only practical difference between a stored procedure and an ordinary SQL call are the words CREATE PROCEDURE AS. – Robert Harvey Apr 15 '20 at 15:02
Thanks for these good suggestions. The one that helped me most was to investigate the Repository pattern, which solves a number of problems. We have implemented some stored procedures too. Sadly, time shortages for the project currently prevent me from doing much work on EF, though it is something I can consider for future work. Thanks again! – Alex Barber May 13 '20 at 09:25

score 1 · Answer 3 · answered Apr 15 '20 at 12:29

1

There is another solution besides Entity Framework.

For example you can use The Sharp Factory.

It is a commercial product but it not only maps database objects like Entity Framework but also creates a full Repository based on layers.

It is better than EF in my opinion if you are willing to pay for it.

There are some downsides to Entity Framework. For example the fact that your Repository will leak throughout your layers.

Because you need to reference Entity Framework on all consumers of your entities... So even if your architecture looks correct, at runtime you can still execute sql queries from your upper layers even unknowingly.

answered Apr 15 '20 at 12:29

Jonathan Alfaro

4,013
3
29
32

If you're going to go this route, you're better off using a simple ORM like Dapper, which doesn't have the leaking problem and doesn't already provide a Repository and Unit of Work implementation like EF does. – Robert Harvey Apr 15 '20 at 15:10
@RobertHarvey but you have to write your SQL strings.... which means.... Whats the point? If you have to write entities or sql then what is the point? That is why I do not use NHibertnate nor dapper... both require you to write entities and sql.... What if you have 1000 tables? I often work with database with hundreds or thousands of tables. So Dapper, EF, NHibernate are useless for me – Jonathan Alfaro Apr 15 '20 at 16:22
1

There are many easy ways to create the necessary entities. I have a 60 line SQL script that generates mine. And you'd better get used to writing SQL; ORM's were never meant to completely replace hand-written SQL. See https://learn.microsoft.com/en-us/ef/core/querying/raw-sql – Robert Harvey Apr 15 '20 at 19:34
@RobertHarvey once againg when you have 900 tables... writing lots of sql is not an option. The Sharp Factory can do thousands of tables no problem. If you have an ORM that requires you to write sql or create entities then it its not a very good ORM. – Jonathan Alfaro Apr 15 '20 at 20:42
1

[shrug] Dapper can return `dynamic` objects, if you're so inclined. You don't need entities at all if you do that. Nobody said anything about writing SQL for CRUD operations on 900 tables; Dapper already takes care of that for you. – Robert Harvey Apr 15 '20 at 21:37
Dynamic objects are not very good for traversing layers and they are not strongly typed. Dapper DOES NOT "take care of that".... Have you ever used dapper? With dapper YOU have to write the SQL..... So if you have 900 tables forget about dapper. – Jonathan Alfaro Apr 15 '20 at 21:58
In case you doubt it here: https://dapper-tutorial.net/knowledge-base/54541326/why-use-dapper-if-i-don-t-want-to-write-my-query-on-sql – Jonathan Alfaro Apr 15 '20 at 22:03
I've used Dapper extensively, at two different jobs. No, you don't have to write the SQL yourself. See https://github.com/StackExchange/Dapper/tree/master/Dapper.Contrib. Perhaps you should try using Dapper yourself before offering expert opinions about it. – Robert Harvey Apr 15 '20 at 22:11
@RobertHarvey maybe for very simple straight forward queries.... But for anything meaningful you cannot express anything into it..... For any UPDATE WHERE.... for example... or for any select statement with any meaning... And you are right I have never written an application where you could just perform naked CRUD operations... All my apps do complext SQL queries. – Jonathan Alfaro Apr 16 '20 at 01:59

How can I reduce the amount of SQL-querying in my code?

3 Answers3

Before introducing an interface (Dependency Inversion)

After introducing an interface (Dependency Inversion) and encapsulating all the database access