6

In my application the data in the database must never be deleted.

Entities have a property, Deleted, which can be set to true or false.

Entities which are virtually deleted can not go out from the database, they must not exist from the application perspective.

I am managing this by creating GetAll- methods (for example GetAllUsers) in my data access layer. The methods return only entities which are flaged as NOT deleted (!Deleted), and all other methods (for example GetUserById) retrieved data with these methods.

See the example below...

public IEnumerable<User> GetAllUsers()
{
    return _dataContext.Users.Where(element => !element.Deleted);
}

public User GetUserById(string userName)
{
    return GetAllUsers().FirstOrDefault(elt => elt.UserName.Equals(userName));
}

This architecture is very good from the perpsctive of reliability because I'm sure that I always extract NOT deleted entities, but I'm afraid that this is not efficient.

My question is very simple: is my application extracting all data from the User table (select * from User) each time a specific User is requested, or is Entity Framework smart enough to understand what I want and translates to a SQL query of something like: select * from User where userName = @userName?

If I'm extracting all data from the database each time I need a single user, then how can I change my code so that it creates the correct query? I would like to mantain the centralization, so that I don't have to specify !Deleted in each LINQ query.

Errore Fatale
  • 978
  • 1
  • 9
  • 21

4 Answers4

6

is my application extracting all data from the User table (select * from User) each time a specific User is requested, or Entity Framework is smart enough to understand what I want and the built SQL queries is something like: select * from User where userName = @userName ?

You are retrieving all users that are not deleted each time. The reason is because GetAllUsers returns an IEnumerable<User> instead of an IQueryable<user>, so you have left the world of Linq-to-EF and are now working in Linq-to-objects, which does all filtering, sorting, etc. in memory and does not go back to the database.

The good news is you can easily refactor this without changing the return type to IQueryable and possibly breaking existing code:

// make the base query private   
private IQueryable<User> GetAllUsersQuery()
{
    return _dataContext.Users.Where(element => !element.Deleted);
}

public IEnumerable<User> GetAllUsers()
{
    return GetAllUsersQuery().AsEnumerable();
}

public User GetUserById(string userName)
{
    return GetAllUsersQuery().FirstOrDefault(elt => elt.UserName.Equals(userName));
}

Now GetUserByID just appends a condition onto the original query, so the operations will then be pushed to the database rather than filtering in-memory.

D Stanley
  • 149,601
  • 11
  • 178
  • 240
  • 3
    Breaking existing code is a *good* thing in this case. I doubt the OP wants to load all entries in memory. Given that IQueryable is easily converted to an IEnumerable though, the breaking changes will be very few – Panagiotis Kanavos Sep 25 '15 at 13:08
  • @PanagiotisKanavos Changing the return type to `IQueryable` can have significant unintended side-effects that won't be known until run-time. If a caller tries to attach a condition that cannot be converted to SQL the app will fail at run-time. Keeping the return type as `IEnumerable` will prevent that. Any filtering on top of `IQueryable` should be done in the repository IMHO. – D Stanley Sep 25 '15 at 13:11
  • Thanks for your explanation. This answer is my prefered one, but I have to be sure that it is correct. If you provide a source or more developers confirm what you wrote, I can mark this answer as the correct anwser. – Errore Fatale Sep 25 '15 at 13:13
  • 1
    @ErroreFatale Why not verify it for yourself? Get a EF SQL profiler and see what queries are created in both cases. I have no problem with you being cautions but I'm not sure I would trust three posters on this site any more that I would trust one. – D Stanley Sep 25 '15 at 13:15
  • @ErroreFatale you can also verify it from the answers [here](http://stackoverflow.com/questions/2876616/returning-ienumerablet-vs-iqueryablet) and [here](http://stackoverflow.com/questions/5311034/what-is-the-effect-of-asenumerable-on-a-linq-entity) – D Stanley Sep 25 '15 at 13:18
  • @DStanley silently loading an entire database table in memory is probably *worse* than throwing an exception. You can catch the exception during testing. You can't catch the "load all" bug. In fact, renaming the method to something else would be even better - all affected code would be revealed during compilation – Panagiotis Kanavos Sep 25 '15 at 13:18
  • @PanagiotisKanavos So you're saying a working app that uses more memory than it should is worse than an app that throws an error at runtime? Would you rather have Outlook be a memory hog (which it is) or have it crash ten times a day? If you don't want to allow anyone to load all users into memory then _don't make the method public_. Returning `IQueryable` from a public repository method is an accident waiting to happen. – D Stanley Sep 25 '15 at 13:21
  • I'm trying to understand the critics of PanagiotisKanavos and I would like to understand what would be the correct solution according to him. Yes, sometimes I have to get all users in my business logic and presentation layer. – Errore Fatale Sep 25 '15 at 13:23
  • @ErroreFatale The problem is that you *don't* want to accidentally load everything. Ensuring this doesn't happen isn't hard. If you change your method to return IQueryable and rename it, compilation will break immediately and you will be able to modify all callers so that they call it correctly. If you don't want to expose IQueryable to the callers, create a different method that returns all Users, eg LoadAllUsers that returns an array or List. You can't use the `GetAllUsers` name until you are 100% certain there is no code like the original – Panagiotis Kanavos Sep 25 '15 at 13:36
  • @ErroreFatale another pitfall, is that enumerating an IEnumerable executes it all over again. If you try to enumerate `GetAllUsers` twice, you'll end up executing the same SQL query twice. – Panagiotis Kanavos Sep 25 '15 at 13:37
  • @PanagiotisKanavos That will not change by changing it to `IQueryable`. – D Stanley Sep 25 '15 at 13:55
  • @DStanley that's why I proposed returning arrays or Lists (specifically IList). If you don't want to expose the IQueryable, hide *both* the IQueryable and IEnumerable – Panagiotis Kanavos Sep 25 '15 at 13:55
  • @PanagiotisKanavos All due respect, but if you have a differing view then _add your own answer_. Providing an alternative in a vague collection of commetns is not helpful. `List` vs `IQueryable` vs `IEnumerable` is not a black-and-white issue, and there's already plenty of debate on it out there. – D Stanley Sep 25 '15 at 13:57
  • @PanagiotisKanavos What you wrote is interesting but not so clear. You should create an other anwser and explain your idea with a code snippet. Otherwise, this answer will be chosen as the best one. – Errore Fatale Sep 25 '15 at 15:12
  • I dond't understand if you are simply discussing the possible error of developer of using GetAllUsers() instead of GetAllUsersQuery() to create a higher level method (for example GetUserByUserName). – Errore Fatale Sep 25 '15 at 15:21
1

Yes, Entity Framework creates a SQL statement for your query.

Regarding the deleted rows issue, you could create a SQL view and map your entities to this, rather than the actual table. That way you make it far less likely that you'd select deleted rows.

N.B. You can still insert new rows.

CREATE VIEW UsersActive
AS
SELECT *
FROM Users
WHERE Deleted != 1

And then on your entity:

[Table("UsersActive")]
public class User
{
    ...
}
Knelis
  • 6,782
  • 2
  • 34
  • 54
  • No, EF will only create a query for `GetAllUsers`. Since that method returns an `IEnumerable` the additional filtering in `GetUserById` will be done in-memory. – D Stanley Sep 25 '15 at 13:13
  • I don't think that's correct. `IEnumerable` is just an interface, not a type. Unless you specifically call `.AsEnumerable` or `.ToList` or something like that, the data won't be loaded into memory. – Knelis Sep 25 '15 at 14:11
  • I should have been more clear that EF will only create a SQL query for GetAllUsers. The difference is what extension methods subsequent calls are bound to. `IEnumerable` calls will be bound to `Enumerable` which does in-memory queries, while `IQueryable` calls will be bound to `Queryable` which pushes the queries to the underlying provider (like EF). – D Stanley Sep 25 '15 at 14:17
0

Yes, for every user it will request it from the database,If you want to avoid interacting with the database then you can cache the user table and then you can get the users from that cached object based on username

DSM
  • 46
  • 1
  • 4
0

I would change the GetUserById function to this, to ensure a proper SQL Query:

public User GetUserById(string userName)
{
    return _dataContext.FirstOrDefault(elt => elt.UserName.Equals(userName) && !elt.Deleted);
}

This should generate a query with a proper where clause.

Johnie Karr
  • 2,744
  • 2
  • 35
  • 44
  • 1
    Yes, this is what I'm doing now. What I don't like of this is that it is error-prone: I always have to remember to extract only NOT deleted entities. I would like to have centralized methods which are used by all other methods. – Errore Fatale Sep 25 '15 at 13:02
  • 1
    There's nothing wrong with using multiple `Where` statements, Selects or `FirstOrDefault` - provided all methods return an `IQueryable` instead of `IEnumerable`. In fact, it's a common LINQ use case. The problem with the original code is that it returns an IEnumerable – Panagiotis Kanavos Sep 25 '15 at 13:04