2

I have a piece of code which works in EF Core 2.2 used to compare string casing as shown below.

public async Task<bool> DoesItemNumberExists(Guid revisionId, string itemNumber)
{
    var doesExist = await _repository.AnyAsync(a => string.Equals(a.ItemNo, itemNumber, StringComparison.Ordinal) && a.SoqHeading_NP.SoqRevisionId == revisionId);

    return doesExist;
}

I run the same code in EF Core 5 and the application crashes. Any help?

Below is the exception i get

The LINQ expression 'DbSet<SoqItem>()
    .Where(s => s.IsDeleted == False)
    .Join(
        inner: DbSet<SoqHeading>()
            .Where(s0 => s0.SoqRevisionId == __ef_filter__RevisionId_0 && s0.IsDeleted == False), 
        outerKeySelector: s => EF.Property<Nullable<Guid>>(s, "SoqHeadingId"), 
        innerKeySelector: s0 => EF.Property<Nullable<Guid>>(s0, "Id"), 
        resultSelector: (o, i) => new TransparentIdentifier<SoqItem, SoqHeading>(
            Outer = o, 
            Inner = i
        ))
    .Any(s => string.Equals(
        a: s.Outer.ItemNo, 
        b: __itemNumber_0, 
        comparisonType: Ordinal) && s.Inner.SoqRevisionId == __revisionId_1)' could not be translated. Additional information: Translation of the 'string.Equals' overload with a 'StringComparison' parameter is not supported. See https://go.microsoft.com/fwlink/?linkid=2129535 for more information. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.
thanzeel
  • 431
  • 4
  • 12
  • 2
    What is the exception that caused the application to crash? – RB. Oct 13 '21 at 11:35
  • As I remember, EF Core 3+ must be able to convert LINQ statement into SQL query without additional logic in the app, so the main reason it can crash is that your LINQ can't be converted. I think you need `a.ItemNo.ToLower() == itemNumber.ToLower()` for case insensitive comparoison or `a.ItemNo == itemNumber` for case sensitive. – Ivan Khorin Oct 13 '21 at 11:38
  • 4
    Learning point: please never again, ever post a question on SO that states, implies or even hints that you're getting an exception, without including the full, exact exception type and message. I genuinely cannot believe the number of people I see every day who say "I got an error" and yet don't mention anything more about it (like it's some irrelevant, inconsequential thing/that we'll instantly know exactly what it is anyway); it's the single most useful piece of information in debugging that we can have – Caius Jard Oct 13 '21 at 11:42
  • @RB. i get "{Name = "InvalidOperationException" FullName = "System.InvalidOperationException"} " – thanzeel Oct 13 '21 at 11:44
  • And the Message? It probably talks about the query not being able to be translated.. In the old days, when EF encountered some code that couldnt be converted to SQL and sent to the server for execution on the server (always a good idea) it just automatically downloaded all the data into the client and ran the query there (a terrible idea). These days, when you write something that has no SQL equivalent, you get an error – Caius Jard Oct 13 '21 at 11:44
  • hi all, i have edited the question with the exception. please help – thanzeel Oct 13 '21 at 11:47
  • 1
    Please read the error message? It's one of the better ones.. Let us know what you didn't understand about it so we can guide your learning - if we just say "do this and it will fix it" then you won't learn so much – Caius Jard Oct 13 '21 at 11:48
  • For reference the error message is: *could not be translated. Additional information: Translation of the 'string.Equals' overload with a 'StringComparison' parameter is not supported. See https://go.microsoft.com/fwlink/?linkid=2129535 for more information. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information* – Caius Jard Oct 13 '21 at 11:48
  • FYI case sensitivity with EF is going to be determined by the DB. If you're using SQL Server that typically defaults to a case insensitive collation. So before you bother trying to control how the comparison is done in EF check to see how it's setup in the DB. What you might need to do is allow EF to translate a normal compare with `==` which likely will be case insensitive, then do another `Where` on the client side after a `AsEnumerable` that will be case sensitive. – juharr Oct 13 '21 at 11:52
  • 1
    @thanzeel that's not an EF Core 5 bug. The query always failed to get translated to SQL but EF Core 2 covered this up by loading *everything* in memory then matching the records on the client without the benefit of indexing. EF Core *3* disabled this extremely unfortunate behavior. If you want to avoid a huge performance hit *don't* try to force the case match. Make sure the column's collation matches what you want and create an index on it. Use just `==` in the query – Panagiotis Kanavos Oct 13 '21 at 12:26
  • 1
    @thanzeel If you want both case-sensitive and *in*sensitive matching for different queries you'll have to create a different column (possibly computed) with a different collation and index that as well. Indexes are affected by collations because that clearly affects both equality and the order of the column's values. – Panagiotis Kanavos Oct 13 '21 at 12:29
  • 1
    @thanzeel the bug you already had but didn't realize caused the application to load all rows in memory and check every single one even if only 1 row matched. If the table had 1M rows, you'd load 1M rows in memory then check every single one of them. 1M comparisons to get just 1 row isn't great.. – Panagiotis Kanavos Oct 13 '21 at 12:39
  • @PanagiotisKanavos, thanks fir sharing great knowledge on this. i was able to implement what you guided us with and now its working. thanks a lot again!!! – thanzeel Nov 30 '21 at 03:49
  • @CaiusJard, i was able to understand the error message after some time. it took a while for me to digest what the error said. thanks for the support! – thanzeel Nov 30 '21 at 03:51

3 Answers3

3

Case-sensitivity and collations are explained in the docs, in the docs, in Collations and Case Sensitivity


That's not an EF Core 5 bug. The query always failed to get translated to SQL but EF Core 2 covered this up by loading everything in memory then matching the records on the client without the benefit of indexing. LINQ translation in the first versions of EF Core was so limited that even GROUP BY couldn't be translated. Entity Framework would throw in such cases. To avoid breaking code that worked perfectly in EF 6 though, EF Core 1 and 2 used client-side evaluation: They translated what they could to SQL then loaded the data in-memory, on the client, and execute the rest of the query using LINQ to Objects.

This means that if you wanted to calculate a SUM for 100K rows, EF Core 1-2 would load all 100K rows in memory and proceed to add the values one by one. Never mind joining two tables with 1000 rows each - that's 1M comparisons.

Even in EF Core 2.1 though, client-side evaluation would generate runtime warnings and could be disabled completely. In EF Core 3.1 client-side evaluation was disabled completely.

To get your query to work properly don't try to force the case or the collation. Just use a simple equality :

var itemExists=context.Products.Any(a=>a.ItemNumber == itemNumber && 
                                       a.SoqHeading_NP.SoqRevisionId == revisionId);

This will be translated to WHERE ItemNumber=@itemNumber && SoqHeading_NP.SoqRevisionId = @revisionId. The query will use any indexes that cover the ItemNumber and SoqRevisionId columns to produce a result as fast as possible.

The collation used for the equality match is the column's collation. If that is case sensitive, you get case-sensitive matching. If not, you get case-insensitive matching. Indexes are built using the column's collation, so if you try to use a different collation for matching you'll prevent the server from using any indexes.

If you want to use different case matching in different queries and still use indexes, you need to create different indexes for each case. How you do that depends on the database

  • In SQL Server, case-insensitive is the most common option. To use both that and case-sensitive search, you can create an index a computed column with a binary (hence case-sensitive) collation, eg:
alter table Table1 add ItemNumberCS as COLLATE ..._BIN;
create index IX_Table1_ItemNumberCS on Table1 (ItemNumberCS);

Case-sensitive queries should use the ItemNumberCS column.

  • In PostgreSQL all collations are case-sensitive. Since v12 though, you can create a custom collation and use it in a computed index expression. To use a case-insensitive search, you can create a case-insensitive collation and index eg:
CREATE COLLATION case_insensitive (
      provider = icu,
      locale = 'und-u-ks-level2',
      deterministic = false
);

CREATE INDEX IX_Table1_ItemNumberCI ON Table1 (title COLLATE "case_insensitive");`

The LINQ query won't have to change.

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • so i can do this i guess 'builder.Property(p => p.ItemNo).UseCollation().IsRequired();'. but i dont know what goes to the UseCollation() constructoer. please help – thanzeel Oct 14 '21 at 11:41
  • i used this and it worked. var doesExist = await _repository.AnyAsync(a => EF.Functions.Collate(a.ItemNo, "SQL_Latin1_General_CP1_CS_AS") == itemNumber && a.SoqHeading_NP.SoqRevisionId == revisionId); . does this have a perfomance impact? – thanzeel Oct 14 '21 at 11:49
  • 1
    @thanzeel that's exactly what you **shouldn't** do. The query won't be able to use any indexes and end up scanning the entire table – Panagiotis Kanavos Oct 14 '21 at 11:50
  • i can i achieve this with indexing? – thanzeel Oct 14 '21 at 11:51
  • please help us!! – thanzeel Oct 14 '21 at 12:00
  • I already wrote all this as an answer. You'll find the same things in the docs, in [Collations and Case Sensitivity](https://learn.microsoft.com/en-us/ef/core/miscellaneous/collations-and-case-sensitivity) – Panagiotis Kanavos Oct 14 '21 at 12:04
  • is there a way to write a single `(x => x.a == a)` that will work for Oracle and Sql Server? Like `(x => x.a.ToLower() == a.ToLower())` will wrap column and value into `Lower(..)` but this will hurt index search – T.S. Jul 15 '22 at 19:29
  • 1
    @T.S. This has nothing to do with Oracle vs SQL Server, it has to do with case-sensitive vs case-insensitive collations. If you want case-insensitive searches in Oracle you need to use a case-insensitive collation. Oracle [allows creating collation-specific indexes](https://blogs.oracle.com/sql/post/how-to-do-case-insensitive-and-accent-insensitive-search-in-oracle-database#index). You can create a case-insensitive index since 12.2 with `create index athlete_name_ci on olym_athletes ( athlete_name collate binary_ci );` ` – Panagiotis Kanavos Jul 18 '22 at 07:45
  • @PanagiotisKanavos Thanks. But thought from a different view. When people install oracle they use it as default case-sensitive scheme. Others use MySql or SqlServer. We have 1 code base for all clients. But we have been using wrapper that turns query `x = :1` into `upper(x) = upper(:1)` for oracle. Its bad for index search but this is a way to have 1 codebase for all 3 DBs. So I was wondering if there is equivalent in code, **without** DB changes. Because seriously, we don't even control what customers do in their DB. We just say that our basis is "this" – T.S. Jul 18 '22 at 20:39
  • It's bad, period. It's not just "bad for index search", it forces a full table scan. This means each query will have to search every single row in a table, whether it's 100K or 1M rows. Taking locks on all of them. That's a good way to slow an entire application to a standstill with just a couple of users – Panagiotis Kanavos Jul 18 '22 at 20:44
  • Besides, Oracle databases have DBAs and they would *never* allow an application that performs full table scans to be deployed to production. – Panagiotis Kanavos Jul 18 '22 at 20:47
  • @PanagiotisKanavos I understand. However... Lucky for us, in this product the search is on the tables with a range of 10 records to 20K records. And also, this is a rare thing. So, are you saying that EF in code, there is no way, only DB manipulation? – T.S. Jul 18 '22 at 22:09
2

Because StringComparison.Ordinal statement can't be translate to SQL query.

You should read data without StringComparison.Ordinal, and when data read from SQL and come to application memory then you can use StringComparison.Ordinal.

public async Task<bool> DoesItemNumberExists(Guid revisionId, string itemNumber)
{
    var selectedRows = await _dbContext.YourTable.Where(a => a.ItemNo == itemNumber  && a.SoqHeading_NP.SoqRevisionId == revisionId).ToListAsync();
    return selectedRows.Any(a =>  string.Equals(a.ItemNo, itemNumber, StringComparison.Ordinal));
}

Microsoft reference:

Before 3.0, when EF Core couldn't convert an expression that was part of a query to either SQL or a parameter, it automatically evaluated the expression on the client. By default, client evaluation of potentially expensive expressions only triggered a warning.

New behavior Starting with 3.0, EF Core only allows expressions in the top-level projection (the last Select() call in the query) to be evaluated on the client. When expressions in any other part of the query can't be converted to either SQL or a parameter, an exception is thrown.

Why?

Automatic client evaluation of queries allows many queries to be executed even if important parts of them can't be translated. This behavior can result in unexpected and potentially damaging behavior that may only become evident in production. For example, a condition in a Where() call which can't be translated can cause all rows from the table to be transferred from the database server, and the filter to be applied on the client. This situation can easily go undetected if the table contains only a few rows in development, but hit hard when the application moves to production, where the table may contain millions of rows. Client evaluation warnings also proved too easy to ignore during development.

Besides this, automatic client evaluation can lead to issues in which improving query translation for specific expressions caused unintended breaking changes between releases.

Farhad Zamani
  • 5,381
  • 2
  • 16
  • 41
0

TLDR: The important part is to compare two strings with EF.Functions.Collate(string1, "SQL_Latin1_General_CP1_CS_AS") == string2


As mentionned above by @Panagiotis Kanavos, all relevant info about case-sensitivity and collate are found in Collations and case sensitivity - EF Core | Microsoft Docs

A quick solution that worked for me with minimal code refactoring is with using an explicit collation query and do the following:

public async Task<bool> DoesItemNumberExists(Guid revisionId, string itemNumber) {
    var doesExist = await _repository.AnyAsync(a => EF.Functions.Collate(a.ItemNo, "SQL_Latin1_General_CP1_CS_AS") == itemNumber && a.SoqHeading_NP.SoqRevisionId == revisionId);

    return doesExist; 
}
Codingwiz
  • 192
  • 2
  • 14
  • That's a very bad idea, as bad as the answers that use `ToUpper()`, because the fields will be compared without using any indexes. The query will have to perform a full table scan – Panagiotis Kanavos Jul 18 '22 at 20:48
  • The problem is my Database(DB) collation was set as case insensitive "Latin1_General_CI_AS" and I don't have access to it. So the only solution was for me to either load the data on memory and compare it that way or directly compare with query in DB using `EF.Functions.Collate`. Since the query is more efficient than loading in memory, I think it is a valid solution to the problem at hand. – Codingwiz Jul 19 '22 at 14:49
  • That still requires a full table scan and take Shared locks on all rows while it's searching. Connections that need to update rows will be blocked until the first query/transaction finishes. If you need to do this frequently one solution is to create a persisted computed column with the `COLLATE` clause, effectively generating another column with the CS collation. This can be indexed resulting in fast queries – Panagiotis Kanavos Jul 19 '22 at 15:16
  • Honestly, I have little knowledge about collation and I don't really understand your implementation. Is it part of your solution or can you add to it if not ? The only issue is I cannot modify the existing DB because it is used by another app. Also note that I am using EF Core 6 and my tables and all their relations are generated by it. – Codingwiz Jul 19 '22 at 15:45