0

I can search customers table without any problems by SQL.

select Name
from Customers
where  Name COLLATE Latin1_General_CI_AI like '%ozgur%'

This query can find "özgür"

When I place this table to cache and try to search this table with linq, I can't find "özgür" by "ozgur" search word.

Is there any similar way to use Latin1_General_CI_AI in C# LINQ?

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
emert117
  • 1,268
  • 2
  • 20
  • 38
  • Which dbms are you using? – jarlh Nov 29 '18 at 10:46
  • MS SQL Server 13.0 – emert117 Nov 29 '18 at 10:47
  • LINQ isn't SQL. It's a query language that sits on top of an ORM. Collation is something that should be configured on the *entity properties* if possible – Panagiotis Kanavos Nov 29 '18 at 10:47
  • @emert117 there's no such version. What ORM are you using? NHibernate? Entity Framework? EF Core? All of them work with LINQ. How did you configure the context and entities? – Panagiotis Kanavos Nov 29 '18 at 10:48
  • @PanagiotisKanavos I am using Entity Framework. – emert117 Nov 29 '18 at 10:50
  • @emert117 why not modify the column's collation if you want it to work in an accent-insensitive way? Who specified `_AS` anyway? If the current collation is accent insensitive you don't need to change anything – Panagiotis Kanavos Nov 29 '18 at 10:50
  • @PanagiotisKanavos I can't modify column's collation. DB is large and always in use. – emert117 Nov 29 '18 at 10:51
  • @emert117 what is the current collation? Is it really accent sensitive (_AS)? Do you really need that collation statement? – Panagiotis Kanavos Nov 29 '18 at 10:52
  • @PanagiotisKanavos column's collation is Turkish_CI_AS . If I change column's collation, will linq work the way I want? (I will create a temp table and try that.) – emert117 Nov 29 '18 at 10:56
  • 1
    Not with collation, see here: https://stackoverflow.com/questions/29723053/using-collation-in-linq-to-sql But it should work with a `string.Compare`, see here: https://stackoverflow.com/questions/30051621/string-compare-with-special-characters-in-c-sharp – LocEngineer Nov 29 '18 at 11:03
  • @LocEngineer I already read these questions. I created a temp table with Latin1_General_CI_AI collation. But Linq didn't work the way I want again :( – emert117 Nov 29 '18 at 11:24
  • Then you already know the answer: No. Not with COLLATE. Only if you put that in, say, a stored procedure with the search value as a parameter. Or use a custom string comparer with CultureInfo. – LocEngineer Nov 29 '18 at 11:28
  • @LocEngineer the OP asks for the correct thing. You *can't* use String.Compare in SQL. And yes, `COLLATE` is the correct option with SQL. No, LINQ to SQL didn't work. It loaded everything in memory then tried to filter the data *locally*. – Panagiotis Kanavos Nov 29 '18 at 12:02
  • @emert117 the only place I've found that uses a collation is [Entity SQL's ORDER BY](https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/order-by-entity-sql). You could [use raw SQL with SqlQuery](http://www.entityframeworktutorial.net/EntityFramework4.3/raw-sql-query-in-entity-framework.aspx) BUT!!! your query's performance will be bad. `LIKE '%...%'` has to scan the entire table without benefit of indexing. Even `WHERE Name COLLATE Latin1_General_CI_AI = '%ozgur%'` would have to perform an index scan as indexes are built using specific collations – Panagiotis Kanavos Nov 29 '18 at 12:06
  • @emert117 have you considered using a [full text search index](https://learn.microsoft.com/en-us/sql/relational-databases/search/full-text-search?view=sql-server-2017) to speed up searching for words anywhere in a field? With an FTS index you could write `WHERE CONTAINS(Name ,@thatName)` – Panagiotis Kanavos Nov 29 '18 at 12:08

3 Answers3

2

The only place I've found that uses a collation is Entity SQL's ORDER BY clause.

You could use SqlQuery as shown here to use a SQL string (with parameters of course) that uses the COLLATE clause :

var query = "select Name from Customers " +
           " where  Name COLLATE Turkish_CI_AI like @name";
var results = myContext.Customers
                       .SqlQuery(query,new SqlParameter("@name","%ozgur%"))
                       .ToList();

I'd advise caution though. LIKE '%...%' can't benefit from any indexes that cover the name field and will have to search the entire table. Even Name = @name COLLATE ... may not use any indexes the collation doesn't match the collation the index was built with.

You should consider using full text search indexes and make full text search queries for specific words, eg:

SELECT Name from Customers WHERE CONTAINS(Name ,@thatName)

Update

Another option is to use an interceptor to change the SQL generated by a clause, as shown in this SO question. That interceptor uses a regular expression to replace LIKE with CONTAINS. A simpler expression could be used to inject the COLLATE clause before LIKE

The code isn't trivial, but at least it's an option.

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • I have to use `LIKE '%...%'` at SQL DB side . I don't want to run a query on database server. I will try t o write a C# function for calculation of string similarity. If I can't find a good solution at application server side, I will use your advice for a SQL solution .Thanks for your answer. – emert117 Nov 29 '18 at 12:37
  • 1
    @emert117 you *are* running a query on the database in any case. LINQ to EF generates a SQL query that will run on the database. When you write `ctx.Customers.Select(c=>c.Name).ToList()` EF will execute `select name from customers` on the database. What you *don't* want to do is load every customer name from the database and try to filter them on the client. This may work for a few thousand records but not hundreds of thousands or millions of records – Panagiotis Kanavos Nov 29 '18 at 12:40
  • Sorry, I mentioned wrong. I use EF while putting data to cache. I want to search cache without running query on DB. – emert117 Nov 29 '18 at 13:10
0

Why don't you use following filer clause for unicode values

Converting the collation type will cause performance issues and prevent index usage

select Name
from Customers
where Name like N'%özgür%'
Eralper
  • 6,461
  • 2
  • 21
  • 27
  • I must use application server for search. I don't want to use database for this. So I can't use Full Text Search and SQLquery. I must solve this problem with C#. – emert117 Nov 29 '18 at 12:17
  • @Eralper although forgetting that `N` can cause problems, it won't fix *this* problem. If the *field's* collation is accent sensitive, the server won't match strings that have different accents, whether a Unicode literal is used or not. – Panagiotis Kanavos Nov 29 '18 at 12:27
  • @emert117 when you use EF you use the database, not an application server – Panagiotis Kanavos Nov 29 '18 at 12:27
0

You can solve this by using a normal query (via context.Database.SqlQuery) that returns an array of ids and use this ids in your linq statement.