2

I'm having trouble doing case insensitive string comparison using code first against an Oracle db. Code looks something like this;

String filter = "Ali";
var employee = dbContext.Employees.Where(x => x.Name.Contains(filter)).FirstOrDefault();

The code above acts to be case sensitive. So I converted both the Name and filter to Uppercase;

String filter = "Ali";
filter = filter.ToUpper();
var employee = dbContext.Employees.Where(x => x.Name.ToUpper().Contains(filter)).FirstOrDefault();

Everything seemed to work at first, but then I realized it's not working when the employee's name or the filter contains the character 'i'. The problem is how the letter i works in Turkish.

In most languages, 'i' stands for the lowercase, and 'I' stands for the uppercase version of the character. However in Turkish, 'i's uppercase is 'İ', and 'I's lowercase is 'ı'. Which is a problem as Oracle uppercases the letter 'i' in the db as 'I'.

We do not have access to the db's character encoding settings as its effects cannot be foreseen easily.

What I've come up with is this, and it is very ugly.

String filterInvariant = filter.ToUpper(CultureInfo.InvariantCulture);
String filterTurkish = filter.ToUpper(CultureInfo.CreateSpecificCulture("tr-TR"));
var employee = dbContext.Employees.Where(x => x.Name.ToUpper().Contains(filterInvariant) || x.Name.ToUpper().Contains(filterTurkish)).FirstOrDefault();

It seems to fix some of the issues, but feels like a brute force workaround rather than a solid solution. What are the best practices, or alternatives to this workaround, while using Code First C# against an Oracle database?

Thanks in advance

tdgtyugdyugdrugdr
  • 786
  • 4
  • 13
  • 31

3 Answers3

5

Ditch using all the UPPER functions. Simply let Oracle do your language aware case-insensitive matching. This is done by setting your DB connection from C# to have the appropriate language parameters. This setting is just for your DB session, not a global change for the whole DB. I'm no C# wizard, so you'd have to figure out where to make these session settings in your db connection/pool code.

ALTER SESSION SET nls_language=TURKISH;
ALTER SESSION SET nls_comp=LINGUISTIC;
ALTER SESSION SET nls_sort=BINARY_CI;

If C# proves too difficult to find where to change this, you could set this up as a user/schema logon trigger (below), which sets these automatically for you at db connect time (replace SOMEUSER with your actual db username). This only affects any NEW db sessions, so if you have connections pooled, you'll want to cycle the DB connection pool to refresh the connections.

CREATE OR REPLACE TRIGGER SOMEUSER.SET_NLS_CASE_INSENSITIVE_TRG AFTER
                   LOGON ON SOMEUSER.SCHEMA
  BEGIN
    EXECUTE IMMEDIATE 'ALTER SESSION SET nls_language=TURKISH';
    EXECUTE IMMEDIATE 'ALTER SESSION SET nls_comp=LINGUISTIC';
    EXECUTE IMMEDIATE 'ALTER SESSION SET nls_sort=BINARY_CI';
  END;
/

Here's a little test I did in an Oracle DB:

CREATE TABLE mypeople (name VARCHAR2(10 CHAR));

INSERT INTO mypeople VALUES ('Alİ Hassan');
INSERT INTO mypeople VALUES ('AlI Hassan');
INSERT INTO mypeople VALUES ('Ali Hassan');
INSERT INTO mypeople VALUES ('Alı Hassan');

SELECT name FROM mypeople WHERE name LIKE 'Ali%';

NAME
----------
Ali Hassan

ALTER SESSION SET nls_language=TURKISH;
ALTER SESSION SET nls_comp=LINGUISTIC;
ALTER SESSION SET nls_sort=BINARY_CI;

SELECT name FROM mypeople WHERE name LIKE 'Ali%';

NAME
----------
Alİ Hassan
AlI Hassan
Ali Hassan
Joshua Huber
  • 3,443
  • 20
  • 27
1

The implementation of String.Contains is different for different providers, for example Linq2Sql is always case insensitive. The search is case sensitive or not depends on server settings. For example SQL Server by default has SQL_Latin1_General_CP1_CI_AS Collation and that is NOT case sensitive. For Oracle you can change this behavior at the session level: Case insensitive searching in Oracle (Issue a raw SQL query using context.Database.ExecuteSqlCommand method at the beginning of the session)

Community
  • 1
  • 1
VahidN
  • 18,457
  • 8
  • 73
  • 117
1

The problem is in the database, not in .NET, for example this query:

FILES.Where(t => t.FILE_NAME.ToUpper() == "FILE.TXT") // Get rows from file-table

translates into this Oracle SQL with the oracle provider I have:

SELECT t0.BINARY_FILE, t0.FILE_NAME, t0.FILE_SIZE, t0.INFO, t0.UPLOAD_DATE
FROM FILES t0
WHERE (UPPER(t0.FILE_NAME) = :p0)
-- p0 = [FILE.TXT]

The contains with First() becomes this:

SELECT * FROM (SELECT t0.BINARY_FILE, t0.FILE_NAME, t0.FILE_SIZE, t0.INFO, t0.UPLOAD_DATE
            FROM FILES t0
            WHERE ((UPPER(t0.FILE_NAME) LIKE '%' || :p0 || '%') 
                OR (UPPER(t0.FILE_NAME) LIKE '%' || :p1 || '%'))) 
WHERE ROWNUM<=1
-- p0 = [FILE.TXT]
-- p1 = [FİLE.TXT]

So it depends on your database's culture settings, ie without knowing them I would say the "overlap" with your solution is the best way to solve it. Why can't you just check the database culture settings?

flindeberg
  • 4,887
  • 1
  • 24
  • 37