I've stumbled across some interesting behaviour when working with some data from China. For whatever reason, they have records where a traditional open bracket (U+0028) is mixed up with a "full width" open bracket (U+FF08). Now, C# knows they are not equal by testing:
char a = Convert.ToChar(0x0028);
char b = Convert.ToChar(0xFF08);
string g = Convert.ToString(Convert.ToChar(0x0028));
string h = Convert.ToString(Convert.ToChar(0xFF08));
if (a == b) throw new Exception();
if (g == h) throw new Exception();
Excel also knows they are not equivalent, but SQL 2008 thinks they are equivalent:
--should be false and return nothing
select 1 where N'(' = N'('
--is false and returns nothing
select 1 where N'租' = N'('
--should be false and return nothing
select 1 where NCHAR(0x0028) = NCHAR(0xFF08)
So, I think I'm missing something here, but is there a Unicode specific equality operation that would either explicitly support this equality (as opposed to logically being not equal). I'd be interested too as to how other languages treat this behaviour, as I'm using Qlik with this data, and Qlik's internal engine recognizes the two strings as being equivalent when searching, but not when loading data (yay consistency!)
I'd also guess the same behaviour exists for the closing brackets, and other punctuation, but if there's a specific SQL equality operator (or in other languages), that'd solve this problem regardless.
There is a full list of punctuation characters in Wikipedia here, and a similar issue of accented character equality
From @ZLK, the point with SQL is collation. Using code from here SQL Server 2008 and Unicode Character comparison looks like all _BIN, _BIN2 and _WS (Width-Sensitive!) collations work as expected.