21

I found a Windows API function that performs "natural comparison" of strings. It is defined as follows:

int StrCmpLogicalW(
    LPCWSTR psz1,
    LPCWSTR psz2
);

To use it in Delphi, I declared it this way:

interface
  function StrCmpLogicalW(psz1, psz2: PWideChar): integer; stdcall;

implementation
  function StrCmpLogicalW; external 'shlwapi.dll' name 'StrCmpLogicalW';

Because it compares Unicode strings, I'm not sure how to call it when I want to compare ANSI strings. Casting the strings to WideString and then to PWideChar seems to be enough, but I have no idea whether this approach is correct:

function AnsiNaturalCompareText(const S1, S2: string): integer;
begin
  Result := StrCmpLogicalW(PWideChar(WideString(S1)), PWideChar(WideString(S2)));
end;

I know very little about character encoding, which is the reason for my question. Is this function OK, or should I first convert both of the compared strings somehow?

Peter Mortensen
Mariusz Schimke

4 Answers

11

Keep in mind that casting a string to a WideString converts it using the default system code page, which may or may not be what you need. Typically, you'd want to use the current user's locale.

From WCharFromChar in System.pas:

Result := MultiByteToWideChar(DefaultSystemCodePage, 0, CharSource, SrcBytes,
  WCharDest, DestChars);

You can change DefaultSystemCodePage by calling SetMultiByteConversionCodePage.
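If you would rather not depend on the global default, the conversion can also be done explicitly. The following is only a minimal sketch, assuming an ANSI-string Delphi on Windows: AnsiToWide is a hypothetical helper name, and passing CP_ACP in the demo is just one possible choice of code page. It performs the same MultiByteToWideChar call as the excerpt above, but with a code page you choose yourself.

```delphi
program NaturalCompareDemo;

{$APPTYPE CONSOLE}

uses
  Windows;

function StrCmpLogicalW(psz1, psz2: PWideChar): Integer; stdcall;
  external 'shlwapi.dll' name 'StrCmpLogicalW';

// Hypothetical helper: converts ANSI text to UTF-16 using an explicit code page.
function AnsiToWide(const S: AnsiString; CodePage: UINT): WideString;
var
  Len: Integer;
begin
  Result := '';
  if S = '' then
    Exit;
  // First call: ask how many WideChars the converted text needs.
  Len := MultiByteToWideChar(CodePage, 0, PAnsiChar(S), Length(S), nil, 0);
  if Len <= 0 then
    Exit; // conversion failed; return an empty string
  SetLength(Result, Len);
  // Second call: perform the actual conversion into the allocated buffer.
  MultiByteToWideChar(CodePage, 0, PAnsiChar(S), Length(S),
    PWideChar(Result), Len);
end;

function AnsiNaturalCompareText(const S1, S2: AnsiString): Integer;
var
  W1, W2: WideString;
begin
  // CP_ACP is an assumption here; pass another code page if your data needs it.
  W1 := AnsiToWide(S1, CP_ACP);
  W2 := AnsiToWide(S2, CP_ACP);
  Result := StrCmpLogicalW(PWideChar(W1), PWideChar(W2));
end;

begin
  // Natural order places 'file2' before 'file10', so a negative value is expected.
  Writeln(AnsiNaturalCompareText('file2', 'file10'));
end.
```

Keeping the code page as a parameter makes the choice visible at the call site instead of hiding it in a process-wide setting.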

NGLN
gabr
  • To be honest, I wanted to create a function that would be a counterpart of the AnsiCompareText() (of course this one performs a lexical comparison of ANSI strings). Does it use current user's locale or default system codepage? Thanks. – Mariusz Schimke Jun 22 '09 at 15:11
  • Use the source: Result := CompareString(LOCALE_USER_DEFAULT, NORM_IGNORECASE, PChar(S1), Length(S1), PChar(S2), Length(S2)) - CSTR_EQUAL; And from MSDN: LOCALE_USER_DEFAULT is "The current user's default locale." So no, it doesn't use the current locale but the default locale of the current user. Which again may or may not be what you really want :-/ – gabr Jun 22 '09 at 17:26
5

An easier way to accomplish the task would be to declare your function as:

interface
   function StrCmpLogicalW(const sz1, sz2: WideString): Integer; stdcall;

implementation
   function StrCmpLogicalW; external 'shlwapi.dll' name 'StrCmpLogicalW';

This works because a WideString variable is a pointer to a WideChar (in the same way that an AnsiString variable is a pointer to an AnsiChar), so Delphi will automatically "up-convert" an AnsiString to a WideString for you.

Update

And since we're now in the world of UnicodeString, you would make it:

interface
   function StrCmpLogicalW(const sz1, sz2: UnicodeString): Integer; stdcall;

implementation
   function StrCmpLogicalW; external 'shlwapi.dll' name 'StrCmpLogicalW';

This works because a UnicodeString variable is still a pointer to a \0\0-terminated string of WideChars, so you can call:

var
    s1, s2: AnsiString;
    nCompare: Integer;
begin
    s1 := 'Hello';
    s2 := 'world';

    nCompare := StrCmpLogicalW(s1, s2);
end;

When you pass an AnsiString to a function that takes a UnicodeString, the compiler automatically calls MultiByteToWideChar for you in the generated code.

CompareString supports numeric sorting in Windows 7

Starting in Windows 7, Microsoft added SORT_DIGITSASNUMBERS to CompareString:

Windows 7: Treat digits as numbers during sorting, for example, sort "2" before "10".
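As a rough sketch of how that flag might be used from Delphi (assuming Windows 7 or later; NaturalCompareText is a hypothetical helper name, and the constant is declared locally in case the Windows unit in use predates it):

```delphi
program DigitsAsNumbersDemo;

{$APPTYPE CONSOLE}

uses
  Windows;

const
  // Value from the Windows SDK headers; declared here because older
  // versions of the Windows unit may not define it.
  SORT_DIGITSASNUMBERS = $00000008;

// Hypothetical helper: case-insensitive comparison that treats digit runs
// as numbers, using the current user's default locale.
function NaturalCompareText(const S1, S2: WideString): Integer;
begin
  // CompareStringW returns CSTR_LESS_THAN (1), CSTR_EQUAL (2) or
  // CSTR_GREATER_THAN (3); subtracting CSTR_EQUAL maps that to -1, 0 or 1.
  // On failure (for example, if the flag is not supported) it returns 0,
  // which would surface here as -2.
  Result := CompareStringW(LOCALE_USER_DEFAULT,
    NORM_IGNORECASE or SORT_DIGITSASNUMBERS,
    PWideChar(S1), Length(S1),
    PWideChar(S2), Length(S2)) - CSTR_EQUAL;
end;

begin
  // With digits treated as numbers, '2' should sort before '10',
  // so a negative value is expected on Windows 7 and later.
  Writeln(NaturalCompareText('2', '10'));
end.
```

Unlike StrCmpLogicalW, this route lets you choose the locale and the other comparison flags yourself.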

None of this helps answer the actual question, which deals with when you have to convert or cast strings.

Ian Boyd
3

There might be an ANSI variant of your function too (I haven't checked). Most Wide APIs are available in an ANSI version as well; just change the W suffix to an A, and you're set. Windows does the back-and-forth conversion transparently for you in that case.

PS: Here's an article describing the lack of StrCmpLogicalA: http://blogs.msdn.com/joshpoley/archive/2008/04/28/strcmplogicala.aspx

PatrickvL
  • Any `A` version of any Windows API simply will convert the "`AnsiString`" to a "`WideString`" for you, then call the "Wide" API. The author should have just performed the same `MultiByteToWideChar`. But in Delphi it's simple enough to cast a `string` to a `WideString` and have the compiler call it for you. – Ian Boyd May 03 '12 at 18:25
2

Use System.StringToOleStr, which is a handy wrapper around MultiByteToWideChar (see gabr's answer):

function AnsiNaturalCompareText(const S1, S2: string): integer;   
var
  W1: PWideChar;
  W2: PWideChar;
begin
  W1 := StringToOleStr(S1);
  W2 := StringToOleStr(S2);
  Result := StrCmpLogicalW(W1, W2);
  SysFreeString(W1);
  SysFreeString(W2);
end;

But then, Ian Boyd's solution looks and is much nicer!

Community
NGLN
  • Be aware that you just leaked two wide strings there. `StringToOleStr` returns a `PWideChar`, which you need to free with a call to [`CoTaskMemFree`](http://msdn.microsoft.com/en-us/library/windows/desktop/ms680722(v=vs.85).aspx) (or one of its [legacy](http://blogs.msdn.com/b/oldnewthing/archive/2004/07/05/173226.aspx) [moral equivalents](http://msdn.microsoft.com/en-us/library/windows/desktop/ms678425(v=vs.85).aspx)) – Ian Boyd May 03 '12 at 18:29
  • Oh, and someone (with enough patience to create a BDN login) should probably edit the doc wiki entry for [`StringToOleStr`](http://docwiki.embarcadero.com/Libraries/en/System.StringToOleStr) to say something like *"**Remarks:** You can free strings created with `StringToOleStr` using [`SysFreeString`](http://msdn.microsoft.com/en-us/library/windows/desktop/ms221458(v=vs.85).aspx)"*. That way the documentation documents the correct way to use an API. – Ian Boyd May 04 '12 at 13:33
  • Sorry, but am I missing something here? Aren't the W1 and W2 variables supposed to be passed to the StrCmpLogicalW function, rather than repeating the string conversions in the function call? – Neville Cook Mar 28 '13 at 23:05