Background
Added Later
I have made a pure Pascal function to find the position of a character in a Unicode string as follows:
function CharPosEx(const chChr: Char; const sStr: string;
  const iOffset: Integer=1): Integer;
var
  PStr   : PChar;
  PRunIdx: PChar;
  PEndIdx: PChar;
  iLenStr: Integer;
begin
  Result := 0;
  iLenStr := Length(sStr);
  if (iLenStr = 0) or (iOffset <= 0) or (iOffset > iLenStr) then Exit;
  PStr := Pointer(sStr);
  PEndIdx := @PStr[iLenStr - 1];
  PRunIdx := @PStr[iOffset - 1];
  repeat
    if PRunIdx^ = chChr then begin
      Result := PRunIdx - PStr + 1;
      Exit;
    end;
    Inc(PRunIdx);
  until PRunIdx > PEndIdx;
end;
I decided not to use the built-in StrUtils.PosEx() because I want to create a UTF16_CharPosEx function based on an optimized pure Pascal CharPosEx. I'm looking for a faster generic solution, like the pure Pascal approaches of the Fastcode Project.
The Original Statements
According to the accepted answer to the question "Delphi: fast Pos with 64-bit", the fastest pure Pascal function to find the position of a substring in a string is PosEx_Sha_Pas_2() of the Fastcode Project.
As for the fastest pure Pascal function to find the position of a character in a string, I noticed that the Fastcode Project has CharPos(), CharPosIEx(), and CharPosEY() for left-to-right matching, as well as CharPosRev() for right-to-left matching.
However, the problem is that all Fastcode functions were developed before Delphi 2009, which was the first Delphi release that supports Unicode.
I'm interested in CharPos() and CharPosEY(). I want to re-benchmark them because some optimization techniques are useless nowadays, such as the loop unrolling that was occasionally used in Fastcode functions.
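For reference, this is roughly what loop unrolling looks like when applied to a character scan. It is only a sketch for comparison, not code taken from the Fastcode Project; the name CharPosEx_Unrolled4 and the unroll factor of 4 are assumptions made for illustration.

```pascal
program CharPosUnrolledDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical name: a 4x-unrolled variant of a plain scan loop.
function CharPosEx_Unrolled4(const chChr: Char; const sStr: string;
  const iOffset: Integer = 1): Integer;
var
  PStr, PRun, PEnd: PChar;
begin
  Result := 0;
  if (iOffset <= 0) or (iOffset > Length(sStr)) then Exit;
  PStr := Pointer(sStr);
  PRun := PStr + (iOffset - 1);
  PEnd := PStr + Length(sStr);   // one past the last character
  // Main loop: test four characters per iteration.
  while PEnd - PRun >= 4 do
  begin
    if PRun[0] = chChr then begin Result := PRun - PStr + 1; Exit; end;
    if PRun[1] = chChr then begin Result := PRun - PStr + 2; Exit; end;
    if PRun[2] = chChr then begin Result := PRun - PStr + 3; Exit; end;
    if PRun[3] = chChr then begin Result := PRun - PStr + 4; Exit; end;
    Inc(PRun, 4);
  end;
  // Tail loop: the remaining 0..3 characters.
  while PRun < PEnd do
  begin
    if PRun^ = chChr then begin Result := PRun - PStr + 1; Exit; end;
    Inc(PRun);
  end;
end;

begin
  Writeln(CharPosEx_Unrolled4('c', 'abcabc'));     // 3
  Writeln(CharPosEx_Unrolled4('c', 'abcabc', 4));  // 6
  Writeln(CharPosEx_Unrolled4('z', 'abcabc'));     // 0
end.
```

Whether the unrolled form is actually faster than the simple loop on a current compiler and CPU is exactly what would have to be measured.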
However, I cannot recompile the benchmark project for each of the CharPos family challenges because I am using Delphi XE3, so I cannot determine which one is the fastest.
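Short of recompiling the original benchmark projects, a small stand-alone timing harness could give a first impression. The following is a minimal sketch, assuming TStopwatch from System.Diagnostics is available; the names TCharPosFunc, RtlPosEx and TimeIt are hypothetical, and the input size and iteration count are arbitrary. It does not reproduce the Fastcode benchmark methodology.

```pascal
program CharPosBenchSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.StrUtils, System.Diagnostics;

type
  // Common signature so that candidates can be swapped in and out.
  TCharPosFunc = function(const chChr: Char; const sStr: string;
    const iOffset: Integer): Integer;

// Baseline candidate: the RTL's PosEx behind the common signature.
function RtlPosEx(const chChr: Char; const sStr: string;
  const iOffset: Integer): Integer;
begin
  Result := PosEx(chChr, sStr, iOffset);
end;

// Calls Func Iterations times and returns the elapsed milliseconds.
function TimeIt(Func: TCharPosFunc; const sStr: string; chChr: Char;
  Iterations: Integer; out LastResult: Integer): Int64;
var
  SW: TStopwatch;
  i: Integer;
begin
  LastResult := 0;
  SW := TStopwatch.StartNew;
  for i := 1 to Iterations do
    LastResult := Func(chChr, sStr, 1);
  Result := SW.ElapsedMilliseconds;
end;

var
  s: string;
  r: Integer;
  ms: Int64;
begin
  // Worst case for a forward scan: the match is the last character.
  s := StringOfChar('a', 100000) + 'x';
  ms := TimeIt(RtlPosEx, s, 'x', 1000, r);
  Writeln(Format('RtlPosEx: result=%d, %d ms', [r, ms]));
end.
```

Any other candidate with the same signature can be passed to TimeIt in the same way. Results from such a harness depend heavily on string length, match position, and compiler settings, so several input shapes should be tried before drawing conclusions.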
Questions
Does anyone know, or can anyone determine, which is the fastest pure Pascal implementation for each of the mentioned Fastcode challenges, especially for CharPos() and CharPosEY()?
Approaches other than the Fastcode Project solutions are also welcome.
Notes
- The term "Unicode string" used here refers to a string whose type is UnicodeString, regardless of its encoding scheme.
- If the encoding scheme matters, what I mean is the fixed-width 16-bit encoding scheme (UCS-2).
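To make the fixed-width assumption concrete: a character-position function of this kind works on 16-bit Char elements (code units), so a character outside the Basic Multilingual Plane counts as two positions. The sketch below illustrates this, assuming 1-based string indexing as on the desktop compilers; IsHighSurrogateUnit is a hypothetical helper written for this example.

```pascal
program CodeUnitDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: True if C is the first half of a UTF-16 surrogate pair.
function IsHighSurrogateUnit(C: Char): Boolean;
begin
  Result := (Ord(C) >= $D800) and (Ord(C) <= $DBFF);
end;

var
  s: string;
begin
  // U+1F600 stored as the surrogate pair D83D DE00, between 'A' and 'B'.
  s := 'A' + #$D83D#$DE00 + 'B';
  Writeln(Length(s));                  // 4 code units for 3 code points
  Writeln(Pos('B', s));                // 4, an index in code units
  Writeln(IsHighSurrogateUnit(s[2]));  // TRUE
end.
```

Under the UCS-2 assumption stated above, such pairs are simply treated as two independent 16-bit values.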