During profiling I came across a function that was taking quite a bit of time, but essentially boiled down to this very simple piece of code:
function GetSubstring(AInput: PChar; AStart, ASubstringLength: Integer): string;
begin
Result := Copy(AInput, AStart, ASubstringLength);
end;
This function returns the expected sub-string, but it doesn't scale very well for longer inputs. I looked at the assembler code in CPU view, and from what I can tell (I'm not normally working at the assembler level), it seems that AInput
is implicitly converted into a string before calling Copy
.
But since at this point the length of the string/character array is unknown, the conversion code has to walk the length of the PChar
until it finds the null terminator. This would explain the terrible scaling for longer inputs.
However, since the caller passes in the length of the PChar
, I initially thought could just convert the method to use SetString
instead.
function GetSubstring(AInput: PChar; AStart, ASubstringLength: Integer): string;
begin
SetString(Result, AInput + AStart - 1, ASubstringLength);
end;
In addition to SetString
working zero-based (not one-based as Copy), there seem to be a number of other little things Copy
does in terms of validating its inputs, not all of which are documented (e.g. any start value less than 1 is changed to 1). So the naïve implementation above doesn't always work as the original one.
My goal is to replicate the Copy
routine as much as possible, as this function is part of a library and has been used extensively by my colleagues.
I'm wondering whether the following implementation accomplishes that, or whether I need to be aware of any other caveats of Copy
. NB: FLength
is the actual length of AInput
that comes from another part in the module this function is part of. I removed that other part for this example.
function GetSubstring(AInput: PChar; AStart, ASubstringLength: Integer): string;
begin
if (AInput = nil) then begin
Result := '';
end else begin
if (AStart < 1) then begin
AStart := 0;
end else begin
AStart := AStart - 1;
end;
if (ASubstringLength + AStart > FLength) then begin
ASubstringLength := FLength - AStart;
end;
SetString(Result, AInput + AStart, ASubstringLength);
end;
end;
I'm using Delphi 2006, but I assume this isn't much different in other versions of the product (at least the non-Unicode ones).