It seems there is no Ansi overload for StrToInt. Is this right, or am I missing something? StrToInt insists on converting my AnsiStrings to string.
-
You could try the FastCoder's asm implementation. See http://fastcode.sourceforge.net/challenge_content/rtl_replcmnt_pkg.html and the FastCodeStrToInt32Unit.pas file. Change String type to AnsiString. – LU RD Jun 16 '14 at 17:28
-
shlwapi.StrToIntA does not seem to call StrToIntW or StrToIntExW. Can't know about the performance though.. – Sertac Akyuz Jun 16 '14 at 20:29
3 Answers
You are correct. There is no ANSI version of StrToInt. The place to find ANSI versions of standard functions is the AnsiStrings unit, and there's nothing there.
Either write your own function to do the job, or accept the conversion required to use StrToInt.
It's not too hard to write your own function. It might look like this:
uses
  SysUtils, // for EConvertError
  SysConst; // for SInvalidInteger
....
{$OVERFLOWCHECKS OFF}
{$RANGECHECKS OFF}
function AnsiStrToInt(const s: AnsiString): Integer;

  procedure Error;
  begin
    raise EConvertError.CreateResFmt(@SInvalidInteger, [s]);
  end;

var
  Index, Len, Digit: Integer;
  Negative: Boolean;
begin
  Index := 1;
  Result := 0;
  Negative := False;
  Len := Length(s);
  // skip leading spaces
  while (Index <= Len) and (s[Index] = ' ') do
    inc(Index);
  if Index > Len then
    Error;
  // optional sign, which must be followed by at least one digit
  case s[Index] of
    '-','+':
      begin
        Negative := s[Index] = '-';
        inc(Index);
        if Index > Len then
          Error;
      end;
  end;
  while Index <= Len do
  begin
    Digit := ord(s[Index]) - ord('0');
    if (Digit < 0) or (Digit > 9) then
      Error;
    // without this check the multiplication can wrap back to a
    // non-negative value and the overflow would go unnoticed
    if Result > High(Integer) div 10 then
      Error;
    Result := Result * 10 + Digit;
    if Result < 0 then
      Error;
    inc(Index);
  end;
  if Negative then
    Result := -Result;
end;
This is a cut-down version of that found in StrToInt. It does not handle hexadecimal and is a bit more stringent regarding errors. Before using this code I'd want to test whether or not this really is your bottleneck.
It is quite interesting that this code, based on that in the RTL source, is incapable of returning low(Integer). It's not too hard to fix that up, but it would make the code more complex.
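One possible way to cover low(Integer), shown here only as a sketch, is to accumulate the value as a non-positive number and negate at the end for positive input. The name AnsiStrToIntFull is made up for this illustration and is not an RTL function. The range test is done before the multiplication, so the sketch does not depend on overflow checking being off.

```pascal
program AnsiStrToIntFullDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, // for EConvertError
  SysConst; // for SInvalidInteger

// Hypothetical variant of AnsiStrToInt that can also return Low(Integer).
function AnsiStrToIntFull(const s: AnsiString): Integer;

  procedure Error;
  begin
    raise EConvertError.CreateResFmt(@SInvalidInteger, [s]);
  end;

var
  Index, Len, Digit: Integer;
  Negative: Boolean;
begin
  Index := 1;
  Result := 0;
  Negative := False;
  Len := Length(s);
  while (Index <= Len) and (s[Index] = ' ') do
    Inc(Index);
  if Index > Len then
    Error;
  if (s[Index] = '-') or (s[Index] = '+') then
  begin
    Negative := s[Index] = '-';
    Inc(Index);
    if Index > Len then
      Error;
  end;
  while Index <= Len do
  begin
    Digit := Ord(s[Index]) - Ord('0');
    if (Digit < 0) or (Digit > 9) then
      Error;
    // Result is kept <= 0. Reject the digit if Result * 10 - Digit
    // would fall below Low(Integer); div truncates toward zero.
    if Result < (Low(Integer) + Digit) div 10 then
      Error;
    Result := Result * 10 - Digit;
    Inc(Index);
  end;
  if not Negative then
  begin
    if Result = Low(Integer) then
      Error; // +2147483648 does not fit in an Integer
    Result := -Result;
  end;
end;

begin
  Writeln(AnsiStrToIntFull('-2147483648'));
  Writeln(AnsiStrToIntFull('2147483647'));
end.
```

The negative range of Integer is one larger than the positive range, which is why accumulating downwards can represent every valid input while accumulating upwards cannot.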

-
Damn it! I work with LARGE files. They are ANSI so I wanted to have my whole program work in 'ANSI mode' to save RAM (and CPU). The damn function will ruin everything. I could write my own function but it may not be as fast as Embarcadero's so I'd better stick with StrToInt then. – Gabriel Jun 16 '14 at 13:12
-
A call to StrToInt, converting one short string to UTF16 won't hurt. Where do LARGE files come into this? You cannot avoid UTF16 in modern Delphi. The RTL and VCL all use UTF16. What's more, what do you think happens when you pass an ANSI string to an ANSI Win32 function? Yup, it gets converted to UTF16. The overhead of a short lived temp string is not severe. – David Heffernan Jun 16 '14 at 13:15
-
I have 20GB+ ASCII text files with billions of entries. Converting up and down from/to string/ansistring will ruin performance. – Gabriel Jun 16 '14 at 13:17
-
Also, if you want to write your own converter then that is quite easy. Use the code from pre Unicode Delphi. And really, is the perf of your program determined by calls to StrToInt? Seems unlikely. – David Heffernan Jun 16 '14 at 13:18
-
Will perf be that bad? Surely your program is disk bound rather than CPU bound. Did you measure? – David Heffernan Jun 16 '14 at 13:19
-
The program is doing complex computations on each text entry. So the disk is not the biggest problem. IF I am parsing the file without computations, then yes, the disk is the bottleneck, but when I enable the computations for each entry then everything depends on CPU. – Gabriel Jun 16 '14 at 13:27
-
@Altar: But you're only doing the StrToInt conversion once, right? So it's not the bottleneck. Disk I/O is going to be a much greater bottleneck than your computations. – Ken White Jun 16 '14 at 13:44
-
One row has multiple numbers separated by ':'. I need to extract the numbers. Please remember that the file can have (fortunately, not always) up to 6.3 billion entries! So, I have to apply StrToInt multiple times on each entry. This is why I think I will stick with the ASM version of StrToInt (and accept the conversion from Ansi to String that it does). – Gabriel Jun 16 '14 at 13:50
-
Did you actually try to use a function like David wrote (I assume hex is not important for you, anyway?) and actually profile which was faster, or if and where there is a bottleneck? If your computations are so complicated that disk I/O is not the bottleneck anymore, ISTM that converting the string may not be a bottleneck either. – Rudy Velthuis Jun 16 '14 at 13:56
-
@Altar If you really do have a bottleneck here then I am prepared to bet that it will be on the heap allocations. Splitting lines on separators into separate values causes a boatload of heap allocations. They will dominate perf by at least an order of magnitude over the text to int conversion. I do have a strong suspicion that you are not on top of the true perf bottleneck. – David Heffernan Jun 16 '14 at 14:02
-
What you really need here is a function that takes a line as input and returns an array of values. Better still, you pass in a pre-allocated array, preferably on the stack if possible, and the function splits and populates in one go, without touching the heap. Remember that the heap is your friend when you want convenience, but your mortal enemy when you care about perf. – David Heffernan Jun 16 '14 at 14:04
-
It should be easy to set up a unit perf test in a loop to generate strings and run them through the new vs. old function and see exactly how much this affects performance. – Chris Thornton Jun 16 '14 at 14:06
-
> takes a line as input and returns an array of values - Unfortunately, I also have strings. The format is like: @xxx:77:88:99 xxx:9:0. But your idea is really good. I will try to see what I can do about it. Thanks a lot!!!!!!!!!! – Gabriel Jun 16 '14 at 14:07
-
If they are of that rigid form then you get the function to return a record with all the values in. Avoid heap allocation. But always, time, time, and time again. That's the most important. – David Heffernan Jun 16 '14 at 14:13
-
I created a new topic about format of the line: http://stackoverflow.com/questions/24245758/how-to-quickly-parse-an-ansi-atring – Gabriel Jun 16 '14 at 14:16
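The idea raised in the comments above, splitting a line into a pre-allocated array without touching the heap, might look roughly like the following. This is only a sketch under assumptions: ParseFields, TFieldArray and MaxFields are invented names, the input is taken to be unsigned decimal numbers separated by ':', and sign handling, overflow detection and the textual fields of the real file format are left out.

```pascal
program ParseFieldsDemo;
{$APPTYPE CONSOLE}

const
  MaxFields = 8; // assumed upper bound on numbers per line

type
  TFieldArray = array[0..MaxFields - 1] of Integer;

// Hypothetical helper: reads unsigned decimal numbers separated by ':'
// from a #0-terminated buffer into Fields and returns how many were
// stored. It stops at the first field that does not start with a digit.
// No overflow detection is done here.
function ParseFields(P: PAnsiChar; var Fields: TFieldArray): Integer;
var
  Value: Integer;
begin
  Result := 0;
  if P = nil then
    Exit;
  while (P^ <> #0) and (Result < MaxFields) do
  begin
    if not (P^ in ['0'..'9']) then
      Exit;
    Value := 0;
    while P^ in ['0'..'9'] do
    begin
      Value := Value * 10 + (Ord(P^) - Ord('0'));
      Inc(P);
    end;
    Fields[Result] := Value;
    Inc(Result);
    if P^ = ':' then
      Inc(P); // step over the separator
  end;
end;

var
  Line: AnsiString;
  Fields: TFieldArray; // lives in static or stack storage, not on the heap
  Count, I: Integer;
begin
  Line := '77:88:99';
  Count := ParseFields(PAnsiChar(Line), Fields);
  for I := 0 to Count - 1 do
    Writeln(Fields[I]);
end.
```

Because the parser walks the buffer in place, no substring is ever copied out, so there is no per-field allocation and no AnsiString to UnicodeString conversion. Whether that matters in practice is still something to time.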
The code is actually very simple (hex strings aren't supported, but you probably don't need them):
// Relies on overflow checking being off ({$Q-}).
function AnsiStrToInt(const S: RawByteString): Integer;
var
  P: PByte;
  Negative: Boolean;
  Digit: Integer;
begin
  P:= Pointer(S);
  // Pointer(S) is nil for an empty string, so it must not be dereferenced
  if P = nil then
    raise Exception.Create('No data');
  // skip leading spaces
  while (P^ = Ord(' ')) do Inc(P);
  Negative:= False;
  if (P^ = Ord('-')) then begin
    Negative:= True;
    Inc(P);
  end
  else if (P^ = Ord('+')) then Inc(P);
  if P^ = 0 then
    raise Exception.Create('No data');
  Result:= 0;
  repeat
    if Cardinal(Result) > Cardinal(High(Result) div 10) then
      raise Exception.Create('Integer overflow');
    Digit:= P^ - Ord('0');
    if (Digit < 0) or (Digit > 9) then
      raise Exception.Create('Invalid char');
    Result:= Result * 10 + Digit;
    if (Result < 0) then begin
      // the only negative value allowed here is Low(Integer), for '-2147483648'
      if not Negative or (Cardinal(Result) <> Cardinal(Low(Result))) then
        raise Exception.Create('Integer overflow');
    end;
    Inc(P);
  until (P^ = 0);
  if Negative then Result:= -Result;
end;

-
`Tmp < 0` is probably more efficient. Or assign `Result * 10 + Digit` directly to `Result` and check for `< 0`. Interestingly the entire approach means that the function is incapable of returning `low(Integer)`. I guess it would need to use different branches for +ve and -ve to be able to return `low(Integer)`. – David Heffernan Jun 16 '14 at 15:42
-
Nah, actually the second idea was a mistake, I deleted it; the counterexample is `9999999999`. – kludg Jun 16 '14 at 16:19
-
Are you sure? Because `if Negative then Result:= -Result;` looks like it cannot yield `low(Integer)`. That's because `abs(low(Integer)) = abs(high(Integer)) + 1` – David Heffernan Jun 16 '14 at 17:04
-
I'm not running any code here. Just reading it. More fun that way. So what integer value x is such that -x equals low(Integer)? What am I missing? – David Heffernan Jun 16 '14 at 17:13
-
Also, for the sake of full disclosure, the code should be compiled with the {$Q-} option, but this is the default, so no need to worry. – kludg Jun 16 '14 at 17:26
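If a project does enable overflow checking, a common idiom is to switch it off only around the routine that relies on wrap-around and then restore the previous state. The sketch below assumes a made-up conditional symbol RESTORE_Q and a made-up helper MulAdd10 that stands in for the `Result * 10 + Digit` step.

```pascal
program OverflowChecksDemo;
{$APPTYPE CONSOLE}

{$IFOPT Q+}
  {$DEFINE RESTORE_Q} // hypothetical symbol: remembers that Q+ was active
  {$Q-}
{$ENDIF}
// Hypothetical helper: with overflow checks off, the arithmetic wraps
// around instead of raising EIntOverflow.
function MulAdd10(Value, Digit: Integer): Integer;
begin
  Result := Value * 10 + Digit;
end;
{$IFDEF RESTORE_Q}
  {$UNDEF RESTORE_Q}
  {$Q+}
{$ENDIF}

begin
  // 214748364 * 10 + 8 is one more than High(Integer), so it wraps
  // around to Low(Integer).
  Writeln(MulAdd10(214748364, 8));
end.
```

This keeps the parsing routine correct regardless of the project-wide setting, without changing how the rest of the unit is compiled.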
I followed this tip:
How to convert AnsiString to UnicodeString in Delphi XE4
Example:
var
  a : AnsiString;
  b : String;
  c : Integer;
begin
  a := '123';
  b := String(a); // explicit AnsiString to UnicodeString conversion
  c := StrToInt(b);
end;

-
Not really an answer to what was discussed here. See David's answer for details. – Gabriel Sep 21 '21 at 08:38