14

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "misc"-unit:

function cwLeftPad(aString:string; aCharCount:integer; aChar:char): string;
var
  i,vLength:integer;
begin
  Result := aString;
  vLength := Length(aString);
  for I := (vLength + 1) to aCharCount do    
    Result := aChar + Result;
end;

In the part of the program that I'm optimizing at the moment the method was called ~35k times, and it took a stunning 56% of the execution time!

It's easy to see that it's a horrible way to left-pad a string, so I replaced it with

function cwLeftPad(const aString:string; aCharCount:integer; aChar:char): string; 
begin
  Result := StringOfChar(aChar, aCharCount-length(aString))+aString;
end;

which gave a significant boost. Total running time went from 10,2 sec to 5,4 sec. Awesome! But, cwLeftPad still accounts for about 13% of the total running time. Is there an easy way to optimize this method further?

Svein Bringsli
  • 5,640
  • 7
  • 41
  • 73
  • Do you have any data about how much of the time is spent in each of the RTL functions involved in your function? Say, what percentage is spent allocating memory and what is spent copying characters? – Rob Kennedy Nov 05 '09 at 09:57
  • Are you working on D2009 or later, i.e. are you working with string=ansistring or with unicode strings? – PhiS Nov 05 '09 at 11:08
  • What is typical input for this function? If you have a limited set of real-world inputs, then the algorithm can be tweaked in a way that might be slower for the general case, but will be faster for you. Wodzu has an extreme example. – JosephStyons Nov 05 '09 at 13:58
  • Typically strings from 5-15 chars are padded to 20-50 chars. – Svein Bringsli Nov 05 '09 at 14:10
  • Is padding char different each time, is it one padding char or many, f.e: '.', 0', '#'. – Wodzu Nov 05 '09 at 14:29
  • Mainly spaces and zeros, but it varies. Anyway, it's fast enough now. It's down to 2% of the total running time, so I'll spend my time elsewhere :-) – Svein Bringsli Nov 05 '09 at 14:46

9 Answers9

13

Your new function involves three strings, the input, the result from StringOfChar, and the function result. One of them gets destroyed when your function returns. You could do it in two, with nothing getting destroyed or re-allocated.

  1. Allocate a string of the total required length.
  2. Fill the first portion of it with your padding character.
  3. Fill the rest of it with the input string.

Here's an example:

function cwLeftPad(const aString: AnsiString; aCharCount: Integer; aChar: AnsiChar): AnsiString;
var
  PadCount: Integer;
begin
  PadCount := ACharCount - Length(AString);
  if PadCount > 0 then begin
    SetLength(Result, ACharCount);
    FillChar(Result[1], PadCount, AChar);
    Move(AString[1], Result[PadCount + 1], Length(AString));
  end else
    Result := AString;
end;

I don't know whether Delphi 2009 and later provide a double-byte Char-based equivalent of FillChar, and if they do, I don't know what it's called, so I have changed the signature of the function to explicitly use AnsiString. If you need WideString or UnicodeString, you'll have to find the FillChar replacement that handles two-byte characters. (FillChar has a confusing name as of Delphi 2009 since it doesn't handle full-sized Char values.)

Another thing to consider is whether you really need to call that function so often in the first place. The fastest code is the code that never runs.

Rob Kennedy
  • 161,384
  • 21
  • 275
  • 467
  • Afaik D2009 doesn't. FPC provides fillword/dword/qword – Marco van de Voort Nov 05 '09 at 14:14
  • Making it a VAR procedure instead of a function might make it slightly faster (if the string has refcount 1 and is allocated, and can be enlarged/made smaller, the string allocation is cheaper). At the cost of a bit of easy of use maybe. – Marco van de Voort Nov 05 '09 at 21:57
  • 1
    Marco, functions returning strings are turned into var procedures by the compiler anyway. (Refer to the many reports from puzzled developers where Result holds the value from a previous call instead of being an empty string like ordinary local variables would be.) – Rob Kennedy Nov 05 '09 at 23:06
  • In Delphi 2009, FillChar won't work. It expects a byte count and expects the fill character to be a single byte character and will fill each byte with it. The Delphi 2009 help for FillChar suggest using StringOfChar instead which is in the System unit and is written in assembler so it has obviously been optimized and should do the trick. – lkessler Nov 06 '09 at 04:01
  • FillChar will work fine *for my function* in all versions of Delphi because, as I noted, my function uses AnsiString. For UnicodeString, find a FillWord or FillWideChar function; there's one in JclWideFormat.pas, for example. – Rob Kennedy Nov 06 '09 at 06:26
  • Rob Kennedy: I missed your answer back then, but the implicit VAR might work differently with reference counting types like string. (e.g. dec refcount before passing to the implicit VAR argument) – Marco van de Voort Apr 30 '14 at 08:12
6

Another thought - if this is Delphi 2009 or 2010, disable "String format checking" in Project, Options, Delphi Compiler, Compiling, Code Generation.

gabr
  • 26,580
  • 9
  • 75
  • 141
4

StringOfChar is very fast and I doubt you can improve this code a lot. Still, try this one, maybe it's faster:

function cwLeftPad(aString:string; aCharCount:integer; aChar:char): string;
var
  i,vLength:integer;
  origSize: integer;
begin
  Result := aString;
  origSize := Length(Result);
  if aCharCount <= origSize then
    Exit;
  SetLength(Result, aCharCount);
  Move(Result[1], Result[aCharCount-origSize+1], origSize * SizeOf(char));
  for i := 1 to aCharCount - origSize do
    Result[i] := aChar;
end;

EDIT: I did some testing and my function is slower than your improved cwLeftPad. But I found something else - there's no way your CPU needs 5 seconds to execute 35k cwLeftPad functions except if you're running on PC XT or formatting gigabyte strings.

I tested with this simple code

for i := 1 to 35000 do begin
  a := 'abcd1234';
  b := cwLeftPad(a, 73, '.');
end;

and I got 255 milliseconds for your original cwLeftPad, 8 milliseconds for your improved cwLeftPad and 16 milliseconds for my version.

gabr
  • 26,580
  • 9
  • 75
  • 141
  • **Total** running time was 5.4 seconds. The string-padding function was 13% of that. That's 0.7 seconds, though, which is still pretty high if you're seeing 0.008. – Rob Kennedy Nov 05 '09 at 10:06
  • Probably the 8ms was the accomulated time of all cwLeftPad calls in the execution time – Runner Nov 05 '09 at 10:24
  • 8 ms is 35.000 string assignments (from a constant - very fast, I presume) and 35.000 cwLeftPad calls. – gabr Nov 05 '09 at 10:47
  • gabr, I experienced the same thing as you when I made a small test project. The strings where even padded to a shorter length (25 chars), which would make the two methods more equal. I'm starting to believe the profiler is fooling around with me :-) One thing that might clarify things is that the numbers in the question was from a debug version, where I habitually turn off code generation optimization. When I repeat the tests with optimization on the old method takes about 20% of the total running time, while the new version takes just slightly over 2% of the total time. – Svein Bringsli Nov 05 '09 at 13:13
  • sveinbringsli: Big Warning! Do not trust AQTime for microoptimization. See: http://stackoverflow.com/questions/332948/why-is-charinset-faster-than-case-statement – lkessler Nov 06 '09 at 04:06
2

You call StringOfChar every time now. Of course this method checks if it has something to do and jumps out if length is small enough, but maybe the call to StringOfChar is time consuming, because internally it does another call before jumping out.

So my first idea would be to jump out by myself if there is nothing to do:

function cwLeftPad(const aString: string; aCharCount: Integer; aChar: Char;): string;
var
  l_restLength: Integer;
begin
  Result  := aString;
  l_restLength := aCharCount - Length(aString);
  if (l_restLength < 1) then
    exit;

  Result := StringOfChar(aChar, l_restLength) + aString;
end;
Sebastian P.R. Gingter
  • 5,955
  • 3
  • 31
  • 73
  • You can get around the overhead of the call by using the Inline Directive on a copy of the StringOfChar routine from the System unit. Or you if you know a little assembler, you can insert the assembler directly into the cwLeftPad function yourself, without the overhead of the PUSH and POP statements. – lkessler Nov 06 '09 at 04:17
2

You can speed up this routine even more by using lookup array.

Of course it depends on your requirements. If you don't mind wasting some memory... I guess that the function is called 35 k times but it has not 35000 different padding lengths and many different chars.

So if you know (or you are able to estimate in some quick way) the range of paddings and the padding chars you could build an two-dimensional array which include those parameters. For the sake of simplicity I assume that you have 10 different padding lengths and you are padding with one character - '.', so in example it will be one-dimensional array.

You implement it like this:

type
  TPaddingArray = array of String;

var
  PaddingArray: TPaddingArray;
  TestString: String;

function cwLeftPad4(const aString:string; const aCharCount:integer; const aChar:char; var anArray: TPaddingArray ): string;
begin
  Result := anArray[aCharCount-length(aString)] + aString;
end;

begin
  //fill up the array
  SetLength(StrArray, 10);
  PaddingArray[0] := '';
  PaddingArray[1] := '.';
  PaddingArray[2] := '..';
  PaddingArray[3] := '...';
  PaddingArray[4] := '....';
  PaddingArray[5] := '.....';
  PaddingArray[6] := '......';
  PaddingArray[7] := '.......';
  PaddingArray[8] := '........';
  PaddingArray[9] := '.........';

  //and you call it..
  TestString := cwLeftPad4('Some string', 20, '.', PaddingArray);
end;

Here are benchmark results:

Time1 - oryginal cwLeftPad          : 27,0043604142394 ms.
Time2 - your modyfication cwLeftPad : 9,25971967336897 ms.
Time3 - Rob Kennedy's version       : 7,64538131122457 ms.
Time4 - cwLeftPad4                  : 6,6417059620664 ms.

Updated benchmarks:

Time1 - oryginal cwLeftPad          : 26,8360194218451 ms.
Time2 - your modyfication cwLeftPad : 9,69653117046119 ms.
Time3 - Rob Kennedy's version       : 7,71149259179622 ms.
Time4 - cwLeftPad4                  : 6,58248533610693 ms.
Time5 - JosephStyons's version      : 8,76641780969192 ms.

The question is: is it worth the hassle?;-)

Dima Zorin
  • 80
  • 2
  • 8
Wodzu
  • 6,932
  • 10
  • 65
  • 105
  • What if I want to pad with zeros, rather than dots? :-) – Svein Bringsli Nov 05 '09 at 13:41
  • As I said in my answer, if you know which char/chars you are padding you build specific array for it. Do you need more elaborate example which allows multiple characters? :) – Wodzu Nov 05 '09 at 13:42
  • 1
    You are right, and I apologize. I didn't read your introduction well enough, just the code. But anyway, why then did you leave the aChar-parameter in the function? :-) – Svein Bringsli Nov 05 '09 at 13:46
  • Ah! Thanks @sveinbringsli I haven't noticed that :) – Wodzu Nov 05 '09 at 13:48
  • Just for info: Factually this is not thread safe function and approach. So I vote for Rob's answer even this method is potentially safe. 1ms of speedup is not important. Also there a lack of any input args check and unsafe access to array. – Dima Zorin Jan 09 '13 at 14:07
  • Thanks for your comment. I had a hard time remembering when I was posting this answer! I find some of your points valid. Thread safety can be easy acquired by declaring a constant PaddingArray. As of 1ms I think it might have meaning especially for 35k runs, it will give a gain of 3.5 second. Of course if my benchmark is not flawed. I do not remember how I've measured those times and I should have post the benchmark code to the public. – Wodzu Jan 10 '13 at 19:36
1

It's possible that it may be quicker to use StringOfChar to allocate an entirely new string the length of string and padding and then use move to copy the existing text over the back of it.
My thinking is that you create two new strings above (one with FillChar and one with the plus). This requires two memory allocates and constructions of the string pseudo-object. This will be slow. It may be quicker to waste a few CPU cycles doing some redundant filling to avoid the extra memory operations.
It may be even quicker if you allocated the memory space then did a FillChar and a Move, but the extra fn call may slow that down.
These things are often trial-and-error!

sinibar
  • 81
  • 5
  • There's no "extra function call"; StringOfChar calls FillChar anyway. – Rob Kennedy Nov 05 '09 at 09:54
  • 1
    Fair enough! So SetLength(), Fillchar(left hand side), Move(right hand side) should be even quicker. TBH it's been a few years since I programmed Delphi and I don't remember the StringOfChar fn at all. I notice now BTW that the initial string is passed in by value. IIRC (and I may not) in Delphi this means that it's cloned. It might be worth passing this in by reference. The coding standards people may feel disposed to beat you to death for it, but it should be quicker. – sinibar Nov 05 '09 at 10:28
  • @sinibar - pass by ref: Yes, aString should be passed as const. Otherwise there's unnecessary reference count management (however no cloning). – Uli Gerhardt Nov 05 '09 at 10:45
1

You can get dramatically better performance if you pre-allocate the string.

function cwLeftPadMine
{$IFDEF VER210}  //delphi 2010
(aString: ansistring; aCharCount: integer; aChar: ansichar): ansistring;
{$ELSE}
(aString: string; aCharCount: integer; aChar: char): string;
{$ENDIF}
var
  i,n,padCount: integer;
begin
  padCount := aCharCount - Length(aString);

  if padCount > 0 then begin
    //go ahead and set Result to what it's final length will be
    SetLength(Result,aCharCount);
    //pre-fill with our pad character
    FillChar(Result[1],aCharCount,aChar);

    //begin after the padding should stop, and restore the original to the end
    n := 1;
    for i := padCount+1 to aCharCount do begin
      Result[i] := aString[n];
    end;
  end
  else begin
    Result := aString;
  end;
end;

And here is a template that is useful for doing comparisons:

procedure TForm1.btnPadTestClick(Sender: TObject);
const
  c_EvalCount = 5000;  //how many times will we run the test?
  c_PadHowMany = 1000;  //how many characters will we pad
  c_PadChar = 'x';  //what is our pad character?
var
  startTime, endTime, freq: Int64;
  i: integer;
  secondsTaken: double;
  padIt: string;
begin
  //store the input locally
  padIt := edtPadInput.Text;

  //display the results on the screen for reference
  //(but we aren't testing performance, yet)
  edtPadOutput.Text := cwLeftPad(padIt,c_PadHowMany,c_PadChar);

  //get the frequency interval of the OS timer    
  QueryPerformanceFrequency(freq);

  //get the time before our test begins
  QueryPerformanceCounter(startTime);

  //repeat the test as many times as we like
  for i := 0 to c_EvalCount - 1 do begin
    cwLeftPad(padIt,c_PadHowMany,c_PadChar);
  end;

  //get the time after the tests are done
  QueryPerformanceCounter(endTime);

  //translate internal time to # of seconds and display evals / second
  secondsTaken := (endTime - startTime) / freq;
  if secondsTaken > 0 then begin
    ShowMessage('Eval/sec = ' + FormatFloat('#,###,###,###,##0',
      (c_EvalCount/secondsTaken)));
  end
  else begin
    ShowMessage('No time has passed');
  end;
end;

Using that benchmark template, I get the following results:

The original: 5,000 / second
Your first revision: 2.4 million / second
My version: 3.9 million / second
Rob Kennedy's version: 3.9 million / second
JosephStyons
  • 57,317
  • 63
  • 160
  • 234
  • Yep, I do something like that now. Very similar to Rob's answer (which I had already accepted when I saw your answer) – Svein Bringsli Nov 05 '09 at 13:37
  • @JosephStyons Dramatically compared to which version? See my benchmark tests. – Wodzu Nov 05 '09 at 13:39
  • @Wodzu, dramatically compared to his original post. Pre-caching results as you do in your example will undoubtedly be faster.. as you said, though, "is it worth it". – JosephStyons Nov 05 '09 at 13:57
1

This is my solution. I use StringOfChar instead of FillChar because it can handle unicode strings/characters:

function PadLeft(const Str: string; Ch: Char; Count: Integer): string;
begin
  if Length(Str) < Count then
  begin
    Result := StringOfChar(Ch, Count);
    Move(Str[1], Result[Count - Length(Str) + 1], Length(Str) * SizeOf(Char));
  end
  else Result := Str;
end;

function PadRight(const Str: string; Ch: Char; Count: Integer): string;
begin
  if Length(Str) < Count then
  begin
    Result := StringOfChar(Ch, Count);
    Move(Str[1], Result[1], Length(Str) * SizeOf(Char));
  end
  else Result := Str;
end;
0

It's a bit faster if you store the length of the original string in a variable:

function PadLeft(const Str: string; Ch: Char; Count: Integer): string;
var
  Len: Integer;
begin
  Len := Length(Str);
  if Len < Count then
  begin
    Result := StringOfChar(Ch, Count);
    Move(Str[1], Result[Count - Len + 1], Len * SizeOf(Char));
  end
  else Result := Str;
end;

function PadRight(const Str: string; Ch: Char; Count: Integer): string;
var
  Len: Integer;
begin
  Len := Length(Str);
  if Len < Count then
  begin
    Result := StringOfChar(Ch, Count);
    Move(Str[1], Result[1], Len * SizeOf(Char));
  end
  else Result := Str;
end;