I maintain a Delphi 7 application whose server part can also be compiled with CrossKylix. For performance reasons, I am benchmarking multithreading and critical section usage.
I wrote a console application that creates 100 TThread instances, each of which computes a Fibonacci number. Then I added a critical section so that only one thread computes at a time. As expected, the application is faster without the critical section.
Then I wrote a console application that creates 100 TThread instances, each of which adds words to a local TStringList and then sorts it. Again I added a critical section so that only one thread executes at a time. On Windows, as expected, the application runs faster without the critical section. On Linux, however, the critical section version runs twice as fast as the version without it.
The Linux machine has a 6-core AMD Opteron CPU, so the application should benefit from multithreading.
Can somebody explain why the version with the critical section is faster?
Edit (added some code)
Thread creation and waiting:
tmpDeb := Now;
// Create all threads suspended so that none starts before the others exist.
i := NBTHREADS;
while i > 0 do
begin
  tmpFiboThread := TFiboThread.Create(true);
  tmpFiboThread.Init(i, ParamStr(1) = 'Crit');
  Threads.AddObject(IntToStr(i), tmpFiboThread);
  i := i-1;
end;
// Start every thread.
i := 0;
while i < NBTHREADS do
begin
  TFiboThread(Threads.Objects[i]).Resume;
  i := i+1;
end;
// Wait for every thread to finish.
i := 0;
while i < NBTHREADS do
begin
  TFiboThread(Threads.Objects[i]).WaitFor;
  i := i+1;
end;
WriteLn('Total processing time: ' + inttostr(MilliSecondsBetween(Now, tmpDeb)) + ' milliseconds');
The TThread class and the critical section usage:
type
  TFiboThread = class(TThread)
  private
    n : Integer;
    UseCriticalSection : Boolean;
  protected
    procedure Execute; override;
  public
    ExecTime : Integer;
    procedure Init(n : integer; WithCriticalSect : Boolean);
  end;

var
  CriticalSection : TCriticalSection;

implementation

uses DateUtils;
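// CPU-bound workload: iterates the Fibonacci recurrence n + 100000000 - 1 times.
// The value overflows a 32-bit Integer long before the loop ends, so (assuming
// overflow checking is off) the result only serves as load, not as a real
// Fibonacci number.
function fib(n: integer): integer;
var
  f0, f1, tmpf0, k: integer;
begin
  f1 := n + 100000000;
  if f1 > 1 then
  begin
    k := f1-1;
    f0 := 0;
    f1 := 1;
    repeat
      tmpf0 := f0;
      f0 := f1;
      f1 := f1+tmpf0;
      dec(k);
    until k = 0;
  end
  else
    if f1 < 0 then
      f1 := 0;
  fib := f1;
end;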
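// Allocation-heavy workload: fills a local TStringList with n + 10000 short
// strings, sorts it, and returns the first entry as an integer.
function StringListSort(n: integer): integer;
var
  tmpSL : TStringList;
  i : Integer;
begin
  tmpSL := TStringList.Create;
  try
    i := 0;
    while i < n + 10000 do
    begin
      tmpSL.Add(inttostr(MilliSecondOf(now)));
      i := i+1;
    end;
    tmpSL.Sort;
    Result := StrToInt(tmpSL.Strings[0]);
  finally
    // Free the list even if an exception is raised.
    tmpSL.Free;
  end;
end;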
{ TFiboThread }

procedure TFiboThread.Execute;
var
  tmpStr : String;
  tmpDeb : TDateTime;
begin
  inherited;
  // With the critical section enabled, only one thread runs the workload at a time.
  if Self.UseCriticalSection then
    CriticalSection.Enter;
  try
    tmpDeb := Now;
    tmpStr := inttostr(fib(Self.n));
    //tmpStr := inttostr(StringListSort(Self.n));
    Self.ExecTime := MilliSecondsBetween(Now, tmpDeb);
  finally
    // Always release the lock, even if the workload raises an exception.
    if Self.UseCriticalSection then
      CriticalSection.Leave;
  end;
  Self.Terminate;
end;

procedure TFiboThread.Init(n : integer; WithCriticalSect : Boolean);
begin
  Self.n := n;
  Self.UseCriticalSection := WithCriticalSect;
end;
initialization
  CriticalSection := TCriticalSection.Create;

finalization
  FreeAndNil(CriticalSection);
Edit 2
I read the question why-using-more-threads-makes-it-slower-than-using-less-threads. As I understand it, context switching costs much more CPU time on Linux with a Kylix-compiled binary than it does on Win32.
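One way to check this hypothesis would be to run the same 100 work items on only as many threads as there are cores, and compare the total time with the 100-thread runs. The following is only a minimal sketch of that idea, not something I have measured: TWorkerThread, NBWORKERS and NBTASKS are hypothetical names introduced for this test, the value 6 assumes one worker per core, and StringListSort is a copy of the workload shown above.

```pascal
program WorkerPoolBench;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, DateUtils;

const
  NBTASKS   = 100; // same number of work items as in the 100-thread test
  NBWORKERS = 6;   // assumption: one worker thread per core

type
  // Hypothetical worker: processes tasks FFirst..FLast one after another.
  TWorkerThread = class(TThread)
  private
    FFirst, FLast: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create(First, Last: Integer);
  end;

// Same workload as StringListSort above.
function StringListSort(n: Integer): Integer;
var
  tmpSL: TStringList;
  i: Integer;
begin
  tmpSL := TStringList.Create;
  try
    for i := 1 to n + 10000 do
      tmpSL.Add(IntToStr(MilliSecondOf(Now)));
    tmpSL.Sort;
    Result := StrToInt(tmpSL.Strings[0]);
  finally
    tmpSL.Free;
  end;
end;

constructor TWorkerThread.Create(First, Last: Integer);
begin
  inherited Create(True); // created suspended, started later with Resume
  FFirst := First;
  FLast := Last;
end;

procedure TWorkerThread.Execute;
var
  i: Integer;
begin
  for i := FFirst to FLast do
    StringListSort(i); // result is ignored, only the load matters here
end;

var
  Workers: array[0..NBWORKERS - 1] of TWorkerThread;
  w, First, Last: Integer;
  tmpDeb: TDateTime;
begin
  tmpDeb := Now;
  for w := 0 to NBWORKERS - 1 do
  begin
    // Split tasks 1..NBTASKS into NBWORKERS contiguous ranges.
    First := (w * NBTASKS) div NBWORKERS + 1;
    Last := ((w + 1) * NBTASKS) div NBWORKERS;
    Workers[w] := TWorkerThread.Create(First, Last);
  end;
  for w := 0 to NBWORKERS - 1 do
    Workers[w].Resume;
  for w := 0 to NBWORKERS - 1 do
  begin
    Workers[w].WaitFor;
    Workers[w].Free;
  end;
  WriteLn('Total time: ' + IntToStr(MilliSecondsBetween(Now, tmpDeb)) + ' ms');
end.
```

If this version is clearly faster on Linux than the 100-thread version without the critical section, that would be consistent with the oversubscription explanation. If it is not, the slowdown more likely comes from somewhere else.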