I have a program that is supposed to retrieve data from text files on start-up. These files may get huge, and I was wondering how I could assess the current performance of the loading code and speed it up. The code used to retrieve the data is as follows:
void startUpBillsLoading(Bill *Bills)
{
    FILE *BillsDb = 0, *WorkersDb = 0, *PaymentDb = 0;
    BillsDb = fopen("data/bills.db", "r");
    WorkersDb = fopen("data/workers.db", "r");
    PaymentDb = fopen("data/payments.db", "r");
    char *Buffer = malloc(512);
    if (BillsDb && WorkersDb && PaymentDb)
    {
        int i = 0, j = 0;
        /* One line of bills.db and one line of payments.db per bill. */
        while (fscanf(BillsDb, "%d;%[^;];%[^;];%[^;];%[^;];%d/%d/%d;%d/%d/%d;%d;%f;%f\n",
                      &Bills[i].Id,
                      Bills[i].CompanyName,
                      Bills[i].ClientName,
                      Bills[i].DepartureAddress,
                      Bills[i].ShippingAddress,
                      &Bills[i].Creation.Day,
                      &Bills[i].Creation.Month,
                      &Bills[i].Creation.Year,
                      &Bills[i].Payment.Day,
                      &Bills[i].Payment.Month,
                      &Bills[i].Payment.Year,
                      &Bills[i].NumWorkers,
                      &Bills[i].TotalHT,
                      &Bills[i].Charges) == 14)
        {
            Bills[i].Workers =
                malloc(sizeof(Employee) * Bills[i].NumWorkers);
            fscanf(PaymentDb, "%d;%d;%[^;];%[^;];%[^\n]\n",
                   &Bills[i].Id,
                   &Bills[i].PaymentDetails.Method,
                   Bills[i].PaymentDetails.CheckNumber,
                   Bills[i].PaymentDetails.VirementNumber,
                   Bills[i].PaymentDetails.BankName);
            LatestBillId++;
            i++;
        }
        i = 0;
        /* The first worker of each bill is read here (outer j stays 0)... */
        while (fscanf(WorkersDb, "%d;%[^;];%[^;];%f\n",
                      &Bills[i].Id,
                      Bills[i].Workers[j].Surname,
                      Bills[i].Workers[j].Name,
                      &Bills[i].Workers[j].Salary) == 4)
        {
            /* ...and the remaining NumWorkers-1 workers are read here. */
            for (int j = 1; j <= Bills[i].NumWorkers-1; j++)
            {
                fscanf(WorkersDb, "%d;%[^;];%[^;];%f\n",
                       &Bills[i].Id,
                       Bills[i].Workers[j].Surname,
                       Bills[i].Workers[j].Name,
                       &Bills[i].Workers[j].Salary);
            }
            i++;
        }
        fclose(BillsDb);
        fclose(WorkersDb);
        fclose(PaymentDb);
    }
    else
        printf("\t\t\tImpossible d'acceder aux factures !\n");
    free(Buffer);
}
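The types used by the function are not reproduced in this post. For anyone who wants to compile it, a reduced sketch of what the function expects could look like the following. The member names come from the code above; the array sizes (NAME_LEN, ADDRESS_LEN) and the struct name PaymentInfo are placeholders chosen for illustration, not the real definitions.

```c
/* Illustrative sizes only: the real limits are not shown in the question. */
#define NAME_LEN 64
#define ADDRESS_LEN 128

typedef struct {
    int Day, Month, Year;
} Date;

typedef struct {
    char Surname[NAME_LEN];
    char Name[NAME_LEN];
    float Salary;
} Employee;

/* Hypothetical name; only the member names appear in the loading code. */
typedef struct {
    int Method;
    char CheckNumber[NAME_LEN];
    char VirementNumber[NAME_LEN];
    char BankName[NAME_LEN];
} PaymentInfo;

typedef struct {
    int Id;
    char CompanyName[NAME_LEN];
    char ClientName[NAME_LEN];
    char DepartureAddress[ADDRESS_LEN];
    char ShippingAddress[ADDRESS_LEN];
    Date Creation;               /* creation date, stored as d/m/y */
    Date Payment;                /* payment date, stored as d/m/y */
    int NumWorkers;
    float TotalHT;
    float Charges;
    Employee *Workers;           /* allocated per bill, NumWorkers entries */
    PaymentInfo PaymentDetails;
} Bill;

/* Counter incremented once per bill read from bills.db. */
int LatestBillId = 0;
```

Note that the %[^;] conversions in the function carry no field width, so any field longer than its destination array would overflow it, whatever sizes are chosen.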
I used the functions from time.h to measure the time it takes to retrieve all the required data.
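A minimal sketch of this kind of measurement, assuming clock() from time.h is the function used. The names secondsTaken and busyWork are hypothetical and only serve the illustration; busyWork stands in for the loading function.

```c
#include <time.h>

/* Hypothetical helper: returns the processor time, in seconds,
   consumed by a single call to fn(arg). */
double secondsTaken(void (*fn)(void *), void *arg)
{
    clock_t start = clock();
    fn(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Stand-in workload for the example: sums the first n integers. */
void busyWork(void *arg)
{
    unsigned long n = *(unsigned long *)arg;
    volatile unsigned long sum = 0;
    for (unsigned long k = 0; k < n; k++)
        sum += k;
}
```

To time the loader itself, a small wrapper taking a void * and calling startUpBillsLoading would be passed instead of busyWork. Keep in mind that clock() reports processor time rather than wall-clock time, so time spent waiting on the disk may not be fully reflected in the result.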
A bill's data is split across 3 files: bills.db, workers.db and payments.db. Each line of bills.db and of payments.db represents an entire bill, whereas in workers.db the number of lines required to represent a bill is variable and depends on the number of employees related to that bill.
I created these 3 files in this way:
- bills.db and payments.db each have 118087 lines (thus as many bills).
- Each bill was set (arbitrarily) to have 4 workers; therefore, the workers.db file has 118087*4 = 472348 lines.
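Test files with this shape can be produced by a small generator. The sketch below is an assumption about how that could be done, not the exact tool used here: writeTestData and countLines are hypothetical names, and the field values (Company1, CHK1, and so on) are dummy data chosen only to match the fscanf formats in the loading function.

```c
#include <stdio.h>

/* Hypothetical generator: writes numBills lines to the bills and payments
   files and numBills * workersPerBill lines to the workers file.
   Returns 0 on success, -1 if one of the files could not be opened. */
int writeTestData(const char *billsPath, const char *workersPath,
                  const char *paymentsPath, int numBills, int workersPerBill)
{
    FILE *bills = fopen(billsPath, "w");
    FILE *workers = fopen(workersPath, "w");
    FILE *payments = fopen(paymentsPath, "w");
    if (!bills || !workers || !payments) {
        if (bills) fclose(bills);
        if (workers) fclose(workers);
        if (payments) fclose(payments);
        return -1;
    }
    for (int id = 1; id <= numBills; id++) {
        /* Every text field is non-empty because %[^;] cannot match
           an empty field. */
        fprintf(bills, "%d;Company%d;Client%d;Departure%d;Shipping%d;"
                       "1/1/2020;15/1/2020;%d;1000.00;200.00\n",
                id, id, id, id, id, workersPerBill);
        fprintf(payments, "%d;1;CHK%d;VIR%d;Bank%d\n", id, id, id, id);
        for (int w = 0; w < workersPerBill; w++)
            fprintf(workers, "%d;Surname%d;Name%d;1500.00\n", id, w, w);
    }
    fclose(bills);
    fclose(workers);
    fclose(payments);
    return 0;
}

/* Counts newline characters in a file; returns -1 if it cannot be opened. */
long countLines(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    long lines = 0;
    int c;
    while ((c = getc(f)) != EOF)
        if (c == '\n')
            lines++;
    fclose(f);
    return lines;
}
```

Calling writeTestData with numBills set to 118087 and workersPerBill set to 4 would give files of the sizes listed above, and countLines can be used to confirm the line counts.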
The function takes around 0.9 seconds to run completely. How good (or bad) is this time, and how can I improve it?