
I am trying to find the time complexity of a recursive function that merges n files.

My solution is T(n) = kc + T(n-(k+1)) for n > 0, and T(n) = T(0) for n = 0.

Is this correct or is there any other way of finding the time complexity?

Here is the pseudo code:

//GLOBAL VARIABLES
int nRecords = 0...k; //assume there are k records
int numFiles = 0...n; //assume there are n files
String mainArray[0...nRecords]; //main array that stores the file records

void mergeFiles(int numFiles) { //merge the remaining numFiles files
  fstream file; //file variable
  if (numFiles == 0) {
    ofstream outfile; //file output variable
    outfile.open(directory / mergedfile); // point variable to directory
    for (int i = 0; i < sizeOf(mainArray); i++) {
      outfile << mainArray[i]; // write content of mainArray to outfile
    }
    outfile.close(); //close once operation is done
  } else {
    int i = 0; //file index counter
    file.open(directory / nextfile); //open file to be read

    if (file.isOpen()) {
      while (!file.eof() && i < sizeOf(mainArray)) {
        file >> mainArray[i]; //copy contents of file to mainArray
        i++; //increase array index
      }
    }
    file.close(); //close once operation is done
    mergeFiles(numFiles - 1); //recurse function
  }
}

int main() {
  mergeFiles(numFiles); //call mergeFiles from main
}
OmG
David Bale

2 Answers


Going by your formula:

T(n) = kc + T(n-(k+1))
     = kc + kc + T(n-2(k+1))
     = kc + kc + ... + T(0)
     = kc*(n/(k+1))
     ~ nc = O(n)
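The unrolling can be written out step by step; this is a sketch assuming n is an exact multiple of k+1, so the recursion bottoms out exactly at T(0):

```latex
\begin{align*}
T(n) &= kc + T\bigl(n-(k+1)\bigr) \\
     &= 2kc + T\bigl(n-2(k+1)\bigr) \\
     &\;\;\vdots \\
     &= m\,kc + T\bigl(n-m(k+1)\bigr).
\end{align*}
```

The recursion stops when n - m(k+1) = 0, i.e. after m = n/(k+1) steps, giving T(n) = kc·n/(k+1) + T(0) ≤ cn + T(0) = O(n).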
v78

The definition of k in your question is a bit ambiguous: the formula you provided for T(n) seems to assume you process k records per file, while the definition of mainArray in the code suggests that k is the total number of records, not the number of records in an individual file.

I will first assume the second definition of k is the correct one, so you have:

  • n = number of files
  • k = total number of records in these files = size of array

Time complexity of read/write operations

I think you assume that the following two statements -- each of which reads or writes one record -- run in constant time:

file >> mainArray[i]; 

outfile << mainArray[i];    

Note that the time needed for such operations generally depends on the size of the record. But since you did not list record size as something to consider, I will assume records have a constant size, so these operations can be considered to run in O(1), i.e. constant time.

About recursion

Although you use recursion, it is really tail recursion, so the time complexity is no different from that of an iterative algorithm. Either way, the else block is executed n times.
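Because it is tail recursion, the whole procedure can be rewritten as a plain loop with the same cost. A minimal sketch of that iterative equivalent, where an in-memory vector of records stands in for an actual file (my assumption, purely to keep the example self-contained):

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a file: a list of its records.
using File = std::vector<std::string>;

// Iterative equivalent of the tail-recursive mergeFiles: the body of
// the outer loop corresponds to one execution of the else block, so
// it runs exactly n times (once per file).
std::vector<std::string> mergeAll(const std::vector<File>& files) {
    std::vector<std::string> mainArray;
    for (const File& f : files) {            // n iterations
        for (const std::string& rec : f) {   // k record copies in total
            mainArray.push_back(rec);
        }
    }
    return mainArray;
}
```

Whether written recursively or as this loop, the per-file work happens n times and the per-record work happens k times in total.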

It is in fact not so straightforward to capture the time complexity in a recursive formula, as you don't know how many records there are in one file, only in all files together. You could work around this by artificially assuming there are k/n records in each file, but I find it much more intuitive to base the measurement on the absolute number of times the else block executes, without expressing it in a recursive formula.

Measurements

The body of the inner while loop executes at most k times in total, and given your assumption that the files hold exactly that many records, it executes exactly k times in total.

The final part (where numFiles == 0) has a for loop that also executes k times.

So the ingredients determining the time complexity are:

  • A constant time for opening/closing a file, multiplied by n
  • A constant time for reading/writing a record, multiplied by k

So the time complexity is O(n + k).
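To make those two ingredients concrete, here is a small counting sketch (the in-memory representation of the files is my assumption): it tallies one file operation per open/close pair and one record operation per read or write, mirroring the structure of the pseudocode:

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct OpCount {
    int fileOps = 0;    // opens/closes of a file
    int recordOps = 0;  // individual record reads/writes
};

// Mirror the pseudocode's structure and count its operations:
// n input files are opened, k records are read, then one output
// file is opened and k records are written back out.
OpCount countOps(const std::vector<std::vector<std::string>>& files) {
    OpCount c;
    std::vector<std::string> mainArray;
    for (const auto& f : files) {
        c.fileOps++;                     // open/close one input file
        for (const auto& rec : f) {
            c.recordOps++;               // file >> mainArray[i]
            mainArray.push_back(rec);
        }
    }
    c.fileOps++;                         // open/close the output file
    for (std::size_t i = 0; i < mainArray.size(); ++i) {
        c.recordOps++;                   // outfile << mainArray[i]
    }
    return c;
}
```

For n files holding k records in total this yields n + 1 file operations and 2k record operations, i.e. O(n + k) overall.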

If definition of k is different

If k should denote the number of records in one file, then your code is wrong: the size of the array would then have to be n·k instead of k. Assuming that is what you intended, a similar reasoning gives a time complexity of O(n·k).

Note concerning the correctness of the program

In a real situation you would have to make sure the size of your array corresponds to the total number of records in your files, rather than just assuming it does. If the array turns out to be smaller, you would not be able to store some records; if on the other hand it is greater, the code that dumps the array into the output file would include array elements that were never initialised.

You would therefore be better off using a dynamically sized container (such as a stack or vector), so its size corresponds exactly to the number of records actually read into it.
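A sketch of that suggestion, using std::vector as the dynamically sized container and std::istringstream standing in for an already-open file stream (the function name readRecords is mine):

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read every record from an already-open stream into a container
// that grows to match what was actually read, so no element is
// left uninitialised and no record is silently dropped.
std::vector<std::string> readRecords(std::istream& file) {
    std::vector<std::string> records;
    std::string rec;
    while (file >> rec) {        // stops cleanly at end of stream
        records.push_back(rec);  // size tracks the records read
    }
    return records;
}
```

Dumping records.size() elements to the output file then writes exactly the records that were read, no matter how many each input file contained.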

trincot