
The average-case time complexity of quicksort is O(n log n). That means the running time should be strictly increasing as n increases. But why does the time increase, then decrease, then increase again in this case? Why does it decrease as n grows?

I am generating random inputs using Java's Random class.

What is the solution for this?

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Random;
import java.util.Scanner;

public class Qsort {
    static Scanner sc = new Scanner(System.in);

    public static void main(String args[]) throws IOException {
        File file = new File("Qtest.txt");
        PrintWriter output = new PrintWriter(file);
        int n, i;
        System.out.println("Enter the times of iteration");
        n = sc.nextInt();
        for (int j = 1000; j <= n * 1000; j = j + 1000) {
            Random r = new Random();
            int array[] = new int[j];
            for (i = 0; i < j; i++) {
                array[i] = r.nextInt(2000);
            }
            long startTime = System.nanoTime();
            Qsort(array, 0, j - 1);
            long endTime = System.nanoTime();
            output.println(j + " " + (endTime - startTime));
            System.out.println("After sorting the time elapsed is " + (endTime - startTime));
        }
        output.close();
    }

The sort function for quicksort:

    public static void Qsort(int A[], int start, int end) {
        if (start >= end) return;
        int p = partition(A, start, end);
        Qsort(A, start, p - 1);
        Qsort(A, p + 1, end);
    }

    // Lomuto partition: uses A[end] as the pivot and returns its final index.
    public static int partition(int A[], int start, int end) {
        int pivot = A[end];
        int p = start;
        for (int i = start; i < end; i++) {
            if (A[i] <= pivot) {
                int temp = A[i];
                A[i] = A[p];
                A[p] = temp;
                p++;
            }
        }
        int temp = A[p];
        A[p] = A[end];
        A[end] = temp;
        return p;
    }
}

The output that I am getting:

Enter the times of iteration 10

For 1000 inputs : After sorting the time elapsed is 779133

For 2000 inputs : After sorting the time elapsed is 8350639

For 3000 inputs : After sorting the time elapsed is 607856

For 4000 inputs : After sorting the time elapsed is 833593

For 5000 inputs : After sorting the time elapsed is 1042426

For 6000 inputs : After sorting the time elapsed is 1283195

For 7000 inputs : After sorting the time elapsed is 1497488

For 8000 inputs : After sorting the time elapsed is 1261182

For 9000 inputs : After sorting the time elapsed is 1207128

For 10000 inputs : After sorting the time elapsed is 1427456

  • Can you please explain it? –  Mar 03 '18 at 04:11
  • How can I clear the cache every time my loop runs for a new value of j? –  Mar 03 '18 at 04:12
  • As you use random values, the real time taken is also more random. You also only run one sort per size, which will not give you a reliable result. Run it a million times per size. – Sami Kuhmonen Mar 03 '18 at 04:14
  • When I run it for only one value of j, these are the outputs I'm getting each time... the time elapsed for 5000 is 2415390; for 10000 it is 3281184; for 15000 it is 2624570. Why is it decreasing even when I run each size separately? –  Mar 03 '18 at 04:22
  • Read this: https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java ... and the various sources the answers link to. – Stephen C Mar 03 '18 at 05:39
  • Also see [this SO question](https://stackoverflow.com/q/95635) for more general info about what a JIT compiler does. – AnOccasionalCashew Mar 03 '18 at 06:54
  • Possible duplicate of [Why is Merge Sort in Java getting faster on same Array](https://stackoverflow.com/questions/32997680/why-is-merge-sort-in-java-getting-faster-on-same-array) – AnOccasionalCashew Mar 03 '18 at 06:55
  • You are not analyzing for time complexity. You are *measuring* for *elapsed time.* These things are completely different. – user207421 Mar 03 '18 at 07:29
  • Let's consider it elapsed time only, which should be strictly increasing with increasing n. Even if I run it separately for j = 5000, 10000, 15000, 20000, the output is still irregular. Why is that? –  Mar 03 '18 at 13:43
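As the comments above suggest, a reliable Java microbenchmark needs both JIT warmup and many repetitions per input size; a harness such as JMH (the approach recommended in the linked micro-benchmark question) handles both automatically. Below is a minimal sketch, not part of the original thread: it assumes the Qsort class from the question is on the classpath, and the annotation values are illustrative rather than tuned.

import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)            // report the mean time per sort
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 5)                     // JIT warmup rounds before measuring
@Measurement(iterations = 10)               // measured rounds per size
@Fork(1)
@State(Scope.Thread)
public class QsortBench {
    @Param({"1000", "5000", "10000", "20000"})
    int size;

    int[] data;

    @Setup(Level.Invocation)                // fresh random array before every call
    public void fill() {
        Random r = new Random(42);
        data = new int[size];
        for (int i = 0; i < size; i++) data[i] = r.nextInt(2000);
    }

    @Benchmark
    public int[] sort() {
        Qsort.Qsort(data, 0, data.length - 1);
        return data;                        // return the result so the JIT cannot eliminate the work
    }
}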

1 Answer


There are two things conflated in your question: measuring algorithmic complexity and microbenchmarking.

Algorithmic time complexity works with the number of operations. What you are measuring is the wall time spent in quick sort - which makes it a microbenchmark test.

Microbenchmarking is hard. On the first run you are most probably seeing JVM warmup time; the drop after the second run might be caused by instruction caching or JIT compilation. I would suggest doing a large warmup sort before you start measuring. Other things can also influence the results, such as CPU utilization on your machine and the randomness in your arrays (sometimes a swap happens, sometimes it doesn't). That is why a microbenchmark usually runs each "experiment" many times and draws conclusions only statistically, using standard error, percentiles, etc., instead of from single runs.
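For illustration, here is a minimal hand-rolled sketch of that idea. medianSortTimeNanos is a hypothetical helper, and it assumes the randomInts and Qsort methods defined in the code below:

    // Repeat the measurement and take the median, which is far less sensitive
    // to warmup spikes and scheduler noise than a single run.
    static long medianSortTimeNanos(int size, int reps) {
        long[] samples = new long[reps];
        for (int k = 0; k < reps; k++) {
            int[] array = randomInts(size);     // fresh random input each repetition
            long start = System.nanoTime();
            Qsort(array, 0, array.length - 1);
            samples[k] = System.nanoTime() - start;
        }
        java.util.Arrays.sort(samples);
        return samples[reps / 2];               // the median sample
    }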

Bottom line: a warmup helps remove some of the JVM-specific noise, and the number of operations is a more precise measure of time complexity than wall-clock time.

The code below is a version of your code that counts the number of operations and does a warmup round:

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Random;
import java.util.Scanner;

public class QSort {
    private static int numOps = 0;
    static Scanner sc = new Scanner(System.in);

    public static void main(String args[]) throws IOException {
        File file = new File("Qtest.txt");
        PrintWriter output = new PrintWriter(file);
        int n;
        System.out.println("warming up...");
        // Warmup round: one large sort so the JIT has compiled the hot paths
        // before any measurement starts.
        Qsort(randomInts(1000000), 0, 1000000 - 1);
        System.out.println("Enter the times of iteration");
        n = sc.nextInt();
        for (int j = 1000; j <= n * 1000; j = j + 1000) {
            int[] array = randomInts(j);
            long startTime = System.nanoTime();
            numOps = 0;
            Qsort(array, 0, j - 1);
            long endTime = System.nanoTime();
            output.println(j + " " + (endTime - startTime) + " " + numOps);
            System.out.println("After sorting the time elapsed is " + (endTime - startTime) + " numOps: " + numOps);
        }
        output.close();
    }

    private static int[] randomInts(int j) {
        Random r = new Random();
        int[] array = new int[j];
        for (int i = 0; i < j; i++) {
            array[i] = r.nextInt(2000);
        }
        return array;
    }

    public static void Qsort(int A[], int start, int end) {
        if (start >= end) return;
        int p = partition(A, start, end);
        Qsort(A, start, p - 1);
        Qsort(A, p + 1, end);
    }

    public static int partition(int A[], int start, int end) {
        int pivot = A[end];
        int p = start;
        for (int i = start; i < end; i++) {
            if (A[i] <= pivot) {
                int temp = A[i];
                A[i] = A[p];
                A[p] = temp;
                p++;
                numOps++; // count each swap as one operation
            }
        }
        int temp = A[p];
        A[p] = A[end];
        A[end] = temp;
        return p;
    }
}

The output:

warming up...
Enter the times of iteration
20
After sorting the time elapsed is 94206 numOps: 5191
After sorting the time elapsed is 150524 numOps: 12718
After sorting the time elapsed is 232478 numOps: 20359
After sorting the time elapsed is 314819 numOps: 31098
After sorting the time elapsed is 475933 numOps: 38483
After sorting the time elapsed is 500866 numOps: 55114
After sorting the time elapsed is 614642 numOps: 57251
After sorting the time elapsed is 693324 numOps: 68683
After sorting the time elapsed is 738800 numOps: 83332
After sorting the time elapsed is 798644 numOps: 83057
After sorting the time elapsed is 899891 numOps: 99975
After sorting the time elapsed is 987163 numOps: 113854
After sorting the time elapsed is 1059323 numOps: 124735
After sorting the time elapsed is 1103815 numOps: 143278
After sorting the time elapsed is 1192974 numOps: 164740
After sorting the time elapsed is 1276277 numOps: 166781
After sorting the time elapsed is 1344138 numOps: 180460
After sorting the time elapsed is 1439943 numOps: 204095
After sorting the time elapsed is 1593336 numOps: 209483
After sorting the time elapsed is 1644561 numOps: 225523
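
As a rough cross-check (a sketch using the measured values above, not part of the original answer): if the operation count really grows like n log n, then numOps divided by n * log2(n) should stay bounded as n grows, whereas an O(n^2) sort would make that ratio climb roughly linearly with n.

public class GrowthCheck {
    public static void main(String[] args) {
        // numOps values taken from the output above for n = 1000, 10000, 20000.
        long[][] measured = {{1000, 5191}, {10000, 83057}, {20000, 225523}};
        for (long[] m : measured) {
            double n = m[0], ops = m[1];
            System.out.printf("n=%.0f  numOps/(n*log2 n)=%.2f%n",
                    n, ops / (n * (Math.log(n) / Math.log(2))));
        }
    }
}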
Balint Pato
  • If this is a case of instruction caching and JIT, then why is it that even if I run it once for each value of j, i.e. run separately for 5000, 10000, 15000, 20000, the time is still irregular? –  Mar 03 '18 at 04:50
  • Well, if you run it only once, separately, that makes it even worse: every single run is now "in the warmup phase", making the wall-clock time even less predictable. I added more to the answer, suggesting measuring the actual number of operations instead of wall time, which is always a bit messy. – Balint Pato Mar 03 '18 at 13:44
  • "making the wall clock time even less predictable" so now if every time the instructions are fresh to the processor then the time should increase for every new no of inputs –  Mar 03 '18 at 14:58
  • The measurements won't necessarily increase in a uniform way, because it is hard to say when the "warmup" phase exactly ends. It might be in the middle of your iterations, or at the end of it. That's one of the largest factors that makes the measurement of _elapsed time_ (wall clock time) less predictable. – Balint Pato Mar 03 '18 at 15:08
  • Himanshu, if you think this was satisfactory, do you mind accepting my answer? Or do you have other questions? – Balint Pato Mar 03 '18 at 19:18