2

The following program is to sort a large array of random numbers using heapsort. The output of the program is the total execution time of the recursive heapSort function(in microseconds). The size of the input array is defined by the SIZE macro.

The program works fine for SIZE up to 1 million(1000000). But when I try to execute the program with SIZE 10 million(10000000), the program generates segmentation fault(core dumped).

Note: I have already tried increasing the soft and hard limits of the stack using ulimit -s command on Linux(128 MB). The SEGFAULT still persists.

Please suggest me any alterations to the code needed or any method which will overcome the existing SEGFAULT malady without having to declare the array dynamically or as global/static. /* Program to implement Heap-Sort algorithm */

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>

long SIZE = 10000000; // Or #define SIZE 10000000

long heapSize;

void swap(long *p, long *q)
{
    long temp = *p;
    *p = *q;
    *q = temp;
}

void heapify(long A[], long i)
{
    long left, right, index_of_max;
    left = 2*i + 1;
    right = 2*i + 2;

    if(left<heapSize && A[left]>A[i])
        index_of_max = left;
    else
        index_of_max = i;

    if(right<heapSize && A[right]>A[index_of_max])
        index_of_max = right;

    if(index_of_max != i)
    {
        swap(&A[index_of_max], &A[i]);
        heapify(A, index_of_max);
    }       
}

void buildHeap(long A[])
{
    long i;

    for(i=SIZE/2; i>=0 ; i--)
        heapify(A,i);
}

void heapSort(long A[])
{
    long i;

    buildHeap(A);

    for(i=SIZE-1 ; i>=1 ; i--)
    {
        swap(&A[i], &A[0]);
        heapSize--;
        heapify(A, 0);
    } 
}

int main()
{
    long i, A[SIZE];
    heapSize = SIZE;
    struct timespec start, end;

    srand(time(NULL));
    for(i = 0; i < SIZE; i++)
        A[i] = rand() % SIZE;

    /*printf("Unsorted Array is:-\n");
    for(i = 0; i < SIZE; i++)
        printf("%li\n", A[i]);
    */

    clock_gettime(CLOCK_MONOTONIC_RAW, &start);//start timer
    heapSort(A);
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);//end timer

    //To find time taken by heapsort by calculating difference between start and stop time.
    unsigned long delta_us = (end.tv_sec - start.tv_sec) * 1000000 \
                            + (end.tv_nsec - start.tv_nsec) / 1000;

    /*printf("Sorted Array is:-\n");
    for(i = 0; i < SIZE; i++) 
        printf("%li\n", A[i]);
    */

    printf("Heapsort took %lu microseconds for sorting of %li elements\n",delta_us, SIZE);

    return 0;
}
Akash Das
  • 163
  • 15
  • 2
    Allocate the array dynamically using `malloc()` or something similar. – Jonathan Leffler Mar 01 '18 at 03:50
  • I want to do it without using malloc or global declaration or declaring the array as static. I have increased the size of the stack to 128 MB. But it still doesn't help. Can you suggest me of some other way? – Akash Das Mar 01 '18 at 04:06
  • 1
    Succinctly, no. How have you established the larger stack size? Are you sure it really took effect? I'd guess that it didn't because the code is crashing. There might be another reason, but I'd probably not spend as long as it is taking me to type up this comment thinking about it. I'd go ahead with dynamic allocation. – Jonathan Leffler Mar 01 '18 at 04:10
  • Thanks a lot. I went ahead with the dynamic implementation as well. – Akash Das Mar 01 '18 at 07:22
  • One additional advantage of dynamic allocation is that you can specify the size of the sort as a command line argument and run all the different sizes without having to recompile the program between runs. And it is barely harder to write than the code with a size fixed at compile time. – Jonathan Leffler Mar 01 '18 at 07:37
  • @JonathanLeffler True! – Akash Das Mar 02 '18 at 04:01

1 Answers1

2

So, once you plan to stick with stack-only approach, you have to understand who is the main consumer(s) of your stack space.

  • Player #1: Array A[] itself. Depending to the OS/build, it consumes approx. 40 or 80 Mb of stack. One-time only.
  • Player #2: Beware recursion! In your case, this is heapify() function. Each call consumes decent stack chunk to serve a calling convention, stack alignment like stack-frames etc. If you do that million times and tree-like schema, you have tens of megabytes spent here too. So, you can try to re-implement this function to non-recursive way to decrease stack size pressure.
Yury Schkatula
  • 5,291
  • 2
  • 18
  • 42
  • Thanks for the insight. I was dubious whether 128 MB would suffice for implementing recursive heapsort for an array of size 10 million. I believe the only way left to implement it without the use of global/static/dynamic arrays, is to use external heapsort. – Akash Das Mar 02 '18 at 04:03