2
//
//  main.cpp
//  timsort
//
//  Created by Atharva Koli on 2019/1/27.
//  Copyright  © 2019 Atharva Koli. All rights reserved.
//

#include<bits/stdc++.h>

using namespace std;
const int RUN = 32;

// this function sorts array from left index to
// to right index which is of size atmost RUN
void insertionSort(int arr[], int left, int right)
{
    for (int i = left + 1; i <= right; i++)
    {
      int temp = arr[i];
      int j = i - 1;
      while (arr[j] > temp && j >= left)
      {
        arr[j+1] = arr[j];
        j--;
      }
      arr[j+1] = temp;
    }
}

// merge function merges the sorted runs
void merge(int arr[], int l, int m, int r)
{
  // original array is broken in two parts
  // left and right array
  int len1 = m - l + 1, len2 = r - m;
  int left[len1], right[len2];
  for (int i = 0; i < len1; i++)
    left[i] = arr[l + i];
  for (int i = 0; i < len2; i++)
    right[i] = arr[m + 1 + i];

  int i = 0;
  int j = 0;
  int k = l;


  // after comparing, we merge those two array
  // in larger sub array
  while (i < len1 && j < len2)
  {
    if (left[i] <= right[j])
    {
        arr[k] = left[i];
        i++;
    }
    else
    {
        arr[k] = right[j];
        j++;
    }
    k++;
  }

  // copy remaining elements of left, if any
  while (i < len1)
  {
    arr[k] = left[i];
    k++;
    i++;
  }

  // copy remaining element of right, if any
  while (j < len2)
  {
    arr[k] = right[j];
    k++;
    j++;
  }
}

// iterative Timsort function to sort the
// array[0...n-1] (similar to merge sort)
void timSort(int arr[], int n)
{
  // Sort individual subarrays of size RUN
  for (int i = 0; i < n; i+=RUN)
    insertionSort(arr, i, min((i+31), (n-1)));

  // start merging from size RUN (or 32). It will merge
  // to form size 64, then 128, 256 and so on ....
  for (int size = RUN; size < n; size = 2*size)
  {
    // pick starting point of left sub array. We
    // are going to merge arr[left..left+size-1]
    // and arr[left+size, left+2*size-1]
    // After every merge, we increase left by 2*size
    for (int left = 0; left < n; left += 2*size)
    {
        // find ending point of left sub array
        // mid+1 is starting point of right sub array
        int mid = left + size - 1;
        int right = min((left + 2*size - 1), (n-1));

        // merge sub array arr[left.....mid] &
        // arr[mid+1....right]
        merge(arr, left, mid, right);
    }
  }
}

// utility function to print the Array
void printArray(int arr[], int n)
{
  for (int i = 0; i < n; i++)
    printf("%d ", arr[i]);
  printf("\n");
}

// Driver program to test above function
int main()
{
  int arr[] = {"10 thousand numbers separated by commas"};
  int n = sizeof(arr)/sizeof(arr[0]);

  timSort(arr, n);

  printArray(arr, n);
  return 0;
}

This is a program I wrote which sorts the numbers contained in an array using the timsort algorithm. The program compiled and ran successfully using 100, 1000...numbers. However when I put 10,000 numbers or more in the array the program crashes giving me a compile error like so: Xcode Compile Error

I am still a beginner in C++ programming and the code above is combined using various sources online. Please assist me in identifying the issue!

The program is a standard implementation of timsort, efficiently combining arrays sorted using insertion sort.

bruno
  • 32,421
  • 7
  • 25
  • 37
unc0ver321
  • 29
  • 2
  • 1
    C and C++ are two different languages. The posted code is for C++ . Please remove the 'c' tag – user3629249 Apr 01 '19 at 07:50
  • 1
    why are you speaking about _compile error_, if your program crashes that means it was compiled, it is _execution error_. 10000 is not a lot even in stack, however try making the initial array static ( `static int arr[] = {...};`) – bruno Apr 01 '19 at 07:53
  • 1
    First of all read [Why should I not #include ?](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h) Secondly, C++ doesn't have [variable-length arrays](https://en.wikipedia.org/wiki/Variable-length_array), use `std::vector` instead. – Some programmer dude Apr 01 '19 at 07:56
  • As for your crash, what are the values of all involved variables at that point? Are the indexes valid for the arrays? My guess is that they aren't. – Some programmer dude Apr 01 '19 at 07:57

1 Answers1

2

You have out of bound problem, though it doesn't reveal itself instantly. Firstly you use non-standard extension called VLA inside of merge(), so how those arrays behaves in stack, I'm not sure. Secondly, the index math in merge results in

l,len1 =0,32
l,len1 =64,32
l,len1 =128,32
l,len1 =192,32
...
l,len1 =9472,32
l,len1 =9536,32
l,len1 =9600,32
l,len1 =9664,32
l,len1 =9728,32
l,len1 =9792,32
l,len1 =9856,32
l,len1 =9920,32
l,len1 =9984,32

9984 + 32 > 10000 , so accessing arr[l + i] results in reading past the end of array. After few iteration it reaches state where that would cause segmentation fault. Try print your variables to debug it.

Swift - Friday Pie
  • 12,777
  • 2
  • 19
  • 42
  • 1
    I have just caught a bounds error in function `merge` when the sum `l + i` was >= the length of the array. Time for @unc0ver321 to deploy a debugger?? – Weather Vane Apr 01 '19 at 08:16
  • @WeatherVane and valgrind may be. To revise the algorithm part. Debugger might be tedious in this case, i prefer logging (and if logging causes the cat to stay alive, then we clearly have UB somewhere) – Swift - Friday Pie Apr 01 '19 at 13:36
  • 1
    Not valgrind or debugger here, just some additional coding to check array index ranges, with the array length made available globally. – Weather Vane Apr 01 '19 at 13:40
  • @WeatherVane yeah, that's what I used to try and find reason of crash. At some stages it even ran without crashing. I'm not very familiar with timSort and its modes, so I'm not offering a solution. But the error actually resembles typical bugs of implementation (one of them - out of bound, another would be index overflow) – Swift - Friday Pie Apr 01 '19 at 17:23
  • I found that with small samples, `merge` isn't always called although I can't quite see why. Also it is not guarantted to break the bounds. That might depend on the data being processed. – Weather Vane Apr 01 '19 at 17:29
  • @Weather Vane technically timSort is adaptive and in best case results in N operations. but I don't see some canon parts. For N<64 it never actually should run merge. For size of subarray less that RUN it shouldn't do anything because that subarray is already sorted, etc. There is no minRun() function in this implementation, so I'm a bit lost. But separation is done wrongly – Swift - Friday Pie Apr 01 '19 at 17:51