1

I wrote a simple insertion sort implementation to try to knock the rust of, and begin what I hope is a better understading of algortihms in general. The file holds 20 million random numbers. The code is below:

#include <fstream>
#include <time.h>
#include <cstdlib>

using namespace std;

void insertionSort(double numbers[], int array_size);

int main() {
    int count;
    double twentyMMNumbers[20000000];
    ifstream inputFile;
    time_t now;
    time(&now);

    inputFile.open("20KRandomsNumbers.data");       //Opens input file
    if(inputFile.fail())
    {
        printf("Cannot open inputFile");
        exit(1);
    }

    count = 0;
    printf("%d\n",count);
    inputFile >> twentyMMNumbers[count];
    printf("%f\n",twentyMMNumbers[count]);
    while(inputFile)
    {   //While loop
        count++;
        if(count < 20000000)
            inputFile >> twentyMMNumbers[count];
    }
    inputFile.close();  

    printf("%s\n", ctime(&now));    //BEFORE
    insertionSort(twentyMMNumbers, 20000000); //Insertion Sort 20KRandomNumbers
    printf("%s\n", ctime(&now)); //AFTER
    for(int i = 0; i < count; i++)
        printf("%f\n",twentyMMNumbers[i]);  
}

void insertionSort(double numbers[], int array_size)
{
  int i, j, index;
  for (i=1; i < array_size; i++)
  {
    index = numbers[i];
    j = i;
    while ((j > 0) && (numbers[j-1] > index))
    {
      numbers[j] = numbers[j-1];
      j = j - 1;
    }
    numbers[j] = index;
  }
}

The code worked fine when it only had 20,000 entries, but now gives me:

Segmentation fault: 11

Was this caused by my increase in the size of the array? PS if you have any tips on optimizing this, feels free to point it out.

palacsint
  • 28,416
  • 10
  • 82
  • 109
FossilizedCarlos
  • 197
  • 2
  • 10

3 Answers3

2

Ironically enough (given this site name), you have a stack overflow. You'll need to dynamically allocate that much memory on the heap.

For more clarity, this line:

double twentyMMNumbers[20000000];

needs to be

double* twentyMMNumbers = (double*)malloc(20000000*sizeof(double));

And of course, you'll need to free that memory before you exit your program (as a best practice):

free(twentyMMNumbers);
jzila
  • 724
  • 4
  • 11
  • As another answer stated, you can also use `new`. I missed that this is C++, not C, so I used `malloc`. Either will work in C++ though. – jzila Nov 22 '11 at 23:02
2

It works with smaller array size so here are some comments (since it was on CodeReview) and some fixes for bigger array sizes.

1, If you want to use fixed size arrays use a constant instead of the 200000 (or 20000000) literals.

2, Using dynamic arrays is more better. Allocate the memory after you have read the first line and use the readed size as the size of the new array. Furthermore I would store the size of the data file (the first line of the file) in a separate variable, not in the array.

int dataSize;
inputFile >> dataSize;
double *twentyMMNumbers = new double[dataSize];

It allocates the exact amount of memory. No more, no less.

It also fixes the Segmentation fault error. For more infomartion check this question: Segmentation fault on large array sizes

(Don't forget to unallocate the array with delete[].)

3, It's unnecessary to read the whole file if you have more records than the size of the array. I'd modify the while loop:

while (inputFile) {   
    if (count >= dataSize) {

        break;
    }
    inputFile >> twentyMMNumbers[count];
    count++;
}
inputFile.close();  

Maybe an exit(-1) and an error message would be better instead of the break.

4, The following comment is unneccesary:

//While loop

5, You should pass the real size of the array to the insertionSort function, so write this:

insertionSort(twentyMMNumbers, dataSize); 

The comment here is also unnecessary.

6, Improve error handling: what happens when the value of dataSize is bigger than the number of the numbers in the file?

7, I would extract a printArray function with the last for loop as well as a readInput function.

8, Consider using C++ style printing instead of printfs:

cout << "Hello world!" << endl;

(#include <iostream> is required.)

Community
  • 1
  • 1
palacsint
  • 28,416
  • 10
  • 82
  • 109
1

You have a problem in your while loop:

while(inputFile)
{   //While loop
    count++;
    if(count < 20000000)
        inputFile >> twentyMMNumbers[count];
}

This loop will only terminate if the file contains <=20000000 numbers. If count becomes 20000000, it will not try to read anymore from inputFile, but still use inputFile as the condition to continue the loop. And then when count reaches the maximum int value, it will wrap around and you will index twentyMMNumbers with a negative number. This could very well be the reason you get a segmentation fault.

20000000 is a "magic number" in your code. Instead of writing it every where make a const int NumElements = 20000000 at the beginning of your file.

Kleist
  • 7,785
  • 1
  • 26
  • 30