cs50 pset3 sort function

Question

I am having trouble with implementing the sort function on pset3. I have used the GDB and found that my sort function does not sort anything. I am not sure if there is a syntax issue, or if the logic is a bit screwed up.

void sort(int values[], int n)
{
    for (int k = 0; k < n; k++)
    {
        for (int j = 0; j < n; j++)
        {
            if (values[k] >= values[j])
            {
                 int temp = values[k];
                 values[k] = values[j];
                 values[j] = temp;
            }
        }
    }
}

If you had a 'syntax issue', the compiler would reject the code. If you're running it in GDB, then the problem is not the syntax but the logic. — Jonathan Leffler, Nov 23 '16 at 18:00
It's not particularly efficient, but the code sorts in descending order for me on arrays of size 3 and 5. (Why not efficient? Well, amongst other things, it compares a number with itself, and if the number is equal to itself, swaps the value with itself, which doesn't tend to change things very much.) — Jonathan Leffler, Nov 23 '16 at 18:07

score 1 · Accepted Answer · edited May 23 '17 at 10:30

1

You are close, but your loops are not quite right - change:

for (int k = 0; k < n; k++)
{
    for (int j = 0; j < n; j++)
    {

to:

for (int k = 0; k < n - 1; k++)
{
    for (int j = k + 1; j < n; j++)
    {

To understand why you need to make this change, consider that the inner loop (j) need only compare elements above index k with the current element at index k. So the outer loop (k) needs to iterate from 0 to n - 2 (one less than the last element), and for each outer loop iteration the inner loop needs to iterate from k + 1 (first element above k) to n - 1 (the last element).

NOTE: by pure chance it seems that the original code does appear to work correctly, even though it appears at first glance that it shouldn't. I have tested it with various edge cases and even though it performs many redundant swaps, the final result always seems to be sorted (suprisingly though the output is in descending order whereas the fixed code generates results in ascending order, as expected). Credit to Jonathan Leffler for spotting this - see his answer and demo program.

One other minor point -- this test:

        if (values[k] >= values[j])

should really just be:

        if (values[k] > values[j])

It's not incorrect as it stands (the code will still work), but there is no point in swapping elements that are equal, so it's somewhat inefficient as written.

edited May 23 '17 at 10:30

Community

1
1

answered Nov 23 '16 at 17:36

Paul R

208,748
37
389
560

1

I think your rewrite of the loops is an efficiency issue, not a technical necessity. At least, for the (not yet very stringent) tests I've run, the code in the question works as is. Similarly with the comparisons. – Jonathan Leffler Nov 23 '16 at 18:08
@JonathanLeffler: are you sure ? With the original loops elements will be swapped when j < k and when j > k, so the final ordering should still be somewhat random. It may just be that you got lucky with whatever input data you used ? – Paul R Nov 23 '16 at 18:46
1

Well, I've run it on arrays of size 3, 5, 9, with 5 random permutations of the array of size 5 and 9 random permutations of the array of size 9 and the data was sorted each time. The code keeps on comparing already sorted data, but that's not a problem except for performance. – Jonathan Leffler Nov 23 '16 at 18:49
That's very odd. Do you see what I mean about testing and swapping when j < k and j > k ? Surely half of those swaps should be incorrect ? – Paul R Nov 23 '16 at 18:53
1

I've not (yet) instrumented at the 'per swap' level — it wouldn't be hard to do that, though I'd want to upgrade the array print code to take a 'tag' string to identify when it is called, and I'd ensure that tag identified what swap occurred. I make zero claims that the sort is efficient — it isn't. Empirically though, it seems to end up working. I'd be curious to see your counter-demo. – Jonathan Leffler Nov 23 '16 at 19:01
1

Yes, there is unnecessary swapping going on, but the net result is still correctly sorted. Here's one shorter trace (using `SWP` instead of `Swap` gave nicely aligned output on my screen — not so much in a comment): `Seed: 753593906` — `Before 5: 6 0 5 2 1` — `SWP(0,0): 6 0 5 2 1` — `SWP(0,1): 0 6 5 2 1` — `SWP(1,0): 6 0 5 2 1` — `SWP(1,1): 6 0 5 2 1` — `SWP(2,1): 6 5 0 2 1` — `SWP(2,2): 6 5 0 2 1` — `SWP(3,2): 6 5 2 0 1` — `SWP(3,3): 6 5 2 0 1` — `SWP(4,3): 6 5 2 1 0` — `SWP(4,4): 6 5 2 1 0` — `After 5: 6 5 2 1 0`. Yes, the sort is sub-optimal. It does work, though. – Jonathan Leffler Nov 23 '16 at 19:12
@JonathanLeffler: wow, that's really interesting - I think this would make a great teaching example, i.e. how you can implement an algorithm incorrectly, test it and it appears to work, and yet underneath it's not correct (or is at least inefficient). I'll have a play and see if I can find any edge cases where it fails. I'd also like to understand why it actually appears to work too, as it seems like it really shouldn't, at least from a cursory analysis. – Paul R Nov 23 '16 at 19:42
@JonathanLeffler: I've spent some time testing various edge cases now, and I have yet to find anything which causes the original code to fail. Note however that it does produce results in descending order, for some reason, whereas the fixed code produces output in ascending order. I still don't fully understand why it works, but it makes for an interesting novelty. – Paul R Nov 24 '16 at 08:21

score 1 · Answer 2 · edited May 23 '17 at 12:26

I took your code and converted into a complete program. It's larger than an MCVE because it has support code for shuffling arrays, and for printing results, as well as a main() that exercises these, of course.

#include <stdio.h>
#include <stdlib.h>

static int rand_int(int n)
{
    int limit = RAND_MAX - RAND_MAX % n;
    int rnd;

    while ((rnd = rand()) >= limit)
        ;
    return rnd % n;
}

static void shuffle(int *array, int n)
{
    for (int i = n - 1; i > 0; i--)
    {
        int j = rand_int(i + 1);
        int tmp = array[j];
        array[j] = array[i];
        array[i] = tmp;
    }
}

static void print_array(int n, int a[n])
{
    for (int i = 0; i < n; i++)
        printf(" %d", a[i]);
    putchar('\n');
}

static void sort(int values[], int n)
{
    for (int k = 0; k < n; k++)
    {
        for (int j = 0; j < n; j++)
        {
            if (values[k] >= values[j])
            {
                 int temp = values[k];
                 values[k] = values[j];
                 values[j] = temp;
            }
        }
    }
}

int main(int argc, char **argv)
{
    if (argc > 1)
    {
        long l = strtol(argv[1], 0, 0);
        unsigned u = (unsigned)l;
        printf("Seed: %u\n", u);
        srand(u);
    }

    int data3[3] = { 3, 1, 2 };
    print_array(3, data3);
    sort(data3, 3);
    print_array(3, data3);

    int data5[5] = { 0, 2, 6, 1, 5, };
    for (int i = 0; i < 5; i++)
    {
        shuffle(data5, 5);
        print_array(5, data5);
        sort(data5, 5);
        print_array(5, data5);
    }

    int data9[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    for (int i = 0; i < 9; i++)
    {
        shuffle(data9, 9);
        print_array(9, data9);
        sort(data9, 9);
        print_array(9, data9);
    }

    return 0;
}

The shuffle code implements a Fisher-Yates shuffle, and is based on code from an answer by Roland Illig. If invoked without a seed argument, it generates the same output each time.

Code compiled and tested on macOS Sierra 10.12.1 with GCC 6.2.0.

An example output:

Seed: 123456789
 3 1 2
 3 2 1
 6 0 1 5 2
 6 5 2 1 0
 0 6 1 2 5
 6 5 2 1 0
 0 1 2 6 5
 6 5 2 1 0
 5 0 6 1 2
 6 5 2 1 0
 1 6 5 2 0
 6 5 2 1 0
 0 4 8 3 7 5 1 6 2
 8 7 6 5 4 3 2 1 0
 7 4 0 5 6 8 3 2 1
 8 7 6 5 4 3 2 1 0
 1 2 7 5 0 8 3 6 4
 8 7 6 5 4 3 2 1 0
 3 8 7 5 2 1 0 6 4
 8 7 6 5 4 3 2 1 0
 1 4 2 6 3 0 7 5 8
 8 7 6 5 4 3 2 1 0
 2 3 7 4 8 0 5 6 1
 8 7 6 5 4 3 2 1 0
 3 4 5 8 6 2 0 7 1
 8 7 6 5 4 3 2 1 0
 3 6 7 4 8 2 5 1 0
 8 7 6 5 4 3 2 1 0
 0 8 7 3 4 6 5 1 2
 8 7 6 5 4 3 2 1 0

This shows the data being sorted in descending order every time, despite different randomized inputs.

cs50 pset3 sort function

2 Answers2