
I am trying to plot the theoretical time complexity alongside measured values for quicksort. The plots look good when I input increasing, decreasing, or constant data series (they follow N^2 nicely). The problem arises when I enter a random input stream.

For quicksort on random data, the expected time complexity is N log N, but my data points suggest that it's N^2. I have tried different implementations of the partitioning step, and I've made sure that the elements actually get sorted.
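
(For context, the standard analysis I am comparing against: with balanced splits the recurrence is T(N) = 2*T(N/2) + O(N), which solves to O(N log N); if one side of every split is nearly empty it becomes T(N) = T(N-1) + O(N) = O(N^2). So a quadratic curve on random input means the partitions are, for some reason, consistently unbalanced.)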

Here is my implementation of quicksort and the partition step:

template<typename BidirectionalIterator, typename Compare>
BidirectionalIterator partition_impl(BidirectionalIterator first,
                                     BidirectionalIterator last, Compare cmp) {
    auto pivot = std::prev(last, 1);
    if (first == pivot) return first;
    while (first != last) {
    while (cmp(*first, *pivot)) { ++first; } // scan right while *first < pivot
    do { --last; } while (!cmp(*last, *pivot) && last >= first); // scan left while *last >= pivot
        if (first > last) {     // if left iterator has surpassed right iterator
            std::iter_swap(first, pivot);
            return first;
        }
        if (*first > *last) {
            std::iter_swap(first, last);
            ++first;
        }
    }
    std::iter_swap(first, pivot);
    return first;
}

template <typename FwdIt, typename Comp = std::less<>>
void quick_sort(FwdIt first, FwdIt last, Comp cmp = Comp{}) {
    if (std::distance(first, last) <= 1) return;

    FwdIt split = partition_impl(first, last, cmp);

    quick_sort(first, split, cmp);  // sorts the left partition
    quick_sort(split + 1, last, cmp);   // sorts the right partition
}

So my question: is there anything in my implementation that indicates what the problem may be?

This is what my graph looks like: [graph of measured times (ms) against N, with the random-input series tracking the N^2 curve]

EDIT: Some additional code:

void measure(std::vector<int>& ds) {
    Timer::start();
    algo_ptr(ds.begin(), ds.end(), std::less<>());
    Timer::stop();
}

void collectSamples(data_series_t& data_series) {
    for (int k = 0; k < NUM_OF_SAMPLES; ++k) {
        auto ds = std::get<1>(data_series)(x_measurement_point);
        measure(ds);
        measurements.push_back(Timer::elapsedTime());
    }
}

The algo_ptr goes directly to my implementation above. For every data series I collect 10 samples, and on each iteration I make sure to generate a new random data series so that I don't sort already-sorted elements.
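
For reference, a sketch of what such a per-iteration generator might look like (the function name and parameters are my own illustration, not my actual code):

#include <cstddef>
#include <random>
#include <vector>

// Build a fresh random data series of n elements for each sample, so no
// sample ever runs on already-sorted data.
std::vector<int> make_random_series(std::size_t n, int max_value) {
    static std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<int> dist(0, max_value);
    std::vector<int> v(n);
    for (auto& x : v) x = dist(rng);
    return v;
}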

proxylite
  • Your green line doesn’t look like an N log N curve. Why don’t you plot time taken against N log N, instead of against N? BTW, 100000 isn’t that large. – lastchance May 05 '23 at 07:07
  • Are you using compiler optimisations? 25 seconds seems pretty slow to sort 100,000 elements. How are you measuring the execution time? Are you including the time to create the data? – Alan Birtles May 05 '23 at 07:13
  • @AlanBirtles I see `[ms]` in the graph, not seconds ... – Spektre May 05 '23 at 07:14
  • I'm fairly sure it is n log n; I know that it looks linear, but from what I understand a linear curve and an n log n curve aren't dramatically different. I use the following function to get the logarithmic curve (used in Python for plotting): `def logarithmic(N, k): return k * N * np.log2(N)` – proxylite May 05 '23 at 07:15
  • @AlanBirtles I am running in release mode, so there is no extra delay. No, I am not including the time to create the data, only the algorithm above is timed. – proxylite May 05 '23 at 07:19
  • https://en.wikipedia.org/wiki/Quicksort see section "Worst-case performance". – Marek R May 05 '23 at 07:20
  • Your measurements (for smaller N) might be distorted by the CACHE; you could try to **flush the DATA and instruction CACHEs** before each quicksort measurement ... see [flush CACHE from C++](https://stackoverflow.com/a/21517601/2521214) – Spektre May 05 '23 at 07:20
  • @lastchance well, for every data series I take 10 samples with an increasing number of input elements. This is a school assignment and that's how we were supposed to do it. Also, I have tried doing this for up to a million elements, and the result is the same; it doesn't seem to matter what the input size is. – proxylite May 05 '23 at 07:20
  • @Spektre Hmm I haven't heard of that being a potential problem, but I might look into it. – proxylite May 05 '23 at 07:26
  • @MarekR Yes, I know that it will be the worst case if the pivot is the largest or smallest number, but it seems unlikely that this would become the worst case given that the pivot is chosen essentially at random. I pick the last element as pivot, and the data is already random. – proxylite May 05 '23 at 07:27
  • @proxylite it simply greatly affects computational speed for low N, so the measured curve might not be related to quicksort at all; once you measure data chunks bigger than the CACHE size it no longer matters – Spektre May 05 '23 at 07:27
  • Measuring such small times is no good for seeing the complexity, because of the inaccuracy in measuring such small timings. I recommend using a much bigger n, to get timings at least in the two-digit-seconds range. – MrSmith42 May 05 '23 at 07:29
  • @proxylite btw this might interest you [Measuring/estimating complexity](https://stackoverflow.com/a/64784308/2521214) – Spektre May 05 '23 at 07:35
  • You seem to do repeated tests per sample size. Make sure you do not run the repetitions on the already sorted data, because then you get the worst-case behavior. – j6t May 05 '23 at 07:36
  • @MrSmith42 I have tried with larger input sizes, and there was no difference. Now, I could try to go for millions of input elements, but that will take some time to execute. I can try it, though. – proxylite May 05 '23 at 07:43
  • @j6t, I updated my question to add code that shows that I don't re-sort sorted data – proxylite May 05 '23 at 07:44
  • @proxylite Compare your version to the one found [here](https://stackoverflow.com/questions/24650626/how-to-implement-classic-sorting-algorithms-in-modern-c). – PaulMcKenzie May 05 '23 at 08:07
  • As far as I can tell, the `do { --last; } while (!cmp(*last, *pivot) && last >= first);` may go out-of-bounds. – molbdnilo May 05 '23 at 08:09
  • What is `Timer`? Without a [mcve] it's not possible to explain the output in detail – 463035818_is_not_an_ai May 05 '23 at 09:08
  • Time complexity is a blunt instrument, it describes behaviour as values go to infinity. As such you *can't* measure it, you can only calculate it – Caleth May 05 '23 at 10:07
  • `while (cmp(*first, *pivot)) { ++first; }` can go out of bounds, too – Matt Timmermans May 05 '23 at 11:50
  • I posted my own answer, thanks everyone for the help! – proxylite May 05 '23 at 15:02
  • always plot [at log-log scale](http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth) in such cases, to be sure. then any power law will present as a straight line, with different slopes. you can also plot t/N vs N (still log-log) graphs for an even clearer picture. – Will Ness May 05 '23 at 16:51
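
To make the log-log suggestion concrete: if t ≈ c * N^b, then log t = log c + b * log N, so the slope of the line on a log-log plot is the exponent b (slightly above 1 for N log N at these sizes, 2 for quadratic). A minimal sketch of the estimate from two measurements:

#include <cmath>

// Empirical order of growth, assuming t ~ c * N^b:
// b = log(t2 / t1) / log(n2 / n1)
double empirical_order(double n1, double t1, double n2, double t2) {
    return std::log(t2 / t1) / std::log(n2 / n1);
}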

2 Answers


Thanks everyone for trying to help. I really appreciate it. I think I have found the solution.

The problem was that I had only allowed 50 different random values to occur. That is far too small a pool of random values, since I am sorting many more elements than that, so the input was full of duplicates.

This essentially caused the pivot to no longer behave like a random choice (many elements compare equal to it), which made my quicksort algorithm deteriorate into N^2.

In hindsight this seems like such an obvious thing I should have realized, but I am still a beginner with DSA.

After fixing this and making sure that I have several million different possible random values, the graph looks like this:

[graph: the measured times now follow the N log N curve]
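
For illustration, the change amounts to widening the distribution range (a sketch, not my exact code):

#include <limits>
#include <random>

// Before (the bug): only 50 distinct values, so a large input is
// mostly duplicates:
//   std::uniform_int_distribution<int> dist(0, 49);
// After (the fix): millions of distinct values, so duplicates are rare:
std::uniform_int_distribution<int> dist(0, std::numeric_limits<int>::max());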

proxylite
  • Three-way partitioning wouldn't have this problem on that input, because then the *equals* part need not be sorted. It would also be *much* faster on such cases (with many, many repeated elements shuffled around) than the usual 2-way partitioning. – Will Ness May 05 '23 at 17:00
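
A sketch of the three-way ("Dutch national flag") partition that comment describes: elements equal to the pivot end up in the middle range, which is already in place, so the recursion skips it entirely (the names here are illustrative):

#include <algorithm>
#include <utility>

// Returns the [eq_first, eq_last) range of elements equal to the pivot.
// With many duplicates the equal range is large and is never recursed into.
// The pivot is taken by value so the swaps cannot move it mid-partition.
template <typename RandomIt, typename T, typename Compare>
std::pair<RandomIt, RandomIt>
partition3(RandomIt first, RandomIt last, T pivot, Compare cmp) {
    RandomIt lt = first;  // [first, lt): elements < pivot
    RandomIt it = first;  // [lt, it):    elements == pivot
    RandomIt gt = last;   // [gt, last):  elements > pivot
    while (it != gt) {
        if (cmp(*it, pivot))      std::iter_swap(lt++, it++);
        else if (cmp(pivot, *it)) std::iter_swap(it, --gt);
        else                      ++it;
    }
    return {lt, gt};
}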

First, there is a bug in partition_impl that can cause a fault:

do { --last; } while (!cmp(*last, *pivot) && last >= first);

The last >= first comparison needs to be done before the cmp call to protect against calling it with an out-of-range iterator. Interestingly, this caused a fault when running on arm64, but not on x86.
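
That is, keeping the rest of the loop unchanged and relying on short-circuit evaluation, the reordered condition would look like this:

do { --last; } while (last >= first && !cmp(*last, *pivot)); // bounds check first, then dereference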

Second, here are performance comparisons between std::sort and quick_sort. The first graph shows the elapsed time when sorting 1M to 20M elements, and the second graph shows the number of comparisons each algorithm requires.

[graphs: sort times and comparison counts, std::sort vs. quick_sort]

I suspect your algorithm has O(n log n) behavior, but the library version does a better job of choosing a pivot (I believe it uses a median-of-three approach), which yields a speed improvement that is probably just a constant factor.
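
For illustration, a median-of-three pivot selection might look like the sketch below (my own illustration, not the actual library code):

#include <algorithm>

// Sort *first, *mid, *back with three compare-and-swaps; the median ends
// up at mid, which is then a safer pivot than always taking the last element.
template <typename RandomIt, typename Compare>
RandomIt median_of_three(RandomIt first, RandomIt last, Compare cmp) {
    RandomIt mid = first + (last - first) / 2;
    RandomIt back = last - 1;
    if (cmp(*mid, *first)) std::iter_swap(mid, first);
    if (cmp(*back, *mid))  std::iter_swap(back, mid);
    if (cmp(*mid, *first)) std::iter_swap(mid, first);
    return mid;
}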

RandomBits
  • How did you get their data so you were able to do equivalent testing? – Kelly Bundy May 05 '23 at 14:37
  • I generated random data -- so it is probably not the same exact data as OP, but it should behave the same. – RandomBits May 05 '23 at 14:56
  • Thank you @RandomBits. I know that the while condition can go out of bounds. I deleted that condition from my algorithm and it seems to work just the same. In any case, I have posted my own answer because I think I have solved the problem. – proxylite May 05 '23 at 15:00
  • @proxylite You are welcome. However, you have multiple indexing bugs in the code you posted, with or without the condition. The problem can be demonstrated if you try to sort the vector { 2, 1 }. In the partition function, the line in question will compare the elements at positions (1, 1), then (0, 1), and then (-1, 1), the last of which is out of bounds. Depending on the vagaries of your platform, this may only rarely surface as a problem, but the code is not correct. – RandomBits May 05 '23 at 15:32
  • @greybeard In my opinion it doesn't answer the question. It doesn't explain the apparently quadratic behaviour, and doesn't even reproduce it. – Kelly Bundy May 05 '23 at 17:33
  • @KellyBundy Correct, I do not reproduce the quadratic behavior. Indeed, I state and show that there is no quadratic behavior for random data. And, as it turns out, one of the issues was that the OP wasn't actually using the full range of random data. To me, that seems pretty in line with answering the question. – RandomBits May 05 '23 at 17:44
  • "I can't reproduce it" isn't an answer. – Kelly Bundy May 05 '23 at 18:00
  • @KellyBundy I agree with your statement, but don't think that is a fair representation of my answer. I gave a detailed explanation of how I got a different result using the code he posted and following the method he described. I think that was the most constructive feedback possible given the actual problem was in code that was not posted. – RandomBits May 05 '23 at 18:24
  • I think the most (or rather only) constructive feedback possible was to ask for an MRE (and then wait for them to provide it). – Kelly Bundy May 05 '23 at 18:38
  • In a lot of cases, asking for an MRE is necessary and appropriate. When I started investigating this question, I was working under the assumption that the OP had provided an MRE and I presume the OP thought the same -- he gave a very reasonable description of the problem details. Only after I had done some significant analysis and made a slight tweak to the code, did it become clear that I got a different result than the OP. At that point in the process, it would have been unhelpful and even arrogant to simply proclaim "can't reproduce" and ask for an MRE -- the problem was too complex. – RandomBits May 05 '23 at 19:22
  • There's nothing unhelpful/arrogant about that. And they obviously hadn't provided an MRE. How do you copy&paste&run their stuff and get results instead of a compile error? – Kelly Bundy May 05 '23 at 19:36
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/253503/discussion-between-randombits-and-kelly-bundy). – RandomBits May 05 '23 at 20:25