How can I implement multiple versions of the same algorithm while avoiding code duplication and conflicting names?

Question

I have developed the Insertion Sort and Quicksort algorithms in C++. Now, I intend to create at least four variations of the Quicksort algorithm. They will vary in how they choose pivot and whether they use insertion sort for small lists or not. In Java or C#, to avoid code duplication and conflicting names, I would implement each version of the Quicksort algorithm in a separate class file and use inheritance. Specifically, I would create the following classes:

QuicksortFixedPivot
QuicksortRandomPivot
QuicksortFixedPivotInsertion - subarrays with at most k elements are sorted using insertion sort
QuicksortRandomPivotInsertion

However, from my understanding "standalone" algorithms like Quicksort are usually not implemented in classes in C++.

Below is my initial implementation of Quicksort.

quicksort.hpp

#pragma once
#include <vector>

namespace algorithms 
{
   void quicksort(std::vector<int> &list);
   void quicksort(std::vector<int> &list, int left, int right);
   int partition(std::vector<int> &list, int left, int right);
}

quicksort.cpp

#include "quicksort.hpp"

namespace alg = algorithms;

void alg::quicksort(std::vector<int> &list)
{
   alg::quicksort(list, 0, list.size() - 1);
}

void alg::quicksort(std::vector<int> &list, int left, int right)
{
   if (left >= right) 
      return;

   int oldPivot = alg::partition(list, left, right);
   alg::quicksort(list, left, oldPivot - 1);
   alg::quicksort(list, oldPivot + 1, right);
}

int alg::partition(std::vector<int> &list, int left, int right)
{
   int pivot = list[left + (right-left)/2];
   while (true)
   {
      while (list[left] < pivot)
         left++;
      while (list[right] > pivot)
         right--;
      if (left >= right)
         return left;
      std::swap(list[left], list[right]);
   }
}

With the above background in mind, I have two questions.

Firstly, for being a single Quicksort implementation, have I used header files appropriately and structured my code in a good way? For example, is it good that the algorithm is outside of a class?

Secondly, how should I approach creating different versions of this algorithm while avoiding code duplication and naming conflicts? For instance, should I use classes or some other language construct?

I would appreciate if the answer included minimal code snippets that elucidate how any relevent language constructs should be used to achieve neat code.

Well, you would implement them as classes in java, but not in C++...? Just create the class interface with one `virtual int quicksort(...);` (or even maybe `class quicksort { virtual void operator()( .. ); }`) and create four classes which implement the interface. Or you can use function pointers to pass different implementation around... What is the use case? How would you want to switch between different different implementations? Why the name conflicts matter, if you wanted to create 4 classes anyway? — KamilCuk, Feb 27 '19 at 08:52
Why can't you just overload your quick-sort function to take variable parameters depeding on each of your four cases? — Inian, Feb 27 '19 at 08:55
@KamilCuk The reason I was concerned about implementing the Quciksort variations as classes is because I was told: "using a class for an algorithm is not correct in C++, just use a function instead." — mooncow, Feb 27 '19 at 08:58
I don't see a problem with using a class/object based approach although it may be somewhat overkill to have several classes to implement what is essentially the same algorithm. A single class that could switch on an enumeration of the quicksort variation would do the trick — Tom, Feb 27 '19 at 08:59
@Inian There is no need for variable parameters for solving my problem. All four implementations take the same parameters, they just differ in how they compute the pivot and whether they use insertion sort for small lists. — mooncow, Feb 27 '19 at 09:04
@mooncow, Your current approach (using namespaces and free functions) is a good one. Alternatively, you can use a class with all static methods, which is essentially the same thing. Even in Java, I'd use a class with static methods (since everything must be inside a class in Java) — dsp_user, Feb 27 '19 at 09:09
Related: [How to implement classic sorting algorithms in modern C++?](https://stackoverflow.com/questions/24650626/how-to-implement-classic-sorting-algorithms-in-modern-c) — Caleth, Feb 27 '19 at 09:40

Quimby · Accepted Answer · 2019-02-27T09:25:58.837

You can do it similarly how std does that with e.g. execution policy for algorithms. It uses tagging which can be done easily with structs:

#include <iostream>

struct versionA{};
struct versionB{};

int fooA(versionA tag, int param)
{
    (void)tag;//Silences "unused parameter" warning
    return 2*param;
}

int fooB(versionB tag,int param)
{
    (void)tag;
    return 5*param;
}

int main()
{
    std::cout<<fooA(versionA{},5)<<'\n';
    std::cout<<fooB(versionB{},5)<<'\n';
//Outputs:
//10
//25
}

Compiler can optimize out these empty unused structs so there is no harm in this. Alternative would be to use templates with the template parameter being the tag type and fully specializing them for individual versions. But this approach has few disadvantages - leaks implementation to header files and template function cannot be partially specialized, which wouldn't play well if the algorithms itself need template parameters. And templates mixed with overloads might not always lead to calling the expected function.

If the {} in the function calls bothers you, you can make global instances of those structs and pass them instead(by copy).

To answer you first question: Yes, you used them correctly. Very small note - #pragma once is not standard C++, but all standard compilers support it anyway. Proper alternative is to use include guards.

Full example with tagging:

// header file
#include <vector>

namespace algorithms 
{
namespace ver
{

    struct FixedPivot_tag{};
    struct RandomPivot_tag{};

    const extern FixedPivot_tag FixedPivot;
    const extern RandomPivot_tag RandomPivot;
}

void quicksort(ver::FixedPivot_tag tag, std::vector<int> &list, int left, int right);
void quicksort(ver::RandomPivot_tag tag, std::vector<int> &list, int left, int right);
}
// cpp file
namespace algorithms
{
    namespace ver
    {
        constexpr const FixedPivot_tag FixedPivot{};
        constexpr const RandomPivot_tag RandomPivot{};
    }

void quicksort(ver::FixedPivot_tag tag, std::vector<int> &list, int left, int right)
{
    (void)tag;
    //...
}
void quicksort(ver::RandomPivot_tag tag, std::vector<int> &list, int left, int right)
{
    (void)tag;
    //...
}
}
// usage
int main()
{
    std::vector <int> vector{5,4,3,2,1};
    using namespace algorithms;
    quicksort(ver::FixedPivot,vector,0,4);
    quicksort(ver::RandomPivot,vector,0,4);
}

Based on your experience, what advantages/disadvantages does this approach have to creating pure virtual functions and four classes that implement them? — mooncow, Feb 27 '19 at 09:34
@mooncow It's much harder for the compiler to optimise through virtual functions if you are actually using them virtually. Either way you are passing around an extra parameter, although tags force the whole dependency chain to be templated — Caleth, Feb 27 '19 at 09:38
@Caleth So in a situation like mine, it would generally be more efficient to use tagging over virtual functions? — mooncow, Feb 27 '19 at 09:40
@mooncow it might be that you are choosing between `RandomPivot{}.quicksort(args...)` and `quicksort(RandomPivot{}, args...)`. In that case there is (probably) no difference — Caleth, Feb 27 '19 at 09:43
@mooncow It's compile time - you know exactly what function will be called. That is also a disadvantage, there is no way, to decide that during run-time apart from `if-else`. Its simple in a sense that they are what they are meant to be - functions. I personally feel like I shouldn't have to make an empty instance of a class to call a function that doesn't need that instance at all. But that said it can absolutely be implemented using virtuals without any real disadvantages. Also a third option is to use an enum and switch inside. — Quimby, Feb 27 '19 at 09:45
Virtual call overhead is not noticeable until you make 1000+ calls per second (probably not likely) and even then the call time will be nothing compared to duration of whole quicksort algorithm. Yes, it won't be 100% efficient but it won't matter at all. I would strongly prefer design here over performance. — Quimby, Feb 27 '19 at 09:49

Handy999 · Answer 2 · 2019-02-27T14:14:29.013

If you think that there could be different parameters (like pivot and insertion) which can be chosen independently, then you should consider enums (or better enum classes).

You could pass the information either as a parameter (with runtime dispatch using a standard if) or as a template parameter (with compile-time dispatch using if constexpr). The code would be as follows:

namespace alg {

enum class pivot { first=0; last=1; middle=2};
enum class insertion { off=0; on=1 };

template <pivot p, insertion i>
int partition(std::vector<int> &list, int left, int right)
{
    int pivot;
    if constexpr (p==pivot::first)
        pivot = list[left];
    else if constexpr (p==pivot::last)
        pivot = list[right];
    else // if constexpr (p==pivot::first)
        pivot = list[left + (right-left)/2];

    while (true)
    {
        while (list[left] < pivot)
            left++;
        while (list[right] > pivot)
            right--;
        if (left >= right)
            return left;
        std::swap(list[left], list[right]);
    }
}

template <pivot p, insertion i>
void quicksort_section( std::vector<int>& list, int begin, int end )
{
   if (left >= right) 
       return;

   int oldPivot = partition(list, left, right);
   quicksort_section(list, left, oldPivot - 1);
   quicksort_section(list, oldPivot + 1, right);
}

template <pivot p, insertion i>
void quicksort(std::vector<int> &list)
{
    quicksort_section<p,i>(list, 0, list.size() - 1);
}

} // namespace alg

Note that templates are typically implemented in the header files.

This solves all your need. It is extendable and it avoids code duplication, i.e., you do not have to treat all N_pivot * N_insertion possibilities seperately.

If you use an old C++ version (pre C++17), you could just remove the constexpr (using some macro trick) because any reasonable compiler would eliminate occurring dead code.

This sounds like it might be a good idea, but I do not understand how to set it up. Do you know where I can read about it or could you provide more detail? — mooncow, Feb 27 '19 at 13:55
I do not know exactly what the question is about. Do you want information about how to set up templates? Or about if constexpr? — Handy999, Feb 27 '19 at 14:07

How can I implement multiple versions of the same algorithm while avoiding code duplication and conflicting names?

2 Answers2