Is there a way to replace two similar functions that is different only by type by a single solution in this case?

Question

There's an class named PlotCurve. It describes a chart as a container of points and operations on them. A data for PlotCurve is gotten from the class RVDataProvider. Important thing is that the amount of points that is provided by RVDataProvider may be big (more than 1kk) so RVDataProvider returns a read-only pointer to Y data (X data can be calculated by index of the pointer) to improve the perfomance.

The main problem is that RVDataProvider has two different methods for two types:

class RVDataProvider : public QObject, public IRVImmutableProvider
{
public:
    // ...
    ReadonlyPointer<float>  getSignalDataFloat(int signalIndex, quint64 start, quint64 count) override;
 ReadonlyPointer<double> getSignalDataDouble(int signalIndex, quint64 start, quint64 count) override;
    // ...
}

ReadonlyPointer<T> is only a read-only wrapper of a C-style pointer.

In order to get a curve's range of values (for looking for min-max, painting them on the canvas, etc) I am supposed to declare different functions too.

class PlotCurve : public QObject
{
public:
    // ...` 
    virtual ReadonlyPointer<float> getFloatPointer(quint64 begin, quint64 length) const;
    virtual ReadonlyPointer<double> getDoublePointer(quint64 begin, quint64 length) const;  
    // ...
}

It leads to using switch statement in the client code and its changes if the new available type of data is added.

switch (dataType())
{
    case RVSignalInfo::DataType::Float: {
        auto pointer = getFloatPointer(begin, length);
        Q_ASSERT(!(pointer).isNull()); \
        for (quint64 i = 0; i < (length); ++i) { \
            auto y = (pointer)[i]; \
            if (y < (minY)) { (minY) = y; continue; } \
            if (y > (maxY)) { (maxY) = y; } \
        }
    } break;

    case RVSignalInfo::DataType::Double: {
        auto pointer = getDoublePointer(begin, length);
        Q_ASSERT(!(pointer).isNull()); \
        for (quint64 i = 0; i < (length); ++i) { \
            auto y = (pointer)[i]; \
            if (y < (minY)) { (minY) = y; continue; } \
            if (y > (maxY)) { (maxY) = y; } \
        }
    } break;

    // ...
}

Is there a way to get rid of dependencies to a client code? Three thing came to my mind:

1) Create Iterator type that would be a wrapper of ReadonlyPointer. Nope - performance is decreased to 10+ times because of iterator's virtual functions.

2) Create a traverse method that would be perform some function to every value in some range. Nope again - the most optimized version using function pointers is two times slower than switch statement in client code.

3) Make the class PlotCurve template. In this way I can't add different PlotCurves to the one container like it is now.

if you have so many points, why are you duplicating them all in floats and doubles? Should you instead be writing a `template class RVDataProvider ...`? — Caleth, Mar 18 '19 at 08:32
@Caleth RVDataProvider contains several signals with different data types so it would be wrong to use template class. I guess it would be better if there were a template member function getSignalData instead of overloading, but this isn't my class so it seems its author thought that overloading was the best in that situation. — Nikita, Mar 18 '19 at 09:38
Surely a signal has a single data type? Or are you saying that it's hidden behind `dataType()` and you can't tell till runtime? — Caleth, Mar 18 '19 at 09:48
@Caleth Yup, a signal definitely has single data type and its data is written from a file in run-time — Nikita, Mar 18 '19 at 09:50
That sounds like you want `Signal`, and plausibly some type erasure, but not at the level of individual calculations — Caleth, Mar 18 '19 at 09:52
p.s. your example is reproducing [`std::minmax_element`](https://en.cppreference.com/w/cpp/algorithm/minmax_element) — Caleth, Mar 18 '19 at 09:53
@Caleth If I got what you mean, I can't change anything related to data providing. All I have is the RVDataProvider interface. — Nikita, Mar 18 '19 at 09:55

Scheff's Cat · Accepted Answer · 2019-03-18T08:23:06.197

Unfortunately, I don't see much which can be done for OPs problem.

At best, the similar looking parts of cases could be moved to

a macro
a function template

to prevent code duplication.

For demonstration, I resembled OPs problem with the following sample code:

enum DataType { Float, Double };

struct Data {
  std::vector<float> dataFloat;
  std::vector<double> dataDouble;
  DataType type;

  Data(const std::vector<float> &data): dataFloat(data), type(Float) { }
  Data(const std::vector<double> &data): dataDouble(data), type(Double) { }
};

With a function template, processing could look like this:

namespace {

// helper function template for process()
template <typename T>
std::pair<double, double> getMinMax(const std::vector<T> &values)
{
  assert(values.size());
  double min = values[0], max = values[0];
  for (const T &value : values) {
    if (min > value) min = value;
    else if (max < value) max = value;
  }
  return std::make_pair(min, max);
}

} // namespace

void process(const Data &data)
{
  std::pair<double, double> minMax;
  switch (data.type) {
    case Float: minMax = getMinMax(data.dataFloat); break;
    case Double: minMax = getMinMax(data.dataDouble); break;
  }
  std::cout << "range: " << minMax.first << ", " << minMax.second << '\n';
}

Live Demo on coliru

With a macro it would appear even more compact:

void process(const Data &data)
{
  std::pair<double, double> minMax;
  switch (data.type) {
#define CASE(TYPE) \
    case TYPE: { \
      assert(data.data##TYPE.size()); \
      minMax.first = minMax.second = data.data##TYPE[0]; \
      for (const double value : data.data##TYPE) { \
        if (minMax.first > value) minMax.first = value; \
        else if (minMax.second < value) minMax.second = value; \
      } \
    } break
    CASE(Float);
    CASE(Double);
#undef CASE
  }
  std::cout << "range: " << minMax.first << ", " << minMax.second << '\n';
}

Live Demo on coliru

Many people (me included) consider macros in C++ as dangerous. In opposition to everything else, macros are not subject of namespaces or scopes. This can cause confusion if any identifier becomes unexpectedly subject of preprocessing. In worst case, the unintendedly modified code passes compiler and leads to unexpected behavior in runtime. (My sad experience.)

However, this is not expected in this case (assuming the code would be part of a source file).

I would've preferred a third alternative which places the repeated code inside of process(). A lambda came in my mind but lambdas can not (yet) be templated: SO: Can lambda functions be templated?.

A local template (functor) isn't alternative. It's prohibited as well: SO: Why can't templates be declared in a function?.

After feedback of OP, a note about X macros: It's an ancient technique in C to prevent redundancy of data.

A "data table" is defined where each row is a "call" of a (here not defined) macro X which contains all features.

To use the data table:

define a macro X which uses just the arguments which are needed in the individual case (and ignores the rest)
#include the data table
#undef X.

The sample again:

void process(const Data &data)
{
  std::pair<double, double> minMax;
  switch (data.type) {
#define X(TYPE_ID, TYPE) \
    case TYPE_ID: { \
      assert(data.data##TYPE_ID.size()); \
      minMax.first = minMax.second = data.data##TYPE_ID[0]; \
      for (const double value : data.data##TYPE_ID) { \
        if (minMax.first > value) minMax.first = value; \
        else if (minMax.second < value) minMax.second = value; \
      } \
    } break;
#include "Data.inc"
#undef X
  }
  std::cout << "range: " << minMax.first << ", " << minMax.second << '\n';
}

where Data.inc is:

X(Float, float)
X(Double, double)
X(Int, int)

Live Demon on coliru

Although, this macro-trickery makes a bit scary – this is very convenient concerning maintenance. If a new data type has to be added, a new X() line in Data.inc (and, of course, a re-compile) is all what's necessary. (The compiler / build chain will hopefully consider all dependencies of sources from Data.inc. We never faced problems with this in Visual Studio.)

Template functions are not suitable in this case because the most functions of `PlotCurve` are virtual. Macro version looks better, I already did the similar thing, but anyway the new `case` has to be added to the client code when the new data type will appear. — Nikita, Mar 18 '19 at 07:36

Is there a way to replace two similar functions that is different only by type by a single solution in this case?

1 Answers1