Method that accepts both float[] and double[] arrays

Question

I have a method that accepts an array (float or double), start and end index and then does some element manipulations for indexes in startIndex to endIndex range.

Basically it looks like this:

public void Update(float[] arr, int startIndex, int endIndex)
{
   if (condition1)
   {
     //Do some array manipulation
   }
   else if (condition2)
   {
     //Do some array manipulation
   }
   else if (condition3)
   {
       if (subcondition1)
       {
         //Do some array manipulation
       }
   }
}

Method is longer than this, and involves setting some elements to 0 or 1, or normalizing the array. The problem is that I need to pass both float[] and double[] arrays there, and don't want to have a duplicated code that accepts double[] instead.

Performance is also critical, so I don't want to create a new double[] array, cast float array to it, perform calcs, then update original array by casting back to floats.

Is there any solution to it that avoids duplicated code, but is also as fast as possible?

I suggest you create the `double[]` version, make it as fast as necessary, then see how much slower it is with `float[]`. There may not be a significant difference in performance. In the meantime, you will have spent your time making the `double[]` version faster - maybe fast enough to overcome any performance decrease from converting from `float[]` to `double[]`. — John Saunders, Dec 30 '14 at 00:18
@JohnSaunders Method needs to accept both `float[]` and `double[]` arrays, and `Update(double[] array)` does not accept `float[]` as an argument. — Michal, Dec 30 '14 at 00:24
I didn't say anything about normalizing the array. But you can try simply converting the `float[]` to `double[]` inline. — John Saunders, Dec 30 '14 at 00:25
@IllidanS4 Pseudocode: `sum = arr.Sum(startIndex, endIndex) ;foreach(i in startIndex..endIndex) arr[i] /= sum;` — Michal, Dec 30 '14 at 00:27
@Michal I meant that in your question. If you are just setting elements to 0 or 1, it's fairly easy to make it generic. Edit: Ah, this may be harder. — IS4, Dec 30 '14 at 00:27
@IllidanS4: you won't be able to do it with generics. Generics and numerics don't mix well. You will only be able to do operations which would work on type `object`. — John Saunders, Dec 30 '14 at 00:29
The most straightforward solution is to use a C++ template to do the actual work, and a C++/CLI `ref class` to make it easy to use from C#. Unlike generics, templates work great with operator overloads and other operations dependent on duck typing. — Ben Voigt, Dec 30 '14 at 00:45
There is no good performant way to accomplish this in C#. @BenVoigt's suggestion may end up giving you a solution with the least amount of duplicated code. — Cory Nelson, Dec 30 '14 at 01:12

Joel Coehoorn · Answer 1 · 2014-12-30T01:43:53.533

You have a few options. None of them match exactly what you want, but depending on what kind of operations you need you might get close.

The first is to use a generic method where the generic type is restricted, but the only operations you can do are limited:

public void Update<T>(T[] arr, int startIndex, int endIndex) : IComarable
{
   if (condition1)
   {
     //Do some array manipulation
   }
   else if (condition2)
   {
     //Do some array manipulation
   }
   else if (condition3)
   {
       if (subcondition1)
       {
         //Do some array manipulation
       }
   }
}

And the conditions and array manipulation in that function would be limited to expressions that use the following forms:

if (arr[Index].CompareTo(arr[OtherIndex])>0)
arr[Index] = arr[OtherIndex];

This is enough to do things like find the minimum, or maximum, or sort the items in the array. It can't do addition/subtraction/etc, so this couldn't, say, find the average. You can make up for this by creating your own overloaded delegates for any additional methods you need:

public void Update<T>(T[] arr, int startIndex, int endIndex, Func<T,T> Add) : IComarable
{
   //...
   arr[Index] = Add(arr[OtherIndex] + arr[ThirdIndex]);
}

You'd need another argument for each operation that you actually use, and I don't know how that will perform (that last part's gonna be a theme here: I haven't benchmarked any of this, but performance seems to be critical for this question).

Another option that came to mind is the dynamic type:

public void Update(dynamic[] arr, int startIndex, int endIndex)
{
     //...logic here
}

This should work, but for something called over and over like you claim I don't know what it would do to the performance.

You can combine this option with another answer (now deleted) to give back some type safety:

public void Update(float[] arr, int startIndex, int endIndex)
{
    InternalUpdate(arr, startIndex, endIndex);
}
public void Update(double[] arr, int startIndex, int endIndex)
{
    InternalUpdate(arr, startIndex, endIndex);
}
public void InternalUpdate(dynamic[] arr, int startIndex, int endIndex)
{
     //...logic here
}

One other idea is to cast all the floats to doubles:

public void Update(float[] arr, int startIndex, int endIndex)
{
    Update( Array.ConvertAll(arr, x => (double)x), startIndex, endIndex);
}

public void Update(double[] arr, int startIndex, int endIndex)
{
   //...logic here
}

Again, this will re-allocate the array, and so if that causes a performance issue we'll have to look elsewhere.

If (and only if) all else fails, and a profiler shows that this is a critical performance section of your code, you can just overload the method and implement the logic twice. It's not ideal from a code maintenance standpoint, but if the performance concern is well-established and documented, it can be the worth the copy pasta headache. I included a sample comment to indicate how you might want to document this:

/******************
   WARNING: Profiler tests conducted on 12/29/2014 showed that this is a critical
            performance section of the code, and that separate double/float
            implementations of this method produced a XX% speed increase.
            If you need to change anything in here, be sure to change BOTH SETS,
            and be sure to profile both before and after, to be sure you
            don't introduce a new performance bottleneck. */

public void Update(float[] arr, int startIndex, int endIndex)
{
    //...logic here
}

public void Update(double[] arr, int startIndex, int endIndex)
{
    //...logic here
}

One final item to explore here, is that C# includes a generic ArraySegment<T> type, that you may find useful for this.

Your points are valid, but using dynamic keyword will cause performance issues, which is unfortunately completely unacceptable in my program. — Michal, Dec 30 '14 at 00:59
I was working on some edits. Hopefully something in there works out. Also, I know there's a way to do this via delegates and type inference. I just haven't been able to make it work yet. — Joel Coehoorn, Dec 30 '14 at 01:03
@BenVoigt I hadn't actually read that comment, but of course you're right. — Joel Coehoorn, Dec 30 '14 at 01:43

Erti-Chris Eelmaa · Answer 2 · 2014-12-30T01:48:50.727

Just an idea. I have no idea what the performance implications are, but this helped me to go to sleep :P

public void HardcoreWork(double[] arr){HardcoreWork(arr, null);}
public void HardcoreWork(float[] arr){HardcoreWork(null, arr);}

public struct DoubleFloatWrapper
{
    private readonly double[] _arr1;
    private readonly float[] _arr2;

    private readonly bool _useFirstArr;

    public double this[int index]
    {
        get {
            return _useFirstArr ? _arr1[index] : _arr2[index];
        }
    }

    public int Length
    {
        get {
            return _useFirstArr ? _arr1.Length : _arr2.Length;
        }
    }

    public DoubleFloatWrapper(double[] arr1, float[] arr2)
    {
        _arr1 = arr1;
        _arr2 = arr2;
        _useFirstArr = _arr1 != null;
    }
}

private void HardcoreWork(double[] arr1, float[] arr2){

    var doubleFloatArr = new DoubleFloatWrapper(arr1, arr2);
    var len = doubleFloatArr.Length;

    double sum = 0;

    for(var i = 0; i < len; i++){
        sum += doubleFloatArr[i];
    }
}

Don't forget that if the amount of elements you have is ridiculously small, you can just use pooled memory, which will give you zero memory overhead.

ThreadLocal<double[]> _memoryPool = new ThreadLocal<double[]>(() => new double[100]);

private void HardcoreWork(double[] arr1, float[] arr2){

    double[] array = arr1;
    int arrayLength = arr1 != null ? arr1.Length : arr2.Length;

    if(array == null)
    {
        array = _memoryPool.Value;

        for(var i = 0; i < arr2.Length; i++)
            array[i] = arr2[i];
    }


    for(var i = 0; i < 1000000; i++){
        for(var k =0; k < arrayLength; k++){
            var a = array[k] + 1;
        }
    }
}

I'm pretty sure performance will be atrocious. You've added conditionals evaluated *inside* the loop, as well as forced runtime bounds checking. — Ben Voigt, Dec 30 '14 at 01:17

Quality Catalyst · Answer 3 · 2014-12-30T01:17:54.027

2

What about implementing the method using generics? An abstract base class can be created for your core business logic:

abstract class MyClass<T>
{
    public void Update(T[] arr, int startIndex, int endIndex)
    {
        if (condition1)
        {
            //Do some array manipulation, such as add operation:
            T addOperationResult = Add(arr[0], arr[1]);
        }
        else if (condition2)
        {
            //Do some array manipulation
        }
        else if (condition3)
        {
            if (subcondition1)
            {
                //Do some array manipulation
            }
        }
    }

    protected abstract T Add(T x, T y);
}

Then implement per data type an inheriting class tuned to type-specific operations:

class FloatClass : MyClass<float>
{
    protected override float Add(float x, float y)
    {
        return x + y;
    }
}

class DoubleClass : MyClass<double>
{
    protected override double Add(double x, double y)
    {
        return x + y;
    }
}

edited Dec 30 '14 at 01:17

answered Dec 30 '14 at 00:20

Quality Catalyst

6,531
8
38
62

3

Does not work for arr[0] = arr[1]+arr[2] and similar. I wish I could write `Update(...) where T: INumeric` or something like this – Michal Dec 30 '14 at 00:24
If you define T in the class that contains your Update method you can restrict T to any interface you want: class MyClass where T : IMyInterface { ... public void Update(T[] arr, int startIndex, int endIndex) { // do your stuff here } } – Quality Catalyst Dec 30 '14 at 00:39
Alternatively, create an abstract base class and define two inheriting classes one that defines T to be float, the other one defines T to be double. Specific operations such as adding array values can be implemented in the inheriting classes. – Quality Catalyst Dec 30 '14 at 00:43
@JohnSaunders: Sure I have. Example enhanced. – Quality Catalyst Dec 30 '14 at 01:18
All those virtual calls would be horrendously slow if not for branch prediction. As it is, they'll still be slow. – Ben Voigt Dec 30 '14 at 01:27
@BenVoigt: Don't understand this comment. Virtual calls? After compilation, in the IL you will find two separate classes which are type-safe (either for float or double). The only trade-off is that if an operator needs to be called a method call is needed (memory and performance overhead). Here's a decision to make whether or not redundant-free code or higher performance is more important. – Quality Catalyst Dec 30 '14 at 01:33
@QualityCatalyst: Yes, virtual calls. You do know that the `abstract` and `override` keywords make a method virtual, right? – Ben Voigt Dec 30 '14 at 01:38
@BenVoigt: check this post (slightly different scenario): http://stackoverflow.com/questions/530799/what-are-the-performance-implications-of-marking-methods-properties-as-virtual The interesting fact is what the CLR does behind the scenes and apparently, there isn't a significant difference thanks to compilers optimizing the code. – Quality Catalyst Dec 30 '14 at 01:48
@QualityCatalyst: From the answer you linked: **The real overhead comes in that virtual functions can't be inlined.** (actually this idea is repeated in four or five of the answers) – Ben Voigt Dec 30 '14 at 01:50
@BenVoigt: two register moves per call. There is somewhere mentioned the method is invoked only 20 million times. Not sure what the business logic does and how often the overridden methods need to be called; however, assuming three operations we talk 60 million register moves per second. That's the performance trade-off for the redundant-free coding. – Quality Catalyst Dec 30 '14 at 01:57
You're comparing the cost of the indirect call to a direct call. Both are immensely slower than just doing the arithmetic inside the function. Because of this, the compiler will remove the direct call (inlining), but cannot do that for the virtual call, which is indirect. So the performance delta ends up being much much bigger. – Ben Voigt Dec 30 '14 at 01:59
1

Also, the statistic was 20 million calls to the *outer* method per second. From the comments, there's at least several dozen uses of arithmetic operations in each one of those, which is going to result in hundreds of millions or even billions of calls to your helper functions per second. That will hurt. – Ben Voigt Dec 30 '14 at 02:01

score 2 · Answer 4 · answered Dec 30 '14 at 01:37

John's comment about macros, although completely inaccurate characterization of C++ templates, got me thinking about the preprocessor.

C#'s preprocessor is nowhere near as powerful as C's (which C++ inherits), but it still is able to handle everything you need except the duplication itself:

partial class MyClass
{
#if FOR_FLOAT
    using Double = System.Single;
#endif

    public void Update(Double[] arr, int startIndex, int endIndex)
    {
        // do whatever you want, using Double where you want the type to change, and
        // either System.Double or double where you don't
    }
}

Now, you need to include two copies of the file in your project, one of which has an extra

#define FOR_FLOAT

line at the top. (Should be fairly easy to automate adding that)

Unfortunately, the /define compiler option applies to the entire assembly, not per-file, so you can't use a hardlink to include the file twice and have the symbol only defined for one. However, if you can tolerate the two implementations being in different assemblies, you can include the same source file into both, using the project options to define FOR_FLOAT in one of them.

I still advocate using templates in C++/CLI.

John Saunders · Answer 5 · 2014-12-30T00:36:29.920

0

Most code isn't so performance-critical that taking the time to convert from float to double and back causes a problem:

public void Update(float[] arr, int startIndex, int endIndex)
{
    double[] darr = new double[arr.Length];
    for(int i=startIndex; i<endIndex; i++)
        darr[i] = (double) arr[i];
    Update(darr, startIndex, endIndex);
    for(int j=startIndex; j<endIndex; j++)
        arr[j] = darr[j];
}

Here's a thought experiment. Imagine that instead of copying, you duplicated the code of the double[] version to make a float[] version. Imagine that you optimized the float[] version as much as necessary.

Your question is then: does the copying really take that long? Consider that instead of maintaining two versions of the code, you could spend your time improving the performance of the double[] version.

Even if you had been able to use generics for this, it's possible that the double[] version would want to use different code from the float[] version in order to optimize performance.

edited Dec 30 '14 at 00:36

answered Dec 30 '14 at 00:28

John Saunders

160,644
26
247
397

This method gets invoked roughly 20 million times per second (in total on 32 threads). Although startIndex..endIndex subarray is usually 2-5 elements in size, constant array allocation and deallocations will cause GC overhead. I'm caching almost everything in the program to prevent constant object construction/destruction that is going on, and it improved performance massively. But I haven't actually profiled your proposed update... – Michal Dec 30 '14 at 00:39
3

I believe you're using the wrong language. For that sort of performance, you should be using C++. BTW, C++ templates _will_ allow you to use the same code for both, as they behave similarly to compile-time macros. – John Saunders Dec 30 '14 at 00:46
@JohnSaunders: Except that templates are very type-aware, so the similarity to macros is superficial at best. – Ben Voigt Dec 30 '14 at 00:48
@Ben I mean that in a template (if I remember correctly), arithmetic operators will work. – John Saunders Dec 30 '14 at 00:53
@JohnSaunders: Definitely, and I had just left a comment to that effect on the question when I saw yours. But that doesn't make it text substitution. – Ben Voigt Dec 30 '14 at 00:55
@Ben I didn't say it was simple text substitution. It's _sophisticated_ text substitution. – John Saunders Dec 30 '14 at 00:56
@JohnSaunders: No, it isn't text substitution at all. – Ben Voigt Dec 30 '14 at 00:56
@BenVoigt: ok, end of nit-picking. Call it syntactic substitution, or use whatever analogy you like to show how templates differ from generics, especially with respect to numerics. The point is, the OP could probably do what he wants with templates, but cannot with generics. – John Saunders Dec 30 '14 at 00:58

Method that accepts both float[] and double[] arrays

5 Answers5