How would I go about optimizing this code?

Question

I am writing a function to find the average of an array in which the array is mostly numbers that would overflow if added all at once.

It works by creating a subarray(b in my code) that is half the input(a in my code) array's size(ar_size in my code) and then places the average of 2 values from the input array a[i+0] and a[i+1] with no overlap into b[j].

Once it iterates through the entire input array, it reruns the function with returning the subarray and the input array size until the size equals 2 and then ends the recursion by returning the average of the two values of b[2].

Please pardon the reuse of j.

Also the size of the array is some power of two.

uint64_t* array_average(uint64_t* a, const int ar_size)
{
    uint64_t* b = new uint64_t[ar_size / 2];

    uint64_t* j = new uint64_t;

    if (ar_size == 2)
    {
     *j = (a[0] / 2) + (a[1] / 2) + ((a[0] % 2 + a[1] % 2) / 2);

     return j;
    }

    for (int i = 0; i < ar_size; i += 2)
    {
        b[*j] = (a[i + 0] / 2) + (a[i + 1] / 2) + ((a[i + 0] % 2 + a[i + 1] % 2) / 2);

        ++*j;
    }
    delete j;
    return array_average(b, ar_size / 2);
}

Also anyone have a better way to average while working with numbers that would cause an overflow to happen?

Here is a revised version:

uint64_t* tools::array_average(uint64_t* a, const int ar_size)
{
    uint64_t* b = new uint64_t[ar_size];
    uint64_t* c = new uint64_t[ar_size / 2];

    int j;
    j = 0;

    for (int i = 0; i < ar_size; ++i)
    {
        b[i] = a[i];
    }

    if (runs > 0) //This is so i do not delete the original input array I.E not done with it
    {
        delete[] a;
    }

    if (ar_size == 2)
    {
        uint64_t* y = new uint64_t;

        runs = 0;

        *y = (b[0] / 2) + (b[1] / 2) + ((b[0] % 2 + b[1] % 2) / 2); 

        delete[] b;

        return y;
    }

    for (int i = 0; i < ar_size; i += 2)
    {
        c[j] = (b[i + 0] / 2) + (b[i + 1] / 2) + ((b[i + 0] % 2 + b[i + 1] % 2) / 2);

        ++j;
    }

    delete[] b;

    ++runs;

    return array_average(c, ar_size / 2);

I wouldn't. I'd fix the memory leak it has before trying to rewrite it in a different form, and eliminate the potential undefined behaviour for the caller that comes by returning the address of an automatic variable. In any event, before asking someone else to convert it, have a go yourself. You'll learn more of use that way. And folks here can help if you get in trouble with a more specific concern. — Peter, Nov 07 '18 at 10:48
I have it working perfectly fine, and ++j is used the increment the b array's position. @molbdnilo — , Nov 07 '18 at 11:09
You've got a tail recursive algorithm. The normal way to covert that into a loop is to recompute the "parameters" each time round the loop. You can start with `while(ar_size > 2)` around your existing code. — Caleth, Nov 07 '18 at 11:11
You did not understand molbdnilo. You have a line `return &j;` which returns a pointer to the variable j. That variable lives in the scope of the function and is removed as soon as the function ends, that is with the return. Thus, you return a pointer to a now invalid address. This is undefined behaviour. As per rule with undefined behaviour, it can do anything, including the thing you want the code to do, which is why your tests might still work. — Aziuth, Nov 07 '18 at 11:16
@user2430974 - `b` is not actually an array. It is a pointer that contains the result of a `new` expression, so it can behave like an array while not actually being an array. In any event, the `new` expression not being matched with a corresponding `delete` expression does cause a leak. Your code seeming to work is just luck - undefined behaviour can seem to behave "correctly" (by whatever criteria you defined) and memory leaks don't necessarily show up in testing. — Peter, Nov 07 '18 at 13:02
@Peter I know you can use `delete[] b;` in the above code to cleanup but placing it anywhere either does not work as desired or causes an access violation. So I ask where (if you could) place the delete operator? — , Nov 07 '18 at 23:57
I see that if you return a pointer like this, you can't delete it before you `return` it and you definitely can't after the `return` statement. — , Nov 08 '18 at 00:06
The `return` statement needs to be divided up. Save the value returned by the recursive call to a variable. Release `b`. Then return the saved value. That won't fix your problems with `j` - although you've edited so `j` is a pointer initialised with a new expression, the function sometimes returns `j` (memory leak) and sometimes deletes it. Your overall problem is that you're resorting to guesswork and prayer to get your code working. You need to be able to describe systematically what it should do, and ensure it does it. — Peter, Nov 08 '18 at 09:47
In respect @Peter returning `j` though. What if you delete it in an outside function? IE my main loop? — , Nov 08 '18 at 11:30
And part of my learning style involves a lot of trial and error and than working back on how it works. — , Nov 08 '18 at 11:34
@user2430974 - Consider what happens if the (programmer writing) the caller forgets to delete the returned pointer, or deletes it incorrectly. No matter what they claim, if programmers use a function that relies on them doing some cleanup, a large proportion of them will do that cleanup incorrectly or not at all. It is better to avoid such problems in the first place. Also, trial and error is always a part of software development, but relying on it to learn will only get you so far with developing as a competent programmer. — Peter, Nov 08 '18 at 11:51
Also in the rest of the code that drives this function relies on the result and I do not wish the delete the original pointer `a` because like `j` I'm not done working with it. — , Nov 08 '18 at 12:17
You're trying to calculate an average i.e. return a single value. Consider whether you need to return a pointer at all, let alone a pointer to something that is dynamically allocated. — Peter, Nov 08 '18 at 14:23

Aziuth · Answer 1 · 2018-11-07T11:51:35.027

3

First of all, be aware that your average is not the actual average, as you do throw away one halfs. The result of your algorithm on an array that alternates between 0 and 1 would be 0, as 0/2 + 1/2 + (0%2 + 1%2)/2 = 0. Wanted to start with that, because that is a serious weakness of your algorithm.

Also note that if the original size is not a power of 2, some data will get a higher weight.

Aside from that, consider this algorithm: Copy the data. Until the data has only one entry left, put the average of cells 0 and 1 in cell 0, that of 2 and 3 in cell 1, 4 and 5 in 2 and so on. Shrink the data after each such step.

As code:

uint64_t average(std::vector<uint64_t> data)
{
    while(data.size() != 1)
    {
        for(size_t i=0; i<data.size()/2; i++)
        {
            data[i] = data[2*i]/2 + data[2*i+1]/2 + /* modular stuff */;
        }
        data.resize(data.size()/2 + data.size()%2); //last part is required if the size is not an even number
    }
    return data[0];
}

Using a proper container here also gets rid of your memory leak, by the way.

Note that this code still has the weakness I talked about. You could extent it by collecting the halves, that is if your modular part is 1, you increase a variable, and when the variable is at two, you add a one in some cell.

Edit: If the input HAS to be a raw array (because you receive it from some external source, for example), use this:

uint64_t average(uint64_t* array, const int array_size)
{
    std::vector<uint64_t> data(array, array + array_size);

    (rest of the code is identical)

Edit: code above with collecting halves:

inline uint64_t average(const uint64_t& a, const uint64_t& b, uint8_t& left_halves)
{
    uint64_t value = a/2 + b/2 + (a%2 + b%2)/2;
    if((a%2 + b%2)%2 == 1)
    {
        left_halves += 1;
    }
    if(left_halves == 2)
    {
        value += 1;
        left_halves = 0;
    }
    return value;
}

uint64_t average(std::vector<uint64_t> data)
{
    if(data.size() == 0) return 0;

    uint8_t left_halves = 0;
    while(data.size() != 1)
    {
        for(size_t i=0; i<data.size()/2; i++)
        {
            data[i] = average(data[2*i], data[2*i+1], left_halves);
        }
        data.resize(data.size()/2 + data.size()%2); //last part is required if the size is not an even number
    }
    return data[0];
}

Still has the weakness of increased cell weight if size is not a power of two.

edited Nov 07 '18 at 11:51

answered Nov 07 '18 at 11:06

Aziuth

3,652
3
18
36

What if I wanted to not use vectors in this specific situation? – Nov 07 '18 at 11:15
Then you replace it by a raw array and store it's size in some integer. Note that you then need to `delete` it to avoid a memory leak. However, `std::vector` does only use up marginally more memory than a raw array. In almost any situation, you'd want to use a container. If you get the raw array from some part that you do not own, then create a vector out of the input where my code copies the data (`data = input_data`). – Aziuth Nov 07 '18 at 11:18
Edited my answer to address that. – Aziuth Nov 07 '18 at 11:23
1

and "marginally more" here means the difference between 3 pointers and 1 pointer and an int. Also you `delete[]` things that you `new[]`ed, `delete`ing an array is UB. – Caleth Nov 07 '18 at 11:25
I'd take `input_data` by value, rather than copy in the body. – Caleth Nov 07 '18 at 11:28
afaik in `x = 2^y` the potence is `y` while I think you wanted to refer to `x` which is a power of 2 – 463035818_is_not_an_ai Nov 07 '18 at 11:28
@Caleth: Well, depends on the usage. Thing is, I would not automatically assume that a function changes an input parameter, at least if the input parameter is not obviously an implicit return value. Might cause problems if the user does not realize that. But of course, doing it like you say would make it a bit faster and more memory efficient. – Aziuth Nov 07 '18 at 11:32
@user463035818 Right, changed that. – Aziuth Nov 07 '18 at 11:35
No change is visible to the caller when passing by value. The parameter ceases to exist during the return. If it were copied from something, what it copied is unaffected – Caleth Nov 07 '18 at 11:40
@Caleth Ah yeah, had a blockade in my mind, edited. I usually use references since many algorithms do not require a copy, and therefore I automatically used one, thought at first you'd mean it with it staying a reference. – Aziuth Nov 07 '18 at 11:47

Jarod42 · Answer 2 · 2018-11-07T11:27:07.533

You might use:

constexpr bool is_power_of_2(uint64_t n)
{
    return n && !(n & (n - 1));
}

uint64_t array_average(std::vector<uint64_t> v)
{
    if (!is_power_of_2(v.size())) {
        throw std::runtime_error("invalid size");
    }
    uint64_t remainder = 0;
    while (v.size() != 1) {
        for (int i = 0; i != v.size(); i += 2) {
            remainder += (a[i] % 2 + a[i + 1] % 2);
            b[i / 2] = a[i] / 2 + a[i + 1] / 2;
            if (remainder >= 2 && b[i / 2] < -(remainder / 2)) {
                b[i / 2] += remainder / 2;
                remainder %= 2;
            }
        }
        v.resize(v.size() / 2);
    }
    return v[0] + remainder / 2;
}

Francis Cugler · Answer 3 · 2018-11-07T15:20:50.240

There really shouldn't be that much to convert as there are containers, functions and algorithms in the stl that already exist that will do this for you. With out any function examine this short program:

#include <vector>
#include <numeric>
#include <iostream>
#include <exception>

int main() {
    try {

        std::vector<uint64_t> values{ 1,2,3,4,5,6,7,8,9,10,11,12 };
        int total = std::accumulate( values.begin(), values.end(), 0 );
        uint64_t average = static_cast<uint64_t>( total ) / values.size();
        std::cout << average << '\n';

    } catch( const std::runtime_error& e ) {
        std::cerr << e.what() << '\n';
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

On my machine windows 7 ultimate 64bit running visual studio 2017 CE compiled with language version set to most recent c++17 or greater. This does give me a compiler warning! Warning: C4244 generated due to conversion and possible loss of data. However there are no compiler errors and it does run and give the expected result. The output here is 6 as expected since integer division is truncated. If I change these lines of code above to this:

double total = std::accumulate( values.begin(), values.end(),
                            static_cast<double>( 0 ) );
double average = total / values.size();

It fixes the compiler warnings above by adding the static_cast and it sure enough prints out 6.5 which is the actual value.

This is all fine and good since the vector is already initialized with values; however, this may not be always the case so let's move this into a function that will take an arbitrary array. It would look something like this:

uint64_t array_average( std::vector<uint64_t>& values ) {
    // Prevent Division by 0 and early return 
    // as to not call `std::accumulate`
    if ( !values.empty() ) {
        // check if only 1 entry if so just return it
        if ( values.size() == 1 ) {
            return values[0];
        } else { // otherwise do the calculation.
            return std::accumulate( values.begin(), values.end(),
                                    static_cast<uint64_t>( 0 ) ) / values.size();
        } 
    } 
    // Empty Container 
    throw std::runtime_error( "Can not take average of an empty container" );
}

This function is nice and all, we can do better by improving this by making it a little more generic that will work with any arithmetic type!

template<typename T>
T array_average( std::vector<T>& values ) {
    if( std::is_arithmetic<T>::value ) {
        if( !values.empty() ) {
            if( values.size() == 1 ) {
                return values[0];
            } else { 
                return std::accumulate( values.begin(), values.end(), static_cast<T>( 0 ) ) / values.size();
            }
        } else {
            throw std::runtime_error( "Can not take average of an empty container" ); 
        }
    } else {
        throw std::runtime_error( "T is not of an arithmetic type" );
    }
}

At first glance this looks okay. This will compile and run if you use this with types that are arithmetic. However, if we use it with a type that isn't this will fail to compile. For example:

#include <vector>
#include <numeric>
#include <iostream>
#include <exception>
#include <type_traits>

class Fruit {
protected:
     std::string name_;
public:
    std::string operator()() const {
        return name_;
    }
    std::string name() const { return name_; }

    Fruit operator+( const Fruit& other ) {
        this->name_ += " " + other.name();
        return *this;
    }
};

class Apple : public Fruit {
public:
    Apple() { this->name_ = "Apple"; }

};

class Banana : public Fruit {
public:
    Banana() { this->name_ = "Banana"; }
};

class Pear : public Fruit {
public:
    Pear() { this->name_ = "Pear"; }
};

std::ostream& operator<<( std::ostream& os, const Fruit& fruit ) {
    os << fruit.name() << " ";
    return os;
}

template<typename T>
T array_average( std::vector<T>& values ); // Using the definition above

int main() {
    try {
        std::vector<uint64_t> values { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
        std::vector<double> values2 { 2.0, 3.5, 4.5, 6.7, 8.9 };
        std::vector<Fruit> fruits { Apple(), Banana(), Pear() };

        std::cout << array_average( values ) << '\n';  // compiles runs and prints 6
        std::cout << array_average( values2 ) << '\n'; // compiles runs and prints 5.12
        std::cout << array_average( fruits ) << '\n'; // fails to compile.

    } catch( const std::runtime_error& e ) {
        std::cerr << e.what() << '\n';
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

This fails to compile because the static_cast can not convert int to T with T = Fruit MSVC compiler error C2440

We can fix this by changing a single line of code in our function template if your compiler supports it:

We can change if( std::is_arithmetic<T>::value ) to if constexpr( std::is_arithmetic<T>::value ) and our function will now look like this:

template<typename T>
T array_average( const std::vector<T>& values ) {
    if constexpr( std::is_arithmetic<T>::value ) {
        if( !values.empty() ) {
            if( values.size() == 1 ) {
                return values[0];
            } else {
                return std::accumulate( values.begin(), values.end(), static_cast<T>( 0 ) ) / values.size();
            }
        } else {
            throw std::runtime_error( "Can not take average of an empty container" );
        }
    } else {
        throw std::runtime_error( "T is not of an arithmetic type" );
    }
}

You can run the same program above and it will fully compile even when you are using types that are not arithmetic.

int main() {
    //....
    std::cout << array_average( fruits ) << '\n'; // Now compiles
    //...
}

However when you run this code it will generate a Runtime Error and depending on how your IDE and debugger is setup you may need to put a break point within the catch statement where the return EXIT_FAILURE is to see the message printed to the screen, otherwise the application may just exit without any notification at all.

If you don't want runtime errors you can substitute and produce compiler time errors by using static_assert instead of throwing a runtime error. This can be a handy little function, but it isn't 100% without some minor limitations and gotchas, but to find out more information about this function you can check the Question that I had asked when I was writing the implementation to this function that can be found here and you can read the comments there that will give you more insight to some of the limitations that this function provides.

One of the current limitations with this function would be this: let's say we have a container that has a bunch of complex numbers (3i + 2), (4i - 6), (7i + 3) well you can still take the average of these as it is a valid thing, but the above function will not consider this to be arithmetic in it's current state.

To resolve this issue what can be done is this: instead of using std::is_arithmetic<t> you could write your own policy and traits that this function should accept. I'll leave that part as an exercise for you.

As you can see a majority of the work is already being done for us with the standard library. We used accumulate and divided by the containers size and we were done, the rest of the time involved was making sure it accepts proper types, if it's to be thread safe and or exception safe etc.

Finally we did not have to worry about cumbersome for loops on arrays and making sure the loops didn't exceed the size of the array. We did not have to call new and worry about when and where to call delete in order to not have any memory leaks. ASFAIK I do not think that std::accumulate will overflow on supporting containers, but don't quote me on this one. It may depend on the types that are in the container and that there is a static_cast involved. Even with some of these caveats in many cases it is still better to use containers than managing your own raw memory, and to use the algorithms and functions that are designed to work on them. They make things a lot simpler and easier to manage and even debug.

I think you missed the point of his problem. Your first block will surely not work if you have an array that stores the maximum of uint64_t one thousand times, while the idea of his algorithm would. Your second block which uses a double will obviously cause a loss of information in such a case, for the fact that a double can't hold more information than a uint64_t. I strongly assume that the guy works with very large numbers that come close to the maximum of uint64_t, and/or with very large arrays, and I also assume that the result is desired to be as precise as possible. — Aziuth, Nov 07 '18 at 22:45
You would be correct in assuming @Aziuth. With what I'm working with the arrays are ~1GB (134217728 items at 64bits) in size and even the smallest value would cause an overflow quite quickly. — , Nov 08 '18 at 00:19
It appears that the question was reworded since the time of me writing my answer and now that I have read it again with the current updates it is a little more clear of what is being asked. At first it appeared that the OP wanted to convert the algorithm or function into something simpler. Now it makes sense that you are `expecting overflow` and to that my answer is now irrelevant and what the OP is looking for is something on the lines of a huge integer library function or implementation with efficiency at its core design. — Francis Cugler, Nov 08 '18 at 01:19

How would I go about optimizing this code?

3 Answers3