3

I've written a Fortran function that calculates the moving average of a 1D array of numbers in a very straightforward way:

function moving_average(data, w)

    implicit none

    integer, intent(in) :: w
    real(8), intent(in) :: data(:)

    integer :: n, i
    real(8) :: moving_average(size(data)-w+1)

    n = w-1

    do i=1, size(data)-n
        moving_average(i) = mean(data(i:i+n))
    end do

end function

Where the function mean is defined as:

real(8) function mean(data)

    implicit none

    real(8), dimension(:), intent(in) :: data

    mean = sum(data)/size(data)

end function

When running the function moving_average on my laptop with a data set of 100000 numbers and a window width of 1000, it takes 0.1 seconds. However, the function running_mean in this post using numpy takes only 1 ms. Why is my algorithm so much slower?

Tofi
  • 229
  • 1
  • 7

1 Answers1

6

Your algorithm is of the order O(n*m) with n the size of the moving average and m the size of the array.

Every time you compute a point in the array moving_average, you do the following steps:

  • extract a part of the array
  • compute the sum over that slice
  • divide by the constant n

However, moving_average(i) and moving_average(i+1) are related in the following way:

moving_average(i+i) = moving_average(i) + (data(i+n) - data(i-1))/n

When you use this, you can reduce the computational time from O(n*m) to O(m)

kvantour
  • 25,269
  • 4
  • 47
  • 72