Average data over time?

Question

I’d like to calculate the average VAL1, VAL2 and SIGNAL values in my sample data at 5-second intervals. In other words, using the sample data included, I’d like to average the values from the first data point (here 01:45:18) through 01:45:22, then 01:45:23 through 01:45:27, 01:45:28 through 01:45:32, and 01:45:33 through the remainder of the data.

Ideally, I’d like to store the averaged values in variables such as dec_average, ra_average, and n_average.

Any suggestions or ideas on how I could achieve this? Here’s the code I have so far.

import sys
import os
import matplotlib.pyplot as plt
from matplotlib.dates import strpdate2num
import numpy as np
import matplotlib.colors
import matplotlib.cm

sat_id,dec,ra,n = np.loadtxt("mydata.asc", usecols=(3,5,7,9), unpack=True)

Sample data:
Timestamp: 01:45:18 SATID 02 VAL1 36 VAL2 188 SIGNAL 34
Timestamp: 01:45:19 SATID 02 VAL1 36 VAL2 188 SIGNAL 34
Timestamp: 01:45:20 SATID 02 VAL1 36 VAL2 188 SIGNAL 35
Timestamp: 01:45:21 SATID 02 VAL1 36 VAL2 188 SIGNAL 34
Timestamp: 01:45:22 SATID 02 VAL1 36 VAL2 188 SIGNAL 35
Timestamp: 01:45:23 SATID 02 VAL1 36 VAL2 188 SIGNAL 35
Timestamp: 01:45:24 SATID 02 VAL1 36 VAL2 188 SIGNAL 36
Timestamp: 01:45:25 SATID 02 VAL1 36 VAL2 188 SIGNAL 35
Timestamp: 01:45:26 SATID 02 VAL1 36 VAL2 188 SIGNAL 36
Timestamp: 01:45:27 SATID 02 VAL1 37 VAL2 188 SIGNAL 36
Timestamp: 01:45:28 SATID 02 VAL1 37 VAL2 188 SIGNAL 36
Timestamp: 01:45:29 SATID 02 VAL1 37 VAL2 188 SIGNAL 36
Timestamp: 01:45:30 SATID 02 VAL1 38 VAL2 188 SIGNAL 37
Timestamp: 01:45:31 SATID 02 VAL1 38 VAL2 188 SIGNAL 36
Timestamp: 01:45:32 SATID 02 VAL1 39 VAL2 188 SIGNAL 37
Timestamp: 01:45:33 SATID 02 VAL1 39 VAL2 188 SIGNAL 37
Timestamp: 01:45:34 SATID 02 VAL1 39 VAL2 188 SIGNAL 37
Timestamp: 01:45:35 SATID 02 VAL1 39 VAL2 188 SIGNAL 38

what is the format of your data? a dataframe, a matrix? Instead of loading a text file, can you post a minimal reproducing example please? — Colonel Beauvel, Apr 07 '17 at 08:24
Possible duplicate: http://stackoverflow.com/questions/13728392/moving-average-or-running-mean — Daniel F, Apr 07 '17 at 09:17

Ziyad Edher · Answer 1 · 2017-04-07T08:34:25.273

The first step is to capture the important data from each entry (the values of VAL1, VAL2, and SIGNAL), which you have already done.

Then, for every set of five entries, you need the average of each field, which is the sum of the values divided by five. With numpy you can use np.average() and pass in the slice you want to average: the first five elements the first time, the next five the second, and so on. This assumes one entry per second, as in your sample data, so that five entries span five seconds. For dec this can be done as follows.

We will create a list dec_average to store the averages of each set of entries.

dec_average = []

This will run through the dec array, averaging each set of five elements as long as there are at least five left, and appending that average to dec_average.

for i in range(5, len(dec) + 1, 5):
    dec_average.append(np.average(dec[(i - 5):i]))

Once we finish running through that loop, if the number of elements in the array is not a multiple of five, then there will still be some left to average. In order to get the average of those, we need to get the last x items where x is the remainder of the division of the length of the array by five; hence, the modulus.

if (len(dec)) % 5 != 0:
    dec_average.append(np.average(dec[-(len(dec) % 5):]))

Putting these three pieces of code together gives a way to calculate the average of every five items of a list; if fewer than five items are left at the end, the remaining items are averaged as well, and each average is appended to a list. This can be extended to suit your other data entries.
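
The loop and the remainder step described above can be folded into one helper and applied to all three arrays. This is only a minimal sketch: the name block_average and its size parameter are made up for illustration, the lists are typed in from the question's sample data instead of being loaded from mydata.asc, and one entry per second is assumed.

```python
import numpy as np

def block_average(values, size=5):
    """Average consecutive blocks of `size` items (hypothetical helper).

    The final block may be shorter than `size`; slicing past the end of
    the array simply returns whatever items are left.
    """
    values = np.asarray(values, dtype=float)
    return [float(np.mean(values[start:start + size]))
            for start in range(0, len(values), size)]

# VAL1, VAL2 and SIGNAL columns copied from the sample data in the question.
dec = [36, 36, 36, 36, 36, 36, 36, 36, 36, 37, 37, 37, 38, 38, 39, 39, 39, 39]
ra = [188] * 18
n = [34, 34, 35, 34, 35, 35, 36, 35, 36, 36, 36, 36, 37, 36, 37, 37, 37, 38]

dec_average = block_average(dec)
ra_average = block_average(ra)
n_average = block_average(n)  # approximately [34.4, 35.6, 36.4, 37.33]
```

Because the slice values[start:start + size] is allowed to run past the end of the array, the shorter final block needs no separate modulus check here.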

Thank you Wintro... this incredibly detailed & thorough explanation really made things super clear. I really appreciate taking the time to explain it so well & thoroughly — luke, Apr 07 '17 at 17:56
My pleasure! I'd recommend checking out Daniel's answer and upvoting/accepting whichever answer suits you. — Ziyad Edher, Apr 07 '17 at 18:03

score 0 · Answer 2 · edited May 23 '17 at 11:54

From Alleo's answer to the moving-average question linked in the comments above:

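def running_mean(x, N):
    # Prepend 0 so that cumsum[i] is the sum of the first i elements of x.
    cumsum = np.cumsum(np.insert(x, 0, 0))
    # Cumulative sums N apart differ by the sum of one length-N window.
    return (cumsum[N:] - cumsum[:-N]) / N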
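
Note that running_mean returns a sliding average, one value per window position, not one value per 5-second block. As a rough sketch (using the SIGNAL values typed in from the question's sample data and assuming one sample per second), taking every fifth window recovers the non-overlapping block averages; the variable names sliding and blocks are made up for illustration.

```python
import numpy as np

def running_mean(x, N):
    # Prepend 0 so that cumsum[i] is the sum of the first i elements of x.
    cumsum = np.cumsum(np.insert(x, 0, 0))
    # Cumulative sums N apart differ by the sum of one length-N window.
    return (cumsum[N:] - cumsum[:-N]) / N

# SIGNAL column copied from the sample data in the question.
signal = np.array([34, 34, 35, 34, 35, 35, 36, 35, 36, 36,
                   36, 36, 37, 36, 37, 37, 37, 38], dtype=float)

sliding = running_mean(signal, 5)  # 14 overlapping 5-sample windows
blocks = sliding[::5]              # windows starting at samples 0, 5 and 10
# blocks is approximately [34.4, 35.6, 36.4]
```

The three trailing samples never fill a complete window, so this approach gives no average for the final partial block.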

Daniel F · answered Apr 07 '17 at 09:19

When it's a duplicate, please flag it as duplicate instead of copying (even if attributed) an answer. – MSeifert Apr 08 '17 at 01:54
