There is a very similar question that I will build on: How to create list from 100 elements to list of 10

In my case, the list looks like this:

a = [[[100], ['Wed Sep 25 13:01:51 2019']], 
[[200], ['Wed Sep 25 13:01:51 2019']],
[[300], ['Wed Sep 25 13:01:52 2019']],
[[400], ['Wed Sep 25 13:01:52 2019']], 
[[500], ['Wed Sep 25 13:01:53 2019']], 
[[600], ['Wed Sep 25 13:01:53 2019']], 
[[700], ['Wed Sep 25 13:01:54 2019']],
[[800], ['Wed Sep 25 13:01:54 2019']]]

As you can see, a sample is taken every half second. I have a problem with properly 'equalizing' such a list. I would like to build another list that holds the mean of the values for each second, with the timestamp taken from the first or last row of that second.

I tried to adapt the converter from the linked Stack Overflow question, hoping that running this code:

A = [[[100], ['Wed Sep 25 13:01:51 2019']],
[[200], ['Wed Sep 25 13:01:51 2019']],
[[300], ['Wed Sep 25 13:01:52 2019']],
[[400], ['Wed Sep 25 13:01:52 2019']],
[[500], ['Wed Sep 25 13:01:53 2019']],
[[600], ['Wed Sep 25 13:01:53 2019']],
[[700], ['Wed Sep 25 13:01:54 2019']],
[[800], ['Wed Sep 25 13:01:54 2019']]]

B = [sum(A[i:i+2][0][0])/2 for i in range(0, len(A), 2)]
print(B)

would give the desired result:

b = [[[150], ['Wed Sep 25 13:01:51 2019']], 
    [[350], ['Wed Sep 25 13:01:52 2019']],
    [[550], ['Wed Sep 25 13:01:53 2019']],
    [[750], ['Wed Sep 25 13:01:54 2019']]]

Instead, it gives:

[50.0, 150.0, 250.0, 350.0]

which is firstly wrong, and secondly lacks the timestamps. What should I modify?

3 Answers

Here's one approach using itertools.groupby, which has the nice advantage of working for any number of samples per second:

from itertools import groupby
from statistics import mean

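b = []
# Group key: the HH:MM:SS part of the timestamp string
f = lambda i: i[1][0].rsplit(maxsplit=2)[-2]
for k, v in groupby(a, key=f):
    # z[0] holds the value lists, z[1] the timestamp lists of this group
    z = list(zip(*v))
    mean_ = mean(i[0] for i in z[0])
    # Use the first timestamp so that single-sample groups also work
    b.append([[mean_], z[1][0]])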

print(b)

[[[150], ['Wed Sep 25 13:01:51 2019']],
 [[350], ['Wed Sep 25 13:01:52 2019']],
 [[550], ['Wed Sep 25 13:01:53 2019']],
 [[750], ['Wed Sep 25 13:01:54 2019']]]
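
As a further illustration of the "any number of samples per second" point, here is a minimal sketch that wraps the same groupby idea in a function and runs it on uneven data. The function name average_per_timestamp and the sample list uneven are made up for this example, and it assumes the rows are already in time order, since groupby only merges adjacent rows. It groups on the full timestamp string rather than on the time part alone.

```python
from itertools import groupby
from statistics import mean


def average_per_timestamp(samples):
    """Average consecutive rows that share the same timestamp string."""
    result = []
    # groupby only merges adjacent rows, so the input must be in time order.
    for stamp, rows in groupby(samples, key=lambda row: row[1][0]):
        values = [row[0][0] for row in rows]
        result.append([[mean(values)], [stamp]])
    return result


# Hypothetical data: three samples in the first second, one in the next.
uneven = [
    [[100], ['Wed Sep 25 13:01:51 2019']],
    [[200], ['Wed Sep 25 13:01:51 2019']],
    [[300], ['Wed Sep 25 13:01:51 2019']],
    [[400], ['Wed Sep 25 13:01:52 2019']],
]

print(average_per_timestamp(uneven))
```

Each group is averaged over however many rows it contains, so nothing depends on there being exactly two samples per second.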

That said, the recommended way to work with time series data is to use pandas rather than lists:

from itertools import chain
import pandas as pd

# Flatten each row from [[value], [timestamp]] to [value, timestamp]
l = [[*chain.from_iterable(i)] for i in a]
pd.DataFrame(l).groupby(1)[0].mean()

1
Wed Sep 25 13:01:51 2019    150
Wed Sep 25 13:01:52 2019    350
Wed Sep 25 13:01:53 2019    550
Wed Sep 25 13:01:54 2019    750
Name: 0, dtype: int64
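
If pandas is not available, the timestamps can still be treated as real points in time using only the standard library. The sketch below is an assumption-laden illustration, not part of the approach above: the names FORMAT, average_by_second and shuffled are invented here, the format string is inferred from the sample data, and %a and %b expect English day and month names. Because it collects values in a dict keyed by the parsed datetime, the input does not have to be sorted.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Assumed layout of the timestamp strings, inferred from the sample data.
FORMAT = '%a %b %d %H:%M:%S %Y'


def average_by_second(samples):
    """Return (datetime, mean) pairs, one per distinct timestamp, in time order."""
    buckets = defaultdict(list)
    for (value,), (stamp,) in samples:
        buckets[datetime.strptime(stamp, FORMAT)].append(value)
    return [(when, mean(values)) for when, values in sorted(buckets.items())]


# Hypothetical out-of-order input.
shuffled = [
    [[300], ['Wed Sep 25 13:01:52 2019']],
    [[100], ['Wed Sep 25 13:01:51 2019']],
    [[400], ['Wed Sep 25 13:01:52 2019']],
    [[200], ['Wed Sep 25 13:01:51 2019']],
]

for when, avg in average_by_second(shuffled):
    print(when, avg)
```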
yatu

Going with your approach, a few changes are needed:

  • Your sum currently sums a single object: A[i:i+2][0][0] picks only the value list of the first row in the slice. We want to sum the number from each row instead: sum(a[0][0] for a in A[i:i+2]).

  • You build the list from those calculations alone, without the matching timestamp, so pair each mean with A[i][1]. All in all:

A = [[[100], ['Wed Sep 25 13:01:51 2019']],
[[200], ['Wed Sep 25 13:01:51 2019']],
[[300], ['Wed Sep 25 13:01:52 2019']],
[[400], ['Wed Sep 25 13:01:52 2019']],
[[500], ['Wed Sep 25 13:01:53 2019']],
[[600], ['Wed Sep 25 13:01:53 2019']],
[[700], ['Wed Sep 25 13:01:54 2019']],
[[800], ['Wed Sep 25 13:01:54 2019']]]

B = [[[sum(a[0][0] for a in A[i:i+2])/2], A[i][1]] for i in range(0, len(A), 2)]
print(B)

Gives:

[[[150.0], ['Wed Sep 25 13:01:51 2019']], 
 [[350.0], ['Wed Sep 25 13:01:52 2019']], 
 [[550.0], ['Wed Sep 25 13:01:53 2019']], 
 [[750.0], ['Wed Sep 25 13:01:54 2019']]]
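
The comprehension above divides by a fixed 2, so a trailing unpaired row would be halved. A hedged variation, assuming you may end up with an odd number of rows or a different chunk size, divides by the actual chunk length instead. The function name average_in_chunks, the size parameter and the list odd are made up for this sketch.

```python
def average_in_chunks(samples, size=2):
    """Average every `size` consecutive rows, keeping the first row's timestamp."""
    result = []
    for i in range(0, len(samples), size):
        chunk = samples[i:i + size]
        # Divide by the real chunk length so a short final chunk is not skewed.
        avg = sum(row[0][0] for row in chunk) / len(chunk)
        result.append([[avg], chunk[0][1]])
    return result


# Hypothetical input with an odd number of rows.
odd = [
    [[100], ['Wed Sep 25 13:01:51 2019']],
    [[200], ['Wed Sep 25 13:01:51 2019']],
    [[300], ['Wed Sep 25 13:01:52 2019']],
]

print(average_in_chunks(odd))
```

Like the original, this still relies on position rather than on the timestamps, so it only fits data with a steady sampling rate.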
Tomerikoo

Try the code below:

b = [[sum([A[i][0][0],A[i+1][0][0]])/2,A[i][1]] for i in range(0,len(A),2)]
print(b)

Output:
[[150.0, ['Wed Sep 25 13:01:51 2019']], [350.0, ['Wed Sep 25 13:01:52 2019']], [550.0, ['Wed Sep 25 13:01:53 2019']], [750.0, ['Wed Sep 25 13:01:54 2019']]]

Hope this helps!

Bharat Gera