Create a new variable which averages values in different ranges

Question

I have a table of data and I want to average the variable `f` over each run of rows: a run starts where `counter` is 1 and ends just before `counter` becomes 1 again.

This is what the start of the dataframe looks like:

f      counter
49.798  1
49.797  2
49.793  3
49.792  4
49.794  5
50.203  1
50.201  1
50.201  2
50.202  1
50.205  2
50.206  3
50.209  4
50.21   5
50.212  6
50.21   7
50.211  8
50.211  9
50.211  10
50.212  11
50.21   12
50.206  13
50.205  14
50.206  15
50.201  16

The output should be like this:

Average
49.7948
50.203
50.201
50.2079375

I have no idea how to go about doing this.

I tried the following just to sum the values, but it doesn't work:

def sum_f(x):
    global total 
    if counter  == 1:
        total == f
        return int(total)
        if counter == 1:
            total == f 
            return int(total)
        else:
            total =+ f
            return int(total)

Initialize a variable named total, then loop over the rows and add f to total. When counter is 1, compute the average and save the result, then reset total and continue until you reach the end. Try it. — jose_bacoy, Apr 29 '19 at 13:13
^ this, if you want to know how to iterate over dataframe rows, you can look here: https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas — mrapacz, Apr 29 '19 at 13:14
Please make an attempt to solve it yourself using the hints above and update the question with your work if you run into problems. It only requires for-loops and if-else statements. Google is your friend. — Adarsh Chavakula, Apr 29 '19 at 13:41
You have 4 rows whose counter is 1. How do you expect 5 average values? There is one too many. — Valentino, Apr 29 '19 at 14:11
Your function does not work because you never use `x`, which I suppose should be the dataframe. By the way, are you using `pandas` or another dataframe library? I can't tell whether `f` and `counter` are columns of a real dataframe or just lists. In your function you seem to treat them as simple variables. — Valentino, Apr 29 '19 at 14:22
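
The loop suggested in the first comment can be sketched in plain Python roughly as follows. This is only an illustration: the function name `run_averages` is made up here, and it assumes `f` and `counter` are available as two equal-length sequences (for a pandas dataframe, `df['f']` and `df['counter']` would do).

```python
def run_averages(values, counters):
    """Average `values` over runs; a new run starts wherever the counter is 1."""
    averages = []
    total = 0.0  # running sum of the current run
    count = 0    # number of rows in the current run
    for value, counter in zip(values, counters):
        # A counter of 1 closes the previous run (if there is one).
        if counter == 1 and count > 0:
            averages.append(total / count)
            total, count = 0.0, 0
        total += value
        count += 1
    # Flush the last run, which is not followed by another 1.
    if count > 0:
        averages.append(total / count)
    return averages


# First eight rows of the sample data from the question.
f = [49.798, 49.797, 49.793, 49.792, 49.794, 50.203, 50.201, 50.201]
counter = [1, 2, 3, 4, 5, 1, 1, 2]
result = run_averages(f, counter)
print(result)
```

The running count is kept alongside the total because the runs have different lengths, so the divisor cannot be a fixed number.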

score 3 · Accepted Answer · answered Apr 29 '19 at 14:36

Here we create a new `run` column that increments each time `counter` equals 1, so all rows belonging to the same run share one label. Then we group by that column and take the mean of the `f` values:

df['run'] = (df.counter == 1).cumsum()

df.groupby('run').f.mean()

results in

run
1    49.794800
2    50.203000
3    50.201000
4    50.207938
Name: f, dtype: float64
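
For anyone who wants to reproduce this end to end, here is a self-contained sketch. It assumes pandas is installed and rebuilds the dataframe from the sample rows in the question. Unlike the snippet above, it passes the run labels to `groupby` directly instead of storing them as a column, which is only a variation for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "f": [49.798, 49.797, 49.793, 49.792, 49.794, 50.203, 50.201, 50.201,
          50.202, 50.205, 50.206, 50.209, 50.21, 50.212, 50.21, 50.211,
          50.211, 50.211, 50.212, 50.21, 50.206, 50.205, 50.206, 50.201],
    "counter": [1, 2, 3, 4, 5, 1, 1, 2,
                1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
})

# True wherever a new run starts; the cumulative sum turns that into a run label.
run = df["counter"].eq(1).cumsum().rename("run")

# One mean per run label.
averages = df.groupby(run)["f"].mean()
print(averages)
```

If the average is needed on every row rather than once per run, `df.groupby(run)["f"].transform("mean")` returns a Series aligned with the original rows.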
