0

So I have some complicated analysis to do. I have this data in a df:

    Date/Time   TimeStamp   CallerId    CallType    watts   Band    slot    Channel
0   20:02.0 3113677432  17794800    C1  0.060303    12  1   2
1   20:02.0 3113677432  5520488 OP8 0.302229    12  1   1
2   20:02.0 3113677432  5520488 OP8 0.302229    13  1   1
3   20:02.0 3113677432  5520488 OP8 0.302229    12  2   1
4   20:02.0 3113677432  5520488 OP8 0.302229    13  2   1
5   20:02.0 3113677432  5520488 OP8 0.302229    12  3   1
6   20:02.0 3113677432  5520488 OP8 0.302229    13  3   1
7   20:02.0 3113677432  5520488 OP8 0.302229    12  4   1
8   20:02.0 3113677432  5520488 OP8 0.302229    13  4   1
9   20:07.0 3113677488  17794800    C1  0.151473    12  1   2
10  20:07.0 3113677488  5218651 CC8kds  0.475604    13  4   1
11  20:07.0 3113677488  5514318 BD  1.906933    12  1   6
12  20:11.0 3113677532  17794800    C1  0.038048    12  1   2
13  20:11.0 3113677532  5218651 CC8kds  0.300086    13  4   1
14  20:11.0 3113677532  5501460 PTN3    4.790000    12  1   5
15  21:51.0 3113678643  9895585 CC8kds  0.075378    12  1   1
16  21:51.0 3113678643  5482185 OP8 0.302229    13  1   1
17  21:51.0 3113678643  5482185 OP8 0.302229    13  2   1
18  21:51.0 3113678643  5482185 OP8 0.302229    13  3   1
19  21:51.0 3113678643  5482185 OP8 0.302229    13  4   1
20  21:51.0 3113678643  5513470 PTN3    4.790000    12  3   1
21  21:51.0 3113678643  5518399 PTN3    4.790000    12  3   5

The TimeStamp repeats over many of the items since each row captures a Band (10-20) and a Channel in that band (1-7). The slot further divides the channel into (1-4) time slots. I use a pivot_table to group the watts (power) in each (Band, Channel, slot) as follows:

df = df.pivot_table(index='TimeStamp', columns=['Band','Channel','slot'], values='watts', 
                                       aggfunc=sum, fill_value=0) 

enter image description here What I'd like to do is create a new series, freq = 1200+(Band-1)+Channel/12 and then sum the power in all 4 slots ... for example:

Band 12, Channel 1 @ TimeStamp 3113677432 results in a total pwr of 0.302229+0.302229+0.302229+0.302229 = 1.208916 for a freq value of 1200+(12-1)+1/12=1211.083 ... so I'd have something like this:

TimeStamp   Freq    Pwr ...
3113677432  1211.083    1.208916 ... # This is the total for Band 12, Channel 1 @ this timestamp
3113677432  1211.640    2.208916 ... # this isn't a real total

I need this for each freq computed at each TimeStamp. It would be nice to have those Freq as their own (multilevel) column headers (I'm making up numbers here):

Freq        1211.083    1211.64     1212.04...
TimeStamp 
3113677432  1.208916    2.208916    2.208916...
3113677488  2.058406    0.475604    2.208916...

I can group on one col header like this:

df.groupby(level=0, axis=1).sum() # Groups by Band

or

df.groupby(level=1, axis=1).sum() # Groups by Channel - but across ALL Bands - WRONG

So is there an straightforward way to group by both Band and Channel and sum over all 4 slots?

earnric
  • 410
  • 4
  • 15

1 Answers1

1

There's no need to pivot the table, just do a groupby on multiple columns and you're good to go!

# loading data using solution from https://stackoverflow.com/a/53692642/8718701
from io import StringIO
import pandas as pd


d = '''
    Date/Time   TimeStamp   CallerId    CallType    watts   Band    slot    Channel
0   20:02.0 3113677432  17794800    C1  0.060303    12  1   2
1   20:02.0 3113677432  5520488 OP8 0.302229    12  1   1
2   20:02.0 3113677432  5520488 OP8 0.302229    13  1   1
3   20:02.0 3113677432  5520488 OP8 0.302229    12  2   1
4   20:02.0 3113677432  5520488 OP8 0.302229    13  2   1
5   20:02.0 3113677432  5520488 OP8 0.302229    12  3   1
6   20:02.0 3113677432  5520488 OP8 0.302229    13  3   1
7   20:02.0 3113677432  5520488 OP8 0.302229    12  4   1
8   20:02.0 3113677432  5520488 OP8 0.302229    13  4   1
9   20:07.0 3113677488  17794800    C1  0.151473    12  1   2
10  20:07.0 3113677488  5218651 CC8kds  0.475604    13  4   1
11  20:07.0 3113677488  5514318 BD  1.906933    12  1   6
12  20:11.0 3113677532  17794800    C1  0.038048    12  1   2
13  20:11.0 3113677532  5218651 CC8kds  0.300086    13  4   1
14  20:11.0 3113677532  5501460 PTN3    4.790000    12  1   5
15  21:51.0 3113678643  9895585 CC8kds  0.075378    12  1   1
16  21:51.0 3113678643  5482185 OP8 0.302229    13  1   1
17  21:51.0 3113678643  5482185 OP8 0.302229    13  2   1
18  21:51.0 3113678643  5482185 OP8 0.302229    13  3   1
19  21:51.0 3113678643  5482185 OP8 0.302229    13  4   1
20  21:51.0 3113678643  5513470 PTN3    4.790000    12  3   1
21  21:51.0 3113678643  5518399 PTN3    4.790000    12  3   5
'''

df = pd.read_csv(StringIO(d), sep='\s+')

# calculating total pwr -> sum(watts)


agg_df = (
    df
    .groupby(['TimeStamp', 'Band', 'Channel'], as_index=False)
    .agg({'watts': 'sum'})
)

# freq = 1200+(Band-1)+Channel/12
agg_df['freq'] = (
    1200
    + agg_df['Band']-1
    + agg_df['Channel']/12
)

print(agg_df.to_markdown(index=False, floatfmt=''))

Result:

|    TimeStamp |   Band |   Channel |    watts |               freq |
|-------------:|-------:|----------:|---------:|-------------------:|
| 3113677432.0 |   12.0 |       1.0 | 1.208916 | 1211.0833333333333 |
| 3113677432.0 |   12.0 |       2.0 | 0.060303 | 1211.1666666666667 |
| 3113677432.0 |   13.0 |       1.0 | 1.208916 | 1212.0833333333333 |
| 3113677488.0 |   12.0 |       2.0 | 0.151473 | 1211.1666666666667 |
| 3113677488.0 |   12.0 |       6.0 | 1.906933 | 1211.5             |
| 3113677488.0 |   13.0 |       1.0 | 0.475604 | 1212.0833333333333 |
| 3113677532.0 |   12.0 |       2.0 | 0.038048 | 1211.1666666666667 |
| 3113677532.0 |   12.0 |       5.0 | 4.79     | 1211.4166666666667 |
| 3113677532.0 |   13.0 |       1.0 | 0.300086 | 1212.0833333333333 |
| 3113678643.0 |   12.0 |       1.0 | 4.865378 | 1211.0833333333333 |
| 3113678643.0 |   12.0 |       5.0 | 4.79     | 1211.4166666666667 |
| 3113678643.0 |   13.0 |       1.0 | 1.208916 | 1212.0833333333333 |
Simon David
  • 663
  • 3
  • 13
  • Beautiful! Very clean solution! THanks... I can then do a pivot_table to puts watts under the various freq cols! – earnric May 12 '23 at 17:41