
I want to find the highest value in each segment. A segment is the set of rows that share the same time stamp; as you can see, the time step is usually five minutes. For each time step I need to find the highest value in the 4th column and keep the whole row it belongs to. So far I have come up with this:

import pandas as pd
import numpy as np

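# Read the first six semicolon-separated columns; the file has no header row.
data = pd.read_csv('/home/20170116.csv', header=None, sep=';',
                   usecols=[0, 1, 2, 3, 4, 5], names=['Time', 'degree', 'f1', 'p1', 'Intensity', 'Distance'])

# Print each 19-row segment in turn.
for i in range(0, len(data), 19):
    print(data.iloc[i:i + 19])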

My data looks like this:

00:00   0   7.44077320746235    0.453540438929378   317900000   67
00:00   10  7.39076196898179    0.487011284672025   341400000   67
00:00   20  7.37075747358957    0.506065836725554   328000000   65
00:00   30  7.34075073050124    0.495374317737197   321000000   65
00:00   40  7.33074848280513    0.473928991378983   379500000   70
00:00   50  7.33074848280513    0.429714866376765   344100000   70
00:00   60  7.34075073050124    0.378940997444553   461400000   77
00:00   70  7.37075747358957    0.330831053566623   402800000   77
00:00   80  7.43077095976624    0.28999520431443    353100000   77
00:00   90  7.50078669363902    0.256630783010184   312400000   77
00:00   -90 7.51078894133513    0.257848411262383   114700000   52
00:00   -80 7.59080692290402    0.226286016578661   92620000    48
00:00   -70 7.71083389525736    0.199411631799538   81620000    48
00:00   -60 7.81085637221848    0.178324045166602   217100000   77
00:00   -50 7.87086985839514    0.17447741754611    212400000   77
00:00   -40 7.8308608676107     0.209620778938056   276100000   78
00:00   -30 7.73083839064958    0.272603273214342   359100000   78
00:00   -20 7.61081141829625    0.341747195487005   361600000   75
00:00   -10 7.51078894133513    0.401902182098869   260500000   65

The sample above shows one segment. The time increases every 5 minutes, so I have 288 segments of 19 rows each. For every segment I need to find the maximum value in the 4th column (p1) and save the whole row, for example to another file.

Hiddenguy

2 Answers


Does this work? It looks up the index of the largest p1 within each Time group and selects those rows:

df.loc[df.groupby('Time')['p1'].idxmax()]

Output:

    Time  degree        f1        p1  Intensity  Distance
2  00:00      20  7.370757  0.506066  328000000        65
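
For what it's worth, here is a minimal self-contained sketch of the same idea that also covers the "save the whole row" part. The sample values and the second time step are made up for illustration (they are not the asker's data), and `csv_text` is just a name chosen here.

```python
import io
import pandas as pd

# Small made-up sample in the question's column layout (illustrative values only).
sample = io.StringIO(
    "00:00;0;7.44;0.45;317900000;67\n"
    "00:00;10;7.39;0.49;341400000;67\n"
    "00:00;20;7.37;0.51;328000000;65\n"
    "00:05;0;7.40;0.30;300000000;60\n"
    "00:05;10;7.41;0.62;310000000;61\n"
    "00:05;20;7.42;0.55;320000000;62\n"
)
names = ['Time', 'degree', 'f1', 'p1', 'Intensity', 'Distance']
df = pd.read_csv(sample, header=None, sep=';', names=names)

# idxmax gives, for each Time group, the index label of the row with the
# largest p1; .loc then pulls those complete rows.
peaks = df.loc[df.groupby('Time')['p1'].idxmax()]

# With no path, to_csv returns the CSV text instead of writing a file.
csv_text = peaks.to_csv(sep=';', index=False, header=False)
print(peaks)
```

Passing a file path as the first argument to `to_csv` should write the selected rows to disk instead of returning them as a string.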
Quang Hoang

Another option is sort_values followed by drop_duplicates:

df.sort_values(['Time', 'p1']).drop_duplicates('Time', keep='last')

    Time  degree        f1        p1  Intensity  Distance
2  00:00      20  7.370757  0.506066  328000000        65
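
As a quick sanity check, the sketch below runs this on a tiny made-up frame (values are illustrative, not the asker's data) and compares it with the `groupby`/`idxmax` approach. Sorting by `Time` and then `p1` puts the largest `p1` of each time step last, so `keep='last'` retains exactly that row.

```python
import pandas as pd

# Made-up sample with two time steps; only the columns needed for the demo.
df = pd.DataFrame({
    'Time':   ['00:00', '00:00', '00:00', '00:05', '00:05', '00:05'],
    'degree': [0, 10, 20, 0, 10, 20],
    'p1':     [0.45, 0.49, 0.51, 0.30, 0.62, 0.55],
})

# Sort ascending by Time, then p1, and keep the last row of each Time.
top = df.sort_values(['Time', 'p1']).drop_duplicates('Time', keep='last')

# Same selection via groupby/idxmax, for comparison.
via_idxmax = df.loc[df.groupby('Time')['p1'].idxmax()]

print(top)
print(top.equals(via_idxmax))
```

One caveat: if two rows in a segment share the same maximum `p1`, the two approaches may not pick the same row, so it is worth deciding which one you want in that case.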
piRSquared