            ts_code      low     high
2021-08-01  881105.TI   1485.0  1629.0
2021-08-01  885452.TI   2216.0  2391.0
2021-08-01  885525.TI   7427.0  8552.0
2021-08-01  885641.TI   621.0   671.0
2021-08-08  881105.TI   1496.0  1623.0
2021-08-08  885452.TI   2297.0  2406.0
2021-08-08  885525.TI   7300.0  7868.0
2021-08-08  885641.TI   668.0   691.0
2021-08-15  881105.TI   1606.0  1776.0
2021-08-15  885452.TI   2352.0  2459.0
2021-08-15  885525.TI   7525.0  8236.0
2021-08-15  885641.TI   685.0   719.0
2021-08-22  881105.TI   1656.0  1804.0
2021-08-22  885452.TI   2329.0  2415.0
2021-08-22  885525.TI   7400.0  8270.0
2021-08-22  885641.TI   691.0   720.0

The type of index is datetime64[ns].

Goal

  • For each ts_code group, select the rows on or after the date on which the high column reaches its maximum.

Expected

            ts_code      low     high
2021-08-22  881105.TI   1656.0  1804.0
2021-08-15  885452.TI   2352.0  2459.0
2021-08-22  885452.TI   2329.0  2415.0
2021-08-01  885525.TI   7427.0  8552.0
2021-08-08  885525.TI   7300.0  7868.0
2021-08-15  885525.TI   7525.0  8236.0
2021-08-22  885525.TI   7400.0  8270.0
2021-08-22  885641.TI   691.0   720.0

For example, the maximum of high for 881105.TI falls on 2021-08-22, and for 885525.TI it falls on 2021-08-01. The output for each ts_code contains the rows from that group's max date onward.
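The max dates quoted above can be checked with a short sketch. It rebuilds only the ts_code and high columns from the sample table; the variable names (dates, codes, highs, max_dates) are illustrative and not part of the original post.

```python
import pandas as pd

# Sample data copied from the table in the question (high column only).
dates = pd.to_datetime(["2021-08-01", "2021-08-08", "2021-08-15", "2021-08-22"])
codes = ["881105.TI", "885452.TI", "885525.TI", "885641.TI"]
highs = [1629.0, 2391.0, 8552.0, 671.0,
         1623.0, 2406.0, 7868.0, 691.0,
         1776.0, 2459.0, 8236.0, 719.0,
         1804.0, 2415.0, 8270.0, 720.0]
df = pd.DataFrame(
    {"ts_code": codes * len(dates), "high": highs},
    index=dates.repeat(len(codes)),  # each date repeated once per code
)

# Index label (date) of the maximum high within each ts_code group.
max_dates = df.groupby("ts_code")["high"].idxmax()
print(max_dates)
```

Each entry of max_dates is the date from which rows of that group should be kept.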

Try and ref

The Singularity
Jack

1 Answer


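Let us try transform with idxmax:

# keep the rows on or after the date of each group's max high
df1 = df[df.index >= df.groupby('ts_code')['high'].transform('idxmax')]
# keep at most the first two of those rows per group (the max row and the one after it)
out = df1[df1.groupby('ts_code').cumcount() <= 1]
out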
              ts_code     low    high
2021-08-01  885525.TI  7427.0  8552.0
2021-08-08  885525.TI  7300.0  7868.0
2021-08-15  885452.TI  2352.0  2459.0
2021-08-22  881105.TI  1656.0  1804.0
2021-08-22  885452.TI  2329.0  2415.0
2021-08-22  885641.TI   691.0   720.0
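
Note that the cumcount step above keeps at most two rows per group, so 885525.TI shows only two rows here, while the Expected table in the question lists four. As a sketch, assuming every row on or after the max date is wanted, the idxmax filter can be used on its own. The frame below is rebuilt from the question's sample data, and the names max_date, mask and full are illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question.
dates = pd.to_datetime(["2021-08-01", "2021-08-08", "2021-08-15", "2021-08-22"])
codes = ["881105.TI", "885452.TI", "885525.TI", "885641.TI"]
lows = [1485.0, 2216.0, 7427.0, 621.0, 1496.0, 2297.0, 7300.0, 668.0,
        1606.0, 2352.0, 7525.0, 685.0, 1656.0, 2329.0, 7400.0, 691.0]
highs = [1629.0, 2391.0, 8552.0, 671.0, 1623.0, 2406.0, 7868.0, 691.0,
         1776.0, 2459.0, 8236.0, 719.0, 1804.0, 2415.0, 8270.0, 720.0]
df = pd.DataFrame(
    {"ts_code": codes * len(dates), "low": lows, "high": highs},
    index=dates.repeat(len(codes)),
)

# Broadcast each group's max-high date back onto every row of that group.
max_date = df.groupby("ts_code")["high"].transform("idxmax")

# Positional comparison: the date index is duplicated, so compare plain arrays.
mask = df.index >= max_date.to_numpy()

# Rows on or after the max date; a stable sort keeps date order inside each code.
full = df[mask].sort_values("ts_code", kind="mergesort")
print(full)
```

The stable sort is only there to reproduce the ordering of the Expected table; without it the rows stay in date order.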
BENY
  • Great, but would you mind explaining why you use cumcount() and method chaining to get the result? – Jack Sep 28 '21 at 03:10
  • @Jack You only need the max and the value right after it, so cumcount is used to keep only the rows whose position within the group is less than 2. – BENY Sep 28 '21 at 03:31
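
To illustrate what cumcount does in the filter discussed in these comments, here is a minimal sketch on a hypothetical toy frame (the data and the names toy, position and first_two are made up for illustration).

```python
import pandas as pd

# Hypothetical toy frame: three rows for group "a", one row for group "b".
toy = pd.DataFrame({"ts_code": ["a", "a", "b", "a"],
                    "high": [3.0, 2.0, 5.0, 1.0]})

# cumcount numbers the rows of each group 0, 1, 2, ... in order of appearance.
position = toy.groupby("ts_code").cumcount()
print(position.tolist())  # [0, 1, 0, 2]

# Keeping positions 0 and 1 retains at most the first two rows of each group.
first_two = toy[position <= 1]
print(first_two["high"].tolist())  # [3.0, 2.0, 5.0]
```

Applied after the idxmax filter, position 0 is the max row of each group and position 1 is the row right after it.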