
In Python, I'm trying to measure the time in seconds between two events. Specifically, I need two complementary measures:

  1. The time (in seconds) since an event started.

  2. The time (in seconds) between two events.

  3. I want to create a new column that calculates the time since the big_volume event in the 'Volume_cat' column started.

  4. I want to create a new column that measures, for the 'range_candle' column, the time between the start and the end of the range_candle event.

I have tried various methods without success. For example, in the first case I tried:

for vol in df["Volume"]: 
   if df["Volume"] >  2.2e+06:
       return Timedelta.total_seconds()

Can anyone help me get my code to work?


To clarify my question: I want to measure the interval between 2 events.

For context, here is an example. I want to take different measurements of the volume of a stock (Amazon, ...). The index is a datetime column containing the date, hour, minute and second, and each row corresponds to one minute. The first column is the volume, i.e. the number of shares traded. The second column is a qualitative feature based on the number of shares traded per minute (under 50: normal / 50 to 100: big / over 100: XL).

import pandas as pd

df = pd.DataFrame({
    'Time': ['2022-01-11 09:30:00', '2022-01-11 09:31:00', '2022-01-11 09:32:00',
             '2022-01-11 09:33:00', '2022-01-11 09:34:00', '2022-01-11 09:35:00'],
    'Volume': ['71', '53', '84', '164', '43', '21'],
    'Volume_cat': ['big_volume', 'big_volume', 'big_volume',
                   'xl_volume', 'normal_volume', 'normal_volume'],
})

df['Time'] = pd.to_datetime(df['Time'])
df.set_index(['Time'], inplace =True)
df

My first goal is to add one new column for each category of "Volume_cat". Each column should show the time elapsed, in seconds, since that category was detected, for example:

df['delta_xl_vol'] = ['nan', 'nan', 'nan', '19', 'nan', 'nan',]
df['delta_big_vol'] = ['21', '81', '141', 'nan', 'nan', 'nan']
df['delta_normal_vol'] = ['nan', 'nan', 'nan', 'nan', '60', '120']
df
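
One possible way to approach this first goal is sketched below. It assumes that "since the category was detected" means "since the first row of the current consecutive run of that category", so the first row of a run shows 0 rather than the hand-written sample values above. The helper names (`run_id`, `run_start`, `elapsed`) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2022-01-11 09:30:00", "2022-01-11 09:31:00", "2022-01-11 09:32:00",
        "2022-01-11 09:33:00", "2022-01-11 09:34:00", "2022-01-11 09:35:00",
    ]),
    "Volume_cat": ["big_volume", "big_volume", "big_volume",
                   "xl_volume", "normal_volume", "normal_volume"],
}).set_index("Time")

# Give every consecutive run of the same category its own id (1, 1, 1, 2, 3, 3).
run_id = (df["Volume_cat"] != df["Volume_cat"].shift()).cumsum()

# Timestamp of the first row of each run, repeated on every row of that run.
times = df.index.to_series()
run_start = times.groupby(run_id).transform("min")

# Seconds elapsed since the current run began (assumption: a run starts at 0).
elapsed = (times - run_start).dt.total_seconds()

# One column per category; rows belonging to other categories stay NaN.
columns = {
    "xl_volume": "delta_xl_vol",
    "big_volume": "delta_big_vol",
    "normal_volume": "delta_normal_vol",
}
for category, column in columns.items():
    df[column] = elapsed.where(df["Volume_cat"] == category)

print(df)
```

Under that assumption the big_volume rows get 0, 60 and 120 seconds, and the other rows of `delta_big_vol` are NaN.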

Finally, a second set of columns should indicate how long ago the "Volume_cat" category changed. Each column is reset every time its category appears again (see the last two rows of the example):

df = pd.DataFrame({
    'Time': ['2022-01-11 09:30:00', '2022-01-11 09:31:00', '2022-01-11 09:32:00',
             '2022-01-11 09:33:00', '2022-01-11 09:34:00', '2022-01-11 09:35:00',
             '2022-01-11 09:36:00', '2022-01-11 09:37:00'],
    'Volume': ['71', '53', '84', '164', '43', '21', '66', '53'],
    'Volume_cat': ['big_volume', 'big_volume', 'big_volume', 'xl_volume',
                   'normal_volume', 'normal_volume', 'big_volume', 'big_volume'],
    'Interval_xl_vol': ['nan', 'nan', 'nan', 'nan', '60', '120', '180', '240'],
    'Interval_big_vol': ['nan', 'nan', 'nan', '60', '120', '180', 'nan', '60'],
    'Interval_normal_vol': ['nan', 'nan', 'nan', 'nan', 'nan', '60', '120', '180'],
})
df.set_index(['Time'], inplace =True)
df
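
A rough sketch of one interpretation of these interval columns follows. It assumes each column holds the seconds since the most recent run of its category began, with NaN until the category has appeared at least once. This matches the xl and normal columns of the example except that the first row of a run shows 0 instead of nan; the first rows of the big_volume column in the example follow a different rule, so they are not reproduced. `run_starts` and `last_start` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {"Volume_cat": ["big_volume", "big_volume", "big_volume", "xl_volume",
                    "normal_volume", "normal_volume", "big_volume", "big_volume"]},
    # Same one-minute timestamps as the example, 09:30 to 09:37.
    index=pd.date_range("2022-01-11 09:30", periods=8, freq="min", name="Time"),
)

cat = df["Volume_cat"]
ts = df.index.to_series()

# True on the first row of every consecutive run of a category.
run_starts = cat != cat.shift()

columns = {
    "xl_volume": "Interval_xl_vol",
    "big_volume": "Interval_big_vol",
    "normal_volume": "Interval_normal_vol",
}
for category, column in columns.items():
    # Keep the timestamp only where a run of this category begins,
    # then carry it forward until the next run of the same category.
    last_start = ts.where(run_starts & (cat == category)).ffill()
    # NaT (category not seen yet) becomes NaN after total_seconds().
    df[column] = (ts - last_start).dt.total_seconds()

print(df)
```

If the starting row of a run should read NaN rather than 0, the result can be masked afterwards, for example with `df[column].where(df[column] != 0)`.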

The problems I'm having: calculating the delta in seconds, inserting these calculations in columns.

  • Please provide a complete example of your problems (see [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/14311263)). – Timus Feb 19 '22 at 21:31
  • This `df["Volume"] > 2.2e+06` is a series, not `True`/`False`. Did you mean to write `if vol > 2.2e+06:` instead? – Timus Feb 19 '22 at 21:35
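
To illustrate the point in the comment above, here is a minimal sketch of a vectorized threshold test instead of a loop with `if`. It assumes a hypothetical threshold of 100 (the sample volumes are far below 2.2e+06) and that the goal is the number of seconds since the threshold was first exceeded. Note that the sample stores volumes as strings, so they are converted to numbers first.

```python
import pandas as pd

df = pd.DataFrame(
    {"Volume": ["71", "53", "84", "164", "43", "21"]},
    index=pd.date_range("2022-01-11 09:30", periods=6, freq="min", name="Time"),
)

THRESHOLD = 100  # hypothetical value chosen to fit the sample data

# The sample volumes are strings; convert them before comparing.
volume = pd.to_numeric(df["Volume"])

# Comparing a Series with a number yields a boolean Series (one value per row),
# which is why it cannot be used directly in a plain `if`.
is_big = volume > THRESHOLD

ts = df.index.to_series()
first_hit = ts[is_big].min()  # NaT if the threshold is never exceeded

# Seconds since the first row above the threshold; NaN before that row.
df["secs_since_first_big"] = (ts - first_hit).dt.total_seconds().where(ts >= first_hit)

print(df)
```

`Timedelta.total_seconds()` has to be called on an actual timedelta; here the timedelta comes from subtracting two timestamps, and `.dt.total_seconds()` applies the conversion to the whole column.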
