I have a stream of data in a CSV file: the first column holds the timestamp (date and time) and the second column holds the value. The data is plotted below. I need to write an algorithm that returns an array giving, for each peak, its start time and how long the peak lasted.

Here is the graph:

[image: graph of the data, zoomed]

Here is some of the data from the CSV file:

Column1,Column2
2023-03-14 14:00:59.0,195.80
2023-03-14 14:02:06.0,174.20
2023-03-14 14:03:14.0,156.76
2023-03-14 14:04:21.0,142.36
2023-03-14 14:05:29.0,131.00
2023-03-14 14:06:37.0,122.00
2023-03-14 14:07:44.0,114.91
2023-03-14 14:08:52.0,109.18
2023-03-14 14:10:00.0,104.56
2023-03-14 14:11:07.0,100.74
2023-03-14 14:12:15.0,97.93
2023-03-14 14:13:22.0,95.45
2023-03-14 14:14:30.0,93.43
2023-03-14 14:15:37.0,91.85
2023-03-14 14:16:45.0,90.73
2023-03-14 14:17:53.0,89.49
2023-03-14 14:19:00.0,88.59
2023-03-14 14:20:08.0,87.91
2023-03-14 14:21:15.0,87.13
2023-03-14 14:22:23.0,86.68
2023-03-14 14:23:30.0,86.23
2023-03-14 14:24:38.0,86.23
2023-03-14 14:25:45.0,108.61
2023-03-14 14:26:53.0,142.70
2023-03-14 14:28:01.0,175.89
2023-03-14 14:29:08.0,203.79
2023-03-14 14:30:16.0,225.84
2023-03-14 14:31:23.0,241.25
2023-03-14 14:32:31.0,253.29
2023-03-14 14:33:39.0,262.18
2023-03-14 14:34:46.0,262.29
2023-03-14 14:35:54.0,262.29
2023-03-14 14:37:01.0,262.29
2023-03-14 14:38:09.0,260.83
2023-03-14 14:39:16.0,235.51
2023-03-14 14:40:24.0,208.85
2023-03-14 14:41:31.0,185.45
2023-03-14 14:42:39.0,166.33

This is my peak-detection code, which is not working properly:

import pandas as pd
import numpy as np
from datetime import datetime


# Read the data from the CSV file
df = pd.read_csv('test.csv')

# Convert the first column to datetime format
df['Column1'] = pd.to_datetime(df['Column1'])

# Convert the second column to numeric type
df['Column2'] = pd.to_numeric(df['Column2'])

# Find the peaks using numpy
diff1 = np.diff(df['Column2'])
diff2 = np.diff(np.sign(diff1))
peaks, = np.where(diff2 < 0)


peak_durations = np.zeros(len(peaks), dtype=float)
start_times = np.zeros(len(peaks), dtype='datetime64[m]')
for i, peak_index in enumerate(peaks):
    start_index = np.argmax(df['Column2'][:peak_index]) # Index of start of peak
    end_index = np.argmin(df['Column2'][peak_index:]) + peak_index # Index of end of peak
    duration_minutes = (df['Column1'][end_index] - df['Column1'][start_index]).total_seconds() / 60
    peak_durations[i] = duration_minutes
    start_times[i] = df['Column1'][start_index]

# Convert start times to desired string format
start_times_str = [np.datetime_as_string(dt, unit='ms') for dt in start_times]

# Combine start times and durations into a 2-dimensional array
peaks_info = np.vstack((start_times_str, peak_durations)).T

print(peaks_info)

This is the result I am getting:

[['2023-03-14T14:32:00.000' '189.15']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '186.9']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '552.3166666666667']
 ['2023-03-14T14:34:00.000' '561.3333333333334']]
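
One thing that may contribute to `peaks` having more entries than there are visible peaks: the test `np.diff(np.sign(diff1)) < 0` also fires on flat stretches, once on entering a plateau and once on leaving it. The small check below uses the plateau values from the sample data above; the variable names `values` and `hits` are only for illustration.

```python
import numpy as np

# Values around the plateau at 262.29, copied from the sample data.
values = np.array([253.29, 262.18, 262.29, 262.29, 262.29, 260.83, 235.51])

diff1 = np.diff(values)          # rising, rising, flat, flat, falling, falling
diff2 = np.diff(np.sign(diff1))  # sign steps: 1 -> 0 and 0 -> -1 are both negative
hits, = np.where(diff2 < 0)

# Two indices are reported for what is visually a single peak.
print(hits)
```

Noise in the real data would likely produce further sign changes of the same kind.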

This is not the result I expect. This is the response I am getting:

     [datetime.datetime(2023, 3, 14, 14, 34) 186.9]
     [datetime.datetime(2023, 3, 14, 14, 34) 186.9]

I want it to be [(2023 03 14 14 31 00.00) 28 mins]

Here the first part is the date and time of the start of the peak, and the second value is the duration of the peak.

Note: I can't add the CSV file here.
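
A minimal sketch of one possible reading of "how long the peak lasted", assuming a peak is a contiguous run of samples whose value stays above a fixed threshold. The function name `peak_intervals` and the threshold of 200 are hypothetical choices for illustration and are not given in the question; the sample rows are copied from the CSV excerpt above.

```python
import io
import pandas as pd

def peak_intervals(df, threshold):
    """Return (start_time, duration_minutes) for each run of rows above threshold."""
    above = df["Column2"] > threshold
    # The run id increases by one every time the above/below state flips.
    run_id = (above != above.shift()).cumsum()
    intervals = []
    # Keep only the rows above the threshold and group them by run.
    for _, run in df[above].groupby(run_id[above]):
        start = run["Column1"].iloc[0]
        end = run["Column1"].iloc[-1]
        intervals.append((start, (end - start).total_seconds() / 60))
    return intervals

sample = io.StringIO("""Column1,Column2
2023-03-14 14:26:53.0,142.70
2023-03-14 14:28:01.0,175.89
2023-03-14 14:29:08.0,203.79
2023-03-14 14:30:16.0,225.84
2023-03-14 14:31:23.0,241.25
2023-03-14 14:32:31.0,253.29
2023-03-14 14:33:39.0,262.18
2023-03-14 14:34:46.0,262.29
2023-03-14 14:35:54.0,262.29
2023-03-14 14:37:01.0,262.29
2023-03-14 14:38:09.0,260.83
2023-03-14 14:39:16.0,235.51
2023-03-14 14:40:24.0,208.85
2023-03-14 14:41:31.0,185.45
2023-03-14 14:42:39.0,166.33
""")

df = pd.read_csv(sample)
df["Column1"] = pd.to_datetime(df["Column1"])

intervals = peak_intervals(df, threshold=200)  # 200 is an assumed threshold
for start, minutes in intervals:
    print(start, round(minutes, 2), "mins")
```

On this excerpt the sketch finds a single run above 200, starting at 14:29:08 and lasting about 11.3 minutes. The start time and duration depend entirely on the chosen threshold, so it would need tuning (or a different definition of a peak) to match the expected 28 minutes.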
