Pandas: Find end frequency spectrum above a defined threshold

Question

Long-time reader, first-time poster.

I am working with x,y data for frequency response plots in Pandas DataFrames. Here is an example of the data and the plots (see full .csv file at end of post):

fbc['x'],fbc['y']

(0    [89.25, 89.543, 89.719, 90.217, 90.422, 90.686...
 1    [89.25, 89.602, 90.422, 90.568, 90.744, 91.242...
 2    [89.25, 89.689, 89.895, 90.305, 91.008, 91.74,...
 3    [89.25, 89.514, 90.041, 90.275, 90.422, 90.832...
 Name: x, dtype: object,
 0    [-77.775, -77.869, -77.766, -76.572, -76.327, ...
 1    [-70.036, -70.223, -71.19, -71.229, -70.918, -...
 2    [-73.079, -73.354, -73.317, -72.753, -72.061, ...
 3    [-70.854, -71.377, -74.069, -74.712, -74.647, ...
 Name: y, dtype: object)

where x = frequency and y = amplitude data. The resulting plot for each of these rows looks as follows:

See x,y Plot of image in this link - not enough points to embed yet

I can create a plot for each row of the x,y data in the DataFrame.

What I need to do in Pandas (Python) is identify the highest frequency in the data before the frequency response drops to the noise floor permanently. As you can see, there are places where the y data may dip to a very low value (say < -50) but then return to > -40.

How can I find, in Pandas / Python (ideally without iterating, because the data sets are very large), the highest frequency whose amplitude is > -40, such that the amplitude never rises back above -40 at any higher frequency? Basically, I'm trying to find the end of the frequency band. I've also tried working with some of the Pandas statistics (which would be nice to have as well), but have not managed to get useful data out of them.

Thanks in advance for any pointers and direction you can provide.

Here is a .csv file that can be imported with csv.reader: https://www.dropbox.com/s/ia7icov5fwh3h6j/sample_data.csv?dl=0

*As you can see there are places*, no there is none in the sample data. You should add a small sample data and expected output instead of that incomplete data. — Quang Hoang, Mar 11 '20 at 16:50
How about a method that iterates back from the last observation and returns the frequency of the first observation that is > -40 amplitude? Would that fit your goals? — katardin, Mar 11 '20 at 16:56
Hi @QuangHoang thanks for the suggestion, I added a sample.csv file which contains the data set I am using. — Brady Volpe, Mar 11 '20 at 17:08
Hi @katardin, working backwards is a great idea to eliminate false positives. Yes, this would always fit my requirements. However, I still need to do this without iterating through each row. The dataset I attached only has a few rows, but the final data set will have tens of thousands of rows. — Brady Volpe, Mar 11 '20 at 17:10

score 0 · Answer 1 · answered Mar 11 '20 at 20:13

I believe I have come up with a solution, based on the suggestion from @katardin, though I think it can be optimized. I will be dealing with huge amounts of data, so if anyone can find a more elegant solution it would be appreciated.

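for row in fbc['y']:
    # Reverse the y data so we read from the end (right to left)
    reversed_y = row[::-1]

    # Offset from the end of the first value above the noise floor (> -50);
    # None if the whole row is at or below the noise floor
    res = next((i for i, val in enumerate(reversed_y) if val > -50), None)
    if res is None:
        print("No element above the noise floor")
        continue

    # Convert the reversed offset back to a position in the original list.
    # This is the 1-based position of the last value above the noise floor,
    # i.e. the zero-based index of the first sample of the final noise floor.
    index = len(reversed_y) - res

    # Print results
    print("The index of element is : " + str(index))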

The output is one index per row, as follows:

The index of element is : 2460
The index of element is : 2400
The index of element is : 2398
The index of element is : 2382

I have checked each one, and each corresponds to the exact high-frequency roll-off point I was looking for. Great suggestion!
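For larger data sets, the per-element generator could probably be replaced with a NumPy mask. The following is only a sketch under a few assumptions: each row holds plain lists of numbers, the threshold of -50 is the same one used in the loop, and `band_edge` and `fbc_demo` are made-up names with toy values standing in for the real `fbc`. It returns the zero-based index of the last sample above the threshold (one less than the value printed by the loop) together with the matching frequency.

```python
import numpy as np
import pandas as pd


def band_edge(x, y, threshold=-50.0):
    """Return (index, frequency) of the last sample above threshold.

    Returns (None, None) when no sample is above the threshold.
    band_edge is a hypothetical helper, not part of pandas or NumPy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Zero-based positions of every sample above the threshold
    above = np.flatnonzero(y > threshold)
    if above.size == 0:
        return None, None
    last = int(above[-1])  # the last one marks the end of the band
    return last, float(x[last])


# Toy stand-in for fbc: one list of x values and one list of y values per row
fbc_demo = pd.DataFrame({
    "x": [[89.0, 90.0, 91.0, 92.0, 93.0], [89.0, 90.0, 91.0, 92.0]],
    "y": [[-30.0, -60.0, -35.0, -70.0, -75.0], [-80.0, -20.0, -25.0, -30.0]],
})

# The rows have different lengths, so there is still one call per row,
# but the scan inside each row is vectorized.
edges = [band_edge(x, y) for x, y in zip(fbc_demo["x"], fbc_demo["y"])]
fbc_demo["edge_index"] = [e[0] for e in edges]
fbc_demo["edge_freq"] = [e[1] for e in edges]
print(fbc_demo[["edge_index", "edge_freq"]])
```

In the first toy row the amplitude dips to -60 and recovers to -35 before dropping for good, so the edge lands on the recovered sample rather than on the first dip.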
