Trying to explore time-frequency analysis to obtain valuable information about experimental signals.
I need to obtain such a plot, that will visualize the change of frequency spectrum in time - for all signal duration - to extract some valuable information about the current signal.
Trying to use wavelets for this purpose.
Here is a link with example I am using to compute coefficients and visualize the spectrogram.
I receive a spectrogram that doesn't reflect the data I have, it seems I have some misunderstanding of the visualization technique or wavelet calculation.
Tech. question. How to show the initial signal and the wavelet spectrogram on one figure and share x-axis? (to be able explore the data in convenient way)
I can't see the resonances in the first part of a signal, but I can do it relatively easily using the signal in a time domain. Why?
What is the data in the upper left and right corners - rounded region?
Maybe I am choosing the wrong
scales
orlevels
(see arrays in the code provided, I have played with the numbers but I don't get how to choose their values?)How to show frequencies on the left scale - not a periods..?
how to add a colormap to understand the current power coefficients value?
Maybe there is some instrument/approach - to obtain
levels
andscales
values automatically? Or some general recommendation/tutorial to understand how to get smooth visualization of data for time-frequency analysis?
The example to reproduce and example data.
# wavelet application for time-frequency analysis
# example used https://ataspinar.com/2018/12/21/a-guide-for-using-the-wavelet-transform-in-machine-learning/
import pywt
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib widget
### loading data
# initial data for analysis
tr_link = 'https://drive.google.com/uc?export=download&id=1FEQvfqQ8BE1rS8BuTqQeRsTBJx_kiH15'
# input dataframe - transmission data - experimental results with the corresponding theoretical data and uncertainty data
tr_df = pd.read_csv(tr_link, index_col=0)
#data with the actual resonance positions
ladder_df = pd.DataFrame({'E': {0: 40.29368109287083, 1: 55.462104682940094, 2: 66.19723693698184, 3: 81.22602772049886, 4: 101.05019001782576, 5: 140.79384791944875, 6: 157.22007983931803, 7: 171.43325808950274, 8: 217.16542649339505, 9: 221.41056096126604, 10: 243.58428941894317, 11: 271.0699959595112, 12: 285.04407065596706, 13: 300.3253362518625, 14: 309.7129973708237, 15: 314.68310009948743, 16: 323.26453920844216, 17: 342.54164584343937, 18: 356.5478102534998, 19: 382.23962909125567, 20: 397.8070178951269, 21: 421.2843581368074, 22: 433.36971676875646, 23: 440.66505067627634, 24: 466.7307530820385, 25: 477.03060262148057, 26: 482.96167606482015, 27: 497.424475651996, 28: 515.5927570438026, 29: 544.9552781653138, 30: 549.3628110144758, 31: 594.9779029029645, 32: 613.1079845026825, 33: 634.7001884990783, 34: 667.2508925896644, 35: 690.0964618342111, 36: 706.2573081949196, 37: 723.455953863036, 38: 753.9685262219448, 39: 764.1023564395969, 40: 783.3375513683885, 41: 803.4441302741657, 42: 814.9006657436429, 43: 829.3065932509774, 44: 858.2810554651402, 45: 883.026051087727, 46: 913.8980609316312, 47: 933.3841005436072, 48: 936.9012403488496, 49: 960.07254411068, 50: 970.1599371526584, 51: 988.6230483408726, 52: 994.1094565980452}, 'tof': {0: 0.00040406988890069487, 1: 0.0003449018934825362, 2: 0.00031598105788645324, 3: 0.00028557849361103294, 4: 0.00025638211406266864, 5: 0.00021771059568433755, 6: 0.00020620234645780873, 7: 0.00019761038050053123, 8: 0.00017594559752920423, 9: 0.00017428275970206476, 10: 0.00016631594547793704, 11: 0.0001578317819181072, 12: 0.0001539969269073221, 13: 0.00015011363900030704, 14: 0.00014787189376091746, 15: 0.00014672587278245796, 16: 0.00014480971199822796, 17: 0.00014077095253061672, 18: 0.00013804430306554194, 19: 0.00013343808731047422, 20: 0.00013086684344748386, 21: 0.0001272621057195108, 22: 0.00012552178822158457, 23: 0.00012450607469089692, 24: 0.00012107366944339282, 25: 0.00011979555386093621, 26: 0.00011907818543729406, 27: 0.00011738300517409577, 28: 0.00011535543399251944, 29: 0.00011229554167237238, 30: 0.00011185753091262304, 31: 0.00010761419597738966, 32: 0.0001060606849751197, 33: 0.00010429807317815873, 34: 0.00010180440741783211, 35: 0.00010016063185845586, 36: 9.904631835791354e-05, 37: 9.790170126465565e-05, 38: 9.596823414123544e-05, 39: 9.535185526580462e-05, 40: 9.421496658465192e-05, 41: 9.307049217268802e-05, 42: 9.243740952453541e-05, 43: 9.166004036892226e-05, 44: 9.015621997184851e-05, 45: 8.893095729395043e-05, 46: 8.747264517648235e-05, 47: 8.65896588575137e-05, 48: 8.643322569777624e-05, 49: 8.542420745803129e-05, 50: 8.499627822618716e-05, 51: 8.423006517753538e-05, 52: 8.400650521127758e-05}} )
### preprocessing steps to get data in a proper format
# rescaling the tof
tr_df['tof'] = tr_df['tof']/1e6 # to get time in seconds
# calculating the step betweeb the points byt hte TOF
tr_df['d_tof'] = tr_df['tof'] - tr_df['tof'].shift(1)
N = tr_df.shape[0] #number of points in the signal
# time step calculation - dt
min_timebin = abs(tr_df['d_tof'].min())
max_timebin = abs(tr_df['d_tof'].max())
std_timebin = abs(tr_df['d_tof'].std())
timebin_avg = abs(tr_df['d_tof'].mean())
print(min_timebin, timebin_avg, max_timebin, std_timebin)
# let's assume that it's the mean value because we have some discretization & computational errors
timebin = round(abs(tr_df['d_tof'].mean())*1e7, 0)/1e7 # average rounded step by tof between each point
t0 = tr_df['tof'].min()
dt = timebin
print('dt = ', dt)
print('t0 = ', t0)
# time array
time_ar = np.arange(0, N) * dt + t0
# signal data array
signal_ = tr_df['exp_trans'].to_numpy()
# plotting the signal
fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(tr_df['tof'], tr_df['exp_trans'], label='signal')
#plotting the positions of the peaks
for index, row in ladder_df.iterrows():
ax.axvline(x = row['tof'], color = 'r', linestyle = 'dashed', linewidth = 1.0, alpha=0.5)
ax.set_xlim([time_ar[0], time_ar[-1]])
ax.set_ylabel('Signal Amplitude', fontsize=10)
ax.set_title('Signal', fontsize=10)
ax.set_xlabel('Time', fontsize=10)
ax.legend()
ax.grid()
plt.show()
# for wavelet plotting
cmap = plt.cm.seismic
# wavelet calculation
scales = np.arange(1, 64) # what are these scales??
[coefficients, frequencies] = pywt.cwt(signal_, scales, 'cmor', dt)
pwr = (abs(coefficients)) ** 2
periods = 1. / frequencies
levels = [0.015625, 0.03125, 0.0625, 0.125, 0.25, 0.5, 1, 2, 4, 8, 16, 32] # what are these levels?
contourlevels = np.log2(levels)
fig, ax = plt.subplots(figsize=(8, 5))
im = ax.contourf(time_ar, np.log2(periods), np.log2(pwr), contourlevels, extend='both', cmap=cmap)
plt.show()
P.S.The actual task is to identify the resonances in the noised signal. The hypothesis is that real and fake resonances (or random noise) differ in a spectrum or in some parameters that can help to identify resonances in the noisy data or to use that information to automate the identification process. (I have a lot of experimental signals which are evaluated - with the known positions of real resonances so the next step is to automate identification of resonances in experimental data using ML-techniques).