I want to analyze data from a crack meter, which measures the aperture of a crack in the ground over time. I also have temperature data from a nearby sensor. Both are stored as time-indexed pandas DataFrames.
When plotting the data, it is easy to see that the two are correlated, which suggests that temperature is influencing the aperture of the crack.
[Figure: crack aperture vs. temperature]
I have also compared the two series in a scatter plot (using only the 2023 data, because the correlation is clearer in those months).
[Figure: scatter plot comparing the two series]
The aim is to remove the part of the aperture fluctuation that is caused by temperature fluctuations. That would let us analyze the evolution of the aperture that is "independent" of temperature.
I am sharing the January 2023 data below. If more than one month of data is required, I can share more months.
Thank you in advance.
import pandas as pd
import numpy as np
df_crack = pd.DataFrame({'date': ['2023-01-01 00:00:00', '2023-01-02 00:00:00',
'2023-01-03 00:00:00', '2023-01-04 00:00:00',
'2023-01-05 00:00:00', '2023-01-06 00:00:00',
'2023-01-07 00:00:00', '2023-01-08 00:00:00',
'2023-01-09 00:00:00', '2023-01-10 00:00:00',
'2023-01-11 00:00:00', '2023-01-12 00:00:00',
'2023-01-13 00:00:00', '2023-01-14 00:00:00',
'2023-01-15 00:00:00', '2023-01-16 00:00:00',
'2023-01-17 00:00:00', '2023-01-18 00:00:00',
'2023-01-19 00:00:00', '2023-01-20 00:00:00',
'2023-01-21 00:00:00', '2023-01-22 00:00:00',
'2023-01-23 00:00:00', '2023-01-24 00:00:00',
'2023-01-25 00:00:00', '2023-01-26 00:00:00',
'2023-01-27 00:00:00', '2023-01-28 00:00:00',
'2023-01-29 00:00:00', '2023-01-30 00:00:00',
],
'aperture': [0.452762281,0.372262281,0.513928948,0.447762281,
0.377095615,0.355095615,0.271428948,0.291762281,
0.476762281,0.335928948,0.280428948,0.283762281,
0.322928948,0.287262281,0.316928948,0.209262281,
0.407928948,0.254262281,0.232095615,0.264262281,
0.076095615,-0.025237719,-0.042237719,-0.094904385,
0.017428948,-0.036071052,-0.094071052,-0.071404385,
0.008095615,-0.141571052]})
df_crack['date'] = pd.to_datetime(df_crack['date'])
df_crack = df_crack.set_index('date')
df_temp = pd.DataFrame({'date': ['2023-01-01 00:00:00', '2023-01-02 00:00:00',
'2023-01-03 00:00:00', '2023-01-04 00:00:00',
'2023-01-05 00:00:00', '2023-01-06 00:00:00',
'2023-01-07 00:00:00', '2023-01-08 00:00:00',
'2023-01-09 00:00:00', '2023-01-10 00:00:00',
'2023-01-11 00:00:00', '2023-01-12 00:00:00',
'2023-01-13 00:00:00', '2023-01-14 00:00:00',
'2023-01-15 00:00:00', '2023-01-16 00:00:00',
'2023-01-17 00:00:00', '2023-01-18 00:00:00',
'2023-01-19 00:00:00', '2023-01-20 00:00:00',
'2023-01-21 00:00:00', '2023-01-22 00:00:00',
'2023-01-23 00:00:00', '2023-01-24 00:00:00',
'2023-01-25 00:00:00', '2023-01-26 00:00:00',
'2023-01-27 00:00:00', '2023-01-28 00:00:00',
'2023-01-29 00:00:00', '2023-01-30 00:00:00',
],
'temperature': [9.6,8,8.4,6.2,6.2,6,3.9,8.5,8.3,5.3,5.6,5.3,
6.2,6.3,6.9,4.8,6.7,3.6,3,4.6,2.3,1.3,1,0.3,
1.6,0.4,1.5,1.4,2.2,1.2]})
df_temp['date'] = pd.to_datetime(df_temp['date'])
df_temp = df_temp.set_index('date')
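
To make the aim above concrete, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes that the aperture responds linearly and without delay to temperature, fits that line by ordinary least squares with `numpy.polyfit`, and keeps the residual as the temperature-corrected aperture. The helper name `remove_temperature_effect`, the column names `thermal` and `corrected`, and the small synthetic series used for the demonstration are all hypothetical.

```python
import numpy as np
import pandas as pd


def remove_temperature_effect(aperture: pd.Series, temperature: pd.Series):
    """Hypothetical helper: fit aperture = intercept + slope * temperature
    by least squares and return the residual as the corrected aperture.

    Assumes a linear, instantaneous response to temperature.
    """
    # Align both series on their shared timestamps and drop missing values.
    data = pd.concat(
        {"aperture": aperture, "temperature": temperature}, axis=1
    ).dropna()

    # A degree-1 polynomial fit returns the coefficients as [slope, intercept].
    slope, intercept = np.polyfit(
        data["temperature"].to_numpy(), data["aperture"].to_numpy(), deg=1
    )

    # Part of the aperture explained by the linear temperature term.
    data["thermal"] = intercept + slope * data["temperature"]
    # What is left after subtracting that term.
    data["corrected"] = data["aperture"] - data["thermal"]
    return data, slope, intercept


# Synthetic demonstration data (not the measured values from the question):
# an aperture that depends exactly linearly on temperature.
idx = pd.date_range("2023-01-01", periods=6, freq="D")
temp = pd.Series([9.0, 7.0, 5.0, 3.0, 2.0, 1.0], index=idx)
aper = 0.1 + 0.04 * temp

result, slope, intercept = remove_temperature_effect(aper, temp)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(result["corrected"].round(6))
```

With the frames defined above, the call would be `remove_temperature_effect(df_crack['aperture'], df_temp['temperature'])`. One caveat: in a single month where temperature and aperture both trend in the same direction, a plain regression cannot tell a thermal effect from a genuine trend, so part of any real movement may be absorbed into the `thermal` term. A longer record covering several temperature cycles should make the fit more trustworthy.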