I want to analyze the data from a crack meter, which measures the aperture of a crack in the ground over time. I also have temperature data from a nearby sensor. Both are stored as time-indexed pandas DataFrames.

When plotting the data, it is easy to see that the two series are correlated, which suggests that the temperature influences the aperture of the crack.

Plot: crack aperture vs. temperature

I have also compared the two series in a scatter plot (using only the 2023 data, because the correlation is clearer in those months).

Scatter plot comparing aperture and temperature

The aim is to remove the part of the aperture fluctuation that is caused by temperature fluctuations. That would let us analyze the evolution of the aperture that is "independent" of temperature.

I share the January 2023 data. If more than one month of data is required, I can share more months.

Thank you in advance.

import pandas as pd
import numpy as np

df_crack = pd.DataFrame({'date': ['2023-01-01 00:00:00', '2023-01-02 00:00:00', 
                          '2023-01-03 00:00:00', '2023-01-04 00:00:00',
                          '2023-01-05 00:00:00', '2023-01-06 00:00:00',
                          '2023-01-07 00:00:00', '2023-01-08 00:00:00',
                          '2023-01-09 00:00:00', '2023-01-10 00:00:00',
                          '2023-01-11 00:00:00', '2023-01-12 00:00:00',
                          '2023-01-13 00:00:00', '2023-01-14 00:00:00',
                          '2023-01-15 00:00:00', '2023-01-16 00:00:00',
                          '2023-01-17 00:00:00', '2023-01-18 00:00:00',
                          '2023-01-19 00:00:00', '2023-01-20 00:00:00',
                          '2023-01-21 00:00:00', '2023-01-22 00:00:00',
                          '2023-01-23 00:00:00', '2023-01-24 00:00:00',
                          '2023-01-25 00:00:00', '2023-01-26 00:00:00',
                          '2023-01-27 00:00:00', '2023-01-28 00:00:00',
                          '2023-01-29 00:00:00', '2023-01-30 00:00:00',
                          ], 
               'aperture': [0.452762281,0.372262281,0.513928948,0.447762281,
                            0.377095615,0.355095615,0.271428948,0.291762281,
                            0.476762281,0.335928948,0.280428948,0.283762281,
                            0.322928948,0.287262281,0.316928948,0.209262281,
                            0.407928948,0.254262281,0.232095615,0.264262281,
                            0.076095615,-0.025237719,-0.042237719,-0.094904385,
                            0.017428948,-0.036071052,-0.094071052,-0.071404385,
                            0.008095615,-0.141571052]})

df_crack['date'] = pd.to_datetime(df_crack['date'])
df_crack = df_crack.set_index('date')

df_temp = pd.DataFrame({'date': ['2023-01-01 00:00:00', '2023-01-02 00:00:00', 
                          '2023-01-03 00:00:00', '2023-01-04 00:00:00',
                          '2023-01-05 00:00:00', '2023-01-06 00:00:00',
                          '2023-01-07 00:00:00', '2023-01-08 00:00:00',
                          '2023-01-09 00:00:00', '2023-01-10 00:00:00',
                          '2023-01-11 00:00:00', '2023-01-12 00:00:00',
                          '2023-01-13 00:00:00', '2023-01-14 00:00:00',
                          '2023-01-15 00:00:00', '2023-01-16 00:00:00',
                          '2023-01-17 00:00:00', '2023-01-18 00:00:00',
                          '2023-01-19 00:00:00', '2023-01-20 00:00:00',
                          '2023-01-21 00:00:00', '2023-01-22 00:00:00',
                          '2023-01-23 00:00:00', '2023-01-24 00:00:00',
                          '2023-01-25 00:00:00', '2023-01-26 00:00:00',
                          '2023-01-27 00:00:00', '2023-01-28 00:00:00',
                          '2023-01-29 00:00:00', '2023-01-30 00:00:00',
                          ], 
               'temperature': [9.6,8,8.4,6.2,6.2,6,3.9,8.5,8.3,5.3,5.6,5.3,
                               6.2,6.3,6.9,4.8,6.7,3.6,3,4.6,2.3,1.3,1,0.3,
                               1.6,0.4,1.5,1.4,2.2,1.2]})

df_temp['date'] = pd.to_datetime(df_temp['date'])
df_temp = df_temp.set_index('date')

Plot of the January 2023 data

Ferran
  • Hi, welcome to StackOverflow. Please take the [tour](https://stackoverflow.com/tour) and learn [How to Ask](https://stackoverflow.com/help/how-to-ask). In order to get help, you will need to provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). – alec_djinn May 09 '23 at 09:39
  • If your question includes a pandas dataframe, please provide a [reproducible pandas example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – alec_djinn May 09 '23 at 09:39
  • What's your question exactly? You can't really subtract a temperature from a distance. But you can test for correlation and take that into account. However, this is not a programming question, so not a good fit here. Maybe https://stats.stackexchange.com/ is a better choice. – alec_djinn May 09 '23 at 09:42
  • Can you provide the data please? – Corralien May 09 '23 at 10:25
  • Thanks, I have shared one month of data; if more data would be useful, I can share more. – Ferran May 09 '23 at 10:53

1 Answer

EDIT

Now, I don't think there is an obvious relationship between aperture size and temperature. If we take the 15-day or 30-day moving average and plot it, the aperture varies a lot for a roughly constant temperature (look at the points around an average temperature of 8°C).

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Read data and fill nan (the method doesn't matter, there are 5 missing values)
df = pd.read_csv('crackmeter.csv', index_col='date', parse_dates=['date'])
df['aperture'] = df['aperture'].fillna(df.groupby(df['temperature'].round())['aperture'].transform('mean'))

g = sns.scatterplot(df.rolling('30D').mean().to_period('M'), x='aperture', y='temperature', hue='date')
g.axes.axhline(8)
plt.show()

It's clearly not a programming question. The temperature dataframe is probably not relevant here, because the aperture already depends on the temperature. I'm not a specialist in time series analysis, but you should look into seasonal decomposition.

If you treat your aperture as the sum (additive model) of 3 components, Trend, Seasonal (temperature) and Residual, you can use seasonal_decompose from statsmodels:

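import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Additive decomposition; the period (7 days here) is inferred from the daily index
crack = seasonal_decompose(df_crack['aperture'], model='additive')
crack.plot()
plt.show()

# Collect the original series and its three components side by side
out = pd.concat([df_crack, crack.trend, crack.seasonal, crack.resid], axis=1)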

Output:

>>> out
            aperture     trend  seasonal     resid
date                                              
2023-01-01  0.452762       NaN -0.035330       NaN
2023-01-02  0.372262       NaN  0.004456       NaN
2023-01-03  0.513929       NaN  0.027567       NaN
2023-01-04  0.447762  0.398619  0.021857  0.027286
2023-01-05  0.377096  0.375619  0.001988 -0.000512
2023-01-06  0.355096  0.390548  0.018172 -0.053625
2023-01-07  0.271429  0.365119 -0.038711 -0.054980
2023-01-08  0.291762  0.341215 -0.035330 -0.014123
2023-01-09  0.476762  0.327881  0.004456  0.144425
2023-01-10  0.335929  0.323286  0.027567 -0.014924
2023-01-11  0.280429  0.325548  0.021857 -0.066976
2023-01-12  0.283762  0.329143  0.001988 -0.047369
2023-01-13  0.322929  0.290929  0.018172  0.013828
2023-01-14  0.287262  0.301215 -0.038711  0.024758
2023-01-15  0.316929  0.297477 -0.035330  0.054782
2023-01-16  0.209262  0.290096  0.004456 -0.085289
2023-01-17  0.407929  0.281715  0.027567  0.098647
2023-01-18  0.254262  0.251548  0.021857 -0.019143
2023-01-19  0.232096  0.202667  0.001988  0.027441
2023-01-20  0.264262  0.166738  0.018172  0.079351
2023-01-21  0.076096  0.094905 -0.038711  0.019901
2023-01-22 -0.025238  0.061072 -0.035330 -0.050980
2023-01-23 -0.042238  0.022762  0.004456 -0.069456
2023-01-24 -0.094904 -0.028428  0.027567 -0.094043
2023-01-25  0.017429 -0.049500  0.021857  0.045072
2023-01-26 -0.036071 -0.044738  0.001988  0.006679
2023-01-27 -0.094071 -0.058928  0.018172 -0.053315
2023-01-28 -0.071404       NaN -0.038711       NaN
2023-01-29  0.008096       NaN -0.035330       NaN

Maybe you can treat the resid component as the aperture with the temperature part removed?

So, you should ask your question on the Cross Validated forum.

Corralien
  • Thank you for your answer. The aperture depends on the temperature effect as well as on other independent parameters. The seasonal decomposition is a good idea, but it is not exactly what I was looking for, because it uses time as the main characteristic to decompose the data. As the temperature is different in winter and summer, that could be a first approximation. But the correlation between temperature and aperture is stronger than just the large-scale effect. I was hoping to find the equation that translates the variation in temperature to its effect on the aperture. – Ferran May 09 '23 at 14:47
  • Using that information, I would then remove the effect of the temperature to obtain the variations in the aperture that are caused by the other independent parameters. If you still think this is a question for the Cross Validated forum, I will post it there. – Ferran May 09 '23 at 14:49
  • Can you provide the full data, please? – Corralien May 09 '23 at 16:11
  • How can I share the full data? I don't see any option to upload a CSV. I'm quite new, thanks for the understanding. – Ferran May 10 '23 at 08:07
  • Use google drive, wetransfer, dropbox, etc – Corralien May 10 '23 at 09:12
  • [download crackmeter.csv](https://wetransfer.com/downloads/aecb8acc0ebe06a90c994484e903f77220230510100239/95cb9e) Thank you – Ferran May 10 '23 at 10:05
  • Sorry for my late response. I can't find any valid answer to your problem but I'm sure the seasonal part of the decomposition might be the key, especially if it's the only seasonal parameter. – Corralien May 13 '23 at 05:38
  • Thank you for your time and efforts. I will try harder with the seasonal decomposition. – Ferran May 15 '23 at 14:20
  • Just for my information, what do you think about the moving average (30 days) and the aperture for a mean temperature of 8°C? Does it make sense or not? – Corralien May 15 '23 at 14:32
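
The idea raised in the comments (find the equation that maps temperature to its effect on the aperture, then remove that effect) can be sketched with an ordinary least-squares line. This is only a rough illustration on the January 2023 sample from the question, and it assumes the thermal effect is linear and instantaneous, which the thread does not confirm. The column names thermal and corrected are made up for this example.

```python
import numpy as np
import pandas as pd

# January 2023 sample copied from the question (one value per day)
dates = pd.date_range('2023-01-01', periods=30, freq='D')
aperture = [0.452762281, 0.372262281, 0.513928948, 0.447762281,
            0.377095615, 0.355095615, 0.271428948, 0.291762281,
            0.476762281, 0.335928948, 0.280428948, 0.283762281,
            0.322928948, 0.287262281, 0.316928948, 0.209262281,
            0.407928948, 0.254262281, 0.232095615, 0.264262281,
            0.076095615, -0.025237719, -0.042237719, -0.094904385,
            0.017428948, -0.036071052, -0.094071052, -0.071404385,
            0.008095615, -0.141571052]
temperature = [9.6, 8, 8.4, 6.2, 6.2, 6, 3.9, 8.5, 8.3, 5.3, 5.6, 5.3,
               6.2, 6.3, 6.9, 4.8, 6.7, 3.6, 3, 4.6, 2.3, 1.3, 1, 0.3,
               1.6, 0.4, 1.5, 1.4, 2.2, 1.2]
df = pd.DataFrame({'aperture': aperture, 'temperature': temperature},
                  index=dates)

# Pearson correlation between aperture and temperature
r = df['aperture'].corr(df['temperature'])

# Least-squares line (assumed model): aperture ~ slope * temperature + intercept
slope, intercept = np.polyfit(df['temperature'], df['aperture'], deg=1)

# Hypothetical "thermal" component, measured relative to the mean temperature
df['thermal'] = slope * (df['temperature'] - df['temperature'].mean())

# Aperture with the fitted thermal component removed
df['corrected'] = df['aperture'] - df['thermal']

print(f"r = {r:.2f}, slope = {slope:.4f} aperture units per °C")
print(df[['aperture', 'thermal', 'corrected']].head())
```

By construction, the corrected column is uncorrelated with temperature and keeps the same mean as the raw aperture. In this one-month sample both series fall over time, so a plain regression can also absorb part of a genuine trend; a longer record, and possibly a lagged or non-linear model, would be needed before trusting the result.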