
I have 2 columns: R_Nom = actual value, R_Mes = measured value. I have:

  • Computed a linear fit
  • Computed % error
  • Computed std

I want to do a log-log plot with:

  • A scatter of the measured values (DONE)
  • The linear fit (DONE)
  • Error lines above and below the fitted line to indicate the area we expect the value to be in (TO DO)

This is my code so far:

#@title stack example 
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import matplotlib.pyplot as plt


import numpy as np
import pandas as pd

R_Nom=[100, 100, 100, 100, 500, 500, 500, 500, 500, 800, 800, 800, 800, 1200, 1200, 1200, 1200, 1200, 1200, 5500, 5500,5500,5500,5500]
R_Mes=[102, 101, 98, 97, 512, 517, 492, 490, 470, 820, 825, 812, 790, 1250, 1240, 1230, 1212, 1175, 1153, 5600,5574,5440,5390,5612]

df= pd.DataFrame({"R_Nom": R_Nom, "R_Mes": R_Mes})

 
 # finding the mean, max, min, std using groupby 
mean= df.groupby('R_Nom')['R_Mes'].mean()
max_ = df.groupby('R_Nom')['R_Mes'].max()
min_ = df.groupby('R_Nom')['R_Mes'].min()
std = df.groupby('R_Nom')['R_Mes'].std()

 # for error %, however, I have to include extra steps - can it be written as above?
df['Error_percentage'] = abs((df["R_Mes"]-df["R_Nom"])/df["R_Nom"])*100
grouped_df = df.groupby("R_Nom")
max_error = grouped_df['Error_percentage'].max()


 #output the data as a dataframe for plotting 
summary = pd.concat({"mean":mean, 
        "min": min_ ,
        "max": max_ ,
        "std": std, 
        "max_error %": max_error}, axis=1)

summary



 #fit the data


x=np.log10(df['R_Nom'])
y= np.log10(df['R_Mes'])


x_fit = np.linspace(min(x), max(x), 100)


model = np.polyfit(x, y, 1, full=True)
fig = make_subplots(specs=[[{"secondary_y": True}]])


fig.add_trace(
      go.Scatter(x=x_fit, y=x_fit*model[0][0]+model[0][1], name="Fit"),
                secondary_y=False 
  )
fig.add_trace(
      go.Scatter(x=x, y=y, name="Measured", mode="markers"),
                secondary_y=False 
  )

fig.add_trace(
      go.Scatter(x=np.log10(summary.index), y=summary['max_error %'], name="Error % "),
                secondary_y=True 
  )


fig.show()


I have manually added the lines that I mean: two lines that enclose the error area. They are not straight lines; they change depending on the % error or std value at each R_Nom value.

Error lines
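
One possible way to build such lines, sketched with NumPy only: take the standard deviation of log10(R_Mes) at each R_Nom level, interpolate it linearly along the log10(R_Nom) axis, and offset the fitted line by it. The band width factor `k = 2` and the choice of linear interpolation are assumptions for illustration, not a statistically validated recipe.

```python
import numpy as np

R_Nom = np.array([100] * 4 + [500] * 5 + [800] * 4 + [1200] * 6 + [5500] * 5, dtype=float)
R_Mes = np.array([102, 101, 98, 97, 512, 517, 492, 490, 470, 820, 825, 812, 790,
                  1250, 1240, 1230, 1212, 1175, 1153, 5600, 5574, 5440, 5390, 5612],
                 dtype=float)

# Work in log10 space, as in the plot above
x = np.log10(R_Nom)
y = np.log10(R_Mes)

# Straight-line fit in log-log space
slope, intercept = np.polyfit(x, y, 1)

# Sample standard deviation of log10(R_Mes) at each measured R_Nom level
levels = np.unique(x)
std_log = np.array([y[x == lv].std(ddof=1) for lv in levels])

# Evaluate the fit on a dense grid and interpolate the std between levels
x_fit = np.linspace(x.min(), x.max(), 100)
y_fit = slope * x_fit + intercept
std_fit = np.interp(x_fit, levels, std_log)

k = 2  # assumed band width in standard deviations
upper = y_fit + k * std_fit
lower = y_fit - k * std_fit

# With the plotly figure from the question, the band could be drawn as:
# fig.add_trace(go.Scatter(x=x_fit, y=upper, name="Upper bound",
#                          line=dict(dash="dash")), secondary_y=False)
# fig.add_trace(go.Scatter(x=x_fit, y=lower, name="Lower bound",
#                          line=dict(dash="dash"), fill="tonexty"), secondary_y=False)
```

Because the std is interpolated between the measured levels, the two bounds are not parallel to the fit; they widen or narrow depending on the scatter at the neighbouring R_Nom values.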

Secondly, are there other methods of estimating the error of my data that could be useful? R-squared, etc.?
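
As a rough illustration of two such measures, the sketch below computes R-squared and the residual standard error of the log-log fit with NumPy. The conversion of the residual standard error into an approximate percentage scatter is an assumption for illustration, and it treats the scatter as the same at every R_Nom level.

```python
import numpy as np

R_Nom = np.array([100] * 4 + [500] * 5 + [800] * 4 + [1200] * 6 + [5500] * 5, dtype=float)
R_Mes = np.array([102, 101, 98, 97, 512, 517, 492, 490, 470, 820, 825, 812, 790,
                  1250, 1240, 1230, 1212, 1175, 1153, 5600, 5574, 5440, 5390, 5612],
                 dtype=float)

x = np.log10(R_Nom)
y = np.log10(R_Mes)

# Straight-line fit and its predictions at the measured points
coeffs = np.polyfit(x, y, 1)
y_pred = np.polyval(coeffs, x)
residuals = y - y_pred

# Coefficient of determination: 1 - SS_res / SS_tot
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Residual standard error in log10 units (n - 2 degrees of freedom for a line)
rse = np.sqrt(ss_res / (len(x) - 2))

# Approximate relative scatter of R_Mes around the fit, in percent
rel_scatter_pct = (10 ** rse - 1) * 100

print(f"R^2 = {r_squared:.5f}")
print(f"residual std error (log10) = {rse:.5f}")
print(f"approx. relative scatter = {rel_scatter_pct:.2f} %")
```

R-squared will be close to 1 here simply because R_Nom spans almost two decades, so on its own it says little about the smallest detectable change. The residual standard error is probably the more useful number for that goal, since a change in R_Nom that shifts the reading by less than a couple of residual standard deviations would be hard to separate from noise.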

The end goal is to evaluate the minimum change in R_Nom that I can detect.

Mr. T
Leo
  • You mean like [this](https://stackoverflow.com/a/43069856/8881141) or [this](https://stackoverflow.com/a/13157955/8881141)? – Mr. T Oct 20 '20 at 18:30
  • Looks great, the only problem is that my error is defined only at the values where I measured it (e.g. 100, 500). How would I connect the points in between (e.g. 270)? Since the error and variance at 100 aren't the same as at 500, maybe I need to do a weighted average of err(100) and err(500)? – Leo Oct 21 '20 at 07:40
  • That is now a stats question. I am sure scipy has a function to generate an interpolation of the error bars for a smoother, nonlinear representation. I tagged `scipy` in; maybe someone else knows the best (i.e., correct) way to do this. – Mr. T Oct 21 '20 at 07:52

0 Answers