Take a look at Understanding Q-Q Plots for a concise description of what a QQ plot is. In your case, this particular part is important:
If both sets of quantiles came from the same distribution, we should
see the points forming a line that’s roughly straight.
This theoretical one-to-one relationship is illustrated explicitly in your plot using the red line.
And regarding your question...
that points effect for my model?
... one or both of the points that fall far from that red line could be considered outliers. This means that whatever model you've built here does not capture the properties of those two observations. If what we're looking at is a QQ plot of the residuals from a regression model, you should take a closer look at those two observations. What is it about these two that makes them stand out from the rest of your sample? One way to "catch" these outliers is often to represent them with one or two dummy variables.
Edit 1: Basic approach for outliers and dummy variables
Since you haven't explicitly tagged your question with `sklearn`, I'm taking the liberty of illustrating this using `statsmodels`. And in lieu of a sample of your data, I'll just use the built-in `iris` dataset, where the last part of the subset we'll use looks like this:

1. Linear regression of sepal_length on sepal_width
Plot 1:

Looks good! Nothing wrong here. But let's mix it up a bit by adding some extreme values to the dataset. You'll find a complete code snippet at the end.
2. Introduce an outlier
Now, let's add a row to the dataframe where `sepal_width = 8` instead of `3`.
This will give you the following qqplot with a very clear outlier:

And here's a part of the model summary:
===============================================================================
                  coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------
sepal_width     1.8690      0.033     57.246      0.000       1.804       1.934
===============================================================================
Omnibus:                       18.144   Durbin-Watson:                   0.427
Prob(Omnibus):                  0.000   Jarque-Bera (JB):                7.909
Skew:                          -0.338   Prob(JB):                       0.0192
Kurtosis:                       2.101   Cond. No.                         1.00
===============================================================================
So why is this an outlier? Because we messed with the dataset. The reason for the outliers in your dataset is impossible for me to determine. In our made-up example, the reasons for a setosa iris to have a sepal width of 8 could be many. Maybe the scientist labeled it wrong? Maybe it isn't a setosa at all? Or maybe it has been genetically modified? Now, instead of just discarding this observation from the sample, it's usually more informative to keep it where it is, accept that there is something special about this observation, and illustrate exactly that by including a dummy variable that is `1` for that observation and `0` for all others. Now the last part of your dataframe should look like this:

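If you'd rather not type the dummy in by hand, you can also derive it from a condition. A sketch (the threshold of 6 is arbitrary, chosen only because it cleanly separates the fabricated row from the real setosa measurements):

```python
import seaborn as sns

df = sns.load_dataset('iris')
df = df[df['species'] == 'setosa'].copy()

# fabricate one extreme row (sepal_width = 8)
df.loc[len(df)] = [5, 8, 1.4, 0.3, 'setosa']

# flag any row whose sepal_width exceeds the (arbitrary) threshold
df['outlier_dummy'] = (df['sepal_width'] > 6).astype(int)
print(df.tail(3))
```

This produces the same dummy column as writing the `0`s and `1` manually, and generalizes if you later want to flag several observations at once.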
3. Identify the outlier using a dummy variable
Now, your qqplot will look like this:

And here's your model summary:
=================================================================================
                    coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------
sepal_width       1.4512      0.015     94.613      0.000       1.420       1.482
outlier_dummy    -6.6097      0.394    -16.791      0.000      -7.401      -5.819
=================================================================================
Omnibus:                        1.917   Durbin-Watson:                   2.188
Prob(Omnibus):                  0.383   Jarque-Bera (JB):                1.066
Skew:                           0.218   Prob(JB):                        0.587
Kurtosis:                       3.558   Cond. No.                         27.0
=================================================================================
Notice that the inclusion of a dummy variable changes the coefficient estimate for `sepal_width`, and also the values for skewness and kurtosis. And that's the short version of the effects an outlier can have on your model.
Complete code:
import pandas as pd
import statsmodels.api as sm
from matplotlib import pyplot as plt
import seaborn as sns

# sample data
df = sns.load_dataset('iris')

# subset of sample data
df = df[df['species'] == 'setosa'].copy()

# add column for dummy variable
df['outlier_dummy'] = 0

# append a row with an extreme value for sepal_width,
# as well as a dummy variable = 1 for that row
df.loc[len(df)] = [5, 8, 1.4, 0.3, 'setosa', 1]

# define independent variables
# (note: no intercept term, matching the summaries above)
x = ['sepal_width', 'outlier_dummy']

# run regression and QQ-plot the residuals
mod_fit = sm.OLS(df['sepal_length'], df[x]).fit()
res = mod_fit.resid
fig = sm.qqplot(res)
plt.show()

print(mod_fit.summary())