11

I searched but could not find an answer regarding the seaborn library. I also checked the documentation for lmplot() and regplot(), but found nothing there either. Is it possible to extend and control the length of regression lines? By default, seaborn draws the regression line across the full extent of the x axis. Another option is the argument truncate=True, which limits the regression line to the extent of the data. Are there other options?

In my example, I want the lower regression line extended down to x=0, and the upper line extended until it intersects the lower one.

example

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

file = 'cobbles.csv'
df = pd.read_csv(file, sep=',')

sns.regplot(x='downward_temp', y='downward_heat', data=df, ci=None)
sns.regplot(x='upward_temp', y='upward_heat', data=df, ci=None, order=2)


plt.xlim([0,25])
plt.ylim([0,100])
plt.show()
Serenity
Kārlis Rieksts

4 Answers

6

Short answer: You just have to add plt.xlim(start,end) before your Seaborn plots.


I guess it might make more sense for Seaborn to automatically determine the length from the plot limits.

The same issue brought me here, and @Serenity's answer gave me the idea that something like xlims = ax.get_xlim() might help.

I may try to fix this and submit a change to Seaborn afterwards.

Claire
5

If you know your x limits prior to plotting, you can call set_xlim on the axes before calling regplot; seaborn will then extend the regression line and the CI over the range of xlim.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

file = 'cobbles.csv'
df = pd.read_csv(file, sep=',')

fig, ax = plt.subplots()

xlim = [0,25]
ax.set_xlim(xlim)

sns.regplot(x='downward_temp', y='downward_heat', data=df, ci=None, ax=ax)
sns.regplot(x='upward_temp', y='upward_heat', data=df, ci=None, order=2, ax=ax)

ax.set_ylim([0,100])
plt.show()
Jake Dunlea
3

You can use scipy.stats.linregress to compute the linear regression yourself; it gives the same least-squares line that seaborn draws by default. Then generate an x array covering the new x-axis limits of the canvas and plot the extended regression line over it. For details, look at the example:

import numpy as np; np.random.seed(8)

import seaborn as sns
import matplotlib.pyplot as plt
import scipy.stats

# test data
mean, cov = [4, 6], [(1.5, .7), (.7, 1)]
x, y = np.random.multivariate_normal(mean, cov, 80).T
ax = sns.regplot(x=x, y=y, color="g")

# extend the canvas
plt.xlim([0,20])
plt.ylim([0,15])

# calculate linear regression function
slope, intercept, r_value, p_value, std_err = \
 scipy.stats.linregress(x=x,y=y)

# plot the regression line on the extended canvas
xlims = ax.get_xlim()
new_x = np.linspace(xlims[0], xlims[1], 250)
ax.plot(new_x, intercept + slope * new_x, color='g', linestyle='-', lw=2.5)

plt.show()


Serenity
  • 1
    Thanks. This was also my idea for getting around the problem. Unfortunately, it boils down to a scatter plot coupled with a line, so there is almost no point in using seaborn in this case. And what if the fitted line is a power curve, a log function, or some other type of function? – Kārlis Rieksts Apr 11 '17 at 09:31
  • 1
    Look at numpy.polyfit or scipy.optimize.curve_fit. Here is a very good answer: http://stackoverflow.com/a/3433503/2666859 – Serenity Apr 11 '17 at 09:34
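
As a rough illustration of the numpy.polyfit route mentioned in the comment above, here is a minimal sketch of what the question asks for: a straight line extended down to x=0 and a second-order curve extended back to where it meets that line. The arrays are noise-free, made-up stand-ins for the columns of cobbles.csv (which is not available here), and the names x_low, x_up, x_cross and so on are hypothetical.

```python
import numpy as np

# Hypothetical, noise-free stand-in data (not the real cobbles.csv values).
x_low = np.array([5.0, 8.0, 11.0, 14.0, 17.0, 20.0])
y_low = 3.0 * x_low + 1.0                      # lower series: a straight line
x_up = np.array([6.0, 9.0, 12.0, 15.0, 18.0])
y_up = 0.1 * x_up**2 + 3.8 * x_up - 1.0        # upper series: a quadratic

# Least-squares fits; coefficients are ordered from highest degree down.
lin = np.polyfit(x_low, y_low, 1)
quad = np.polyfit(x_up, y_up, 2)

# The curves cross where (quadratic - line) has a real root.
roots = np.roots(np.polysub(quad, lin))
real_roots = roots[np.isreal(roots)].real

# Keep only a crossing between x=0 and the start of the upper data.
candidates = real_roots[(real_roots >= 0) & (real_roots <= x_up.min())]
if candidates.size == 0:
    raise ValueError("the fitted curves do not cross in the expected range")
x_cross = candidates.max()

# Evaluate each fit over its extended range.
x_line = np.linspace(0.0, x_low.max(), 100)    # line extended down to x=0
y_line = np.polyval(lin, x_line)
x_curve = np.linspace(x_cross, x_up.max(), 100)  # curve back to the crossing
y_curve = np.polyval(quad, x_curve)

print("curves cross near x =", x_cross)
```

The resulting arrays can be drawn with ax.plot(x_line, y_line) and ax.plot(x_curve, y_curve) on the same axes as the seaborn scatter plots (regplot with fit_reg=False draws only the points). With real, noisy data the crossing may not exist or may fall outside the expected range, which is why the sketch checks for that case.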
3

Setting truncate to False does the job. Adding this answer here in case it helps someone.
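
A minimal sketch of how this can look in practice, assuming a recent seaborn release in which truncate defaults to True, and combining it with the set-the-limits-first approach from the other answers. The data are synthetic and the column names temp and heat are hypothetical stand-ins for the columns in the question.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data; the real cobbles.csv is not available here.
rng = np.random.default_rng(0)
temp = rng.uniform(8, 20, 40)
df = pd.DataFrame({"temp": temp,
                   "heat": 3.0 * temp + 10 + rng.normal(0, 2, 40)})

fig, ax = plt.subplots()

# Fix the x limits first so regplot sees them when it builds the line.
ax.set_xlim(0, 25)

# truncate=False asks seaborn to span the axes limits, not just the data.
sns.regplot(x="temp", y="heat", data=df, ci=None, truncate=False, ax=ax)

# With ci=None the only Line2D on the axes is the regression line.
line_x = ax.lines[0].get_xdata()
print("regression line spans x =", line_x.min(), "to", line_x.max())
```

Calling plt.show() afterwards displays the figure. If the limits are set only after regplot, the line covers whatever limits matplotlib had chosen at drawing time, so the order of the two calls matters.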

Joyce John