1

I am analyzing the Iris dataset and made a scatterplot of petal width against petal length. To make the plot I used this code:

# First, we'll import pandas, a data processing and CSV file I/O library
import pandas as pd
# The current version of seaborn generates a bunch of warnings that we'll ignore
import warnings
warnings.filterwarnings("ignore")
# We'll also import seaborn, a Python graphing library
import seaborn as sns
import matplotlib.pyplot as plt
import numpy
sns.set(style="dark", color_codes=True)

# Next, we'll load the Iris flower dataset from Iris.csv in the working directory
iris = pd.read_csv("Iris.csv") # the iris dataset is now a Pandas DataFrame

# Let's see what's in the iris data - Jupyter notebooks print the result of the last thing you do
print(iris.head(10))

# Press shift+enter to execute this cell
sns.FacetGrid(iris, hue="Species", height=10) \
   .map(plt.scatter, "PetalLengthCm", "PetalWidthCm") \
   .add_legend()


Afterwards I added a regression line, but once it is plotted the species colors are no longer clearly visible. I tried changing the color of the regression line, but this didn't help. How can I plot the regression line without losing the colors of the different species?

The code to make the plot that includes a regression line is:

sns.FacetGrid(iris, hue="Species", height=10) \
   .map(plt.scatter, "PetalLengthCm", "PetalWidthCm") \
   .add_legend()
sns.regplot(x="PetalLengthCm", y="PetalWidthCm", data=iris)

petal_length_array = iris["PetalLengthCm"]
petal_width_array = iris["PetalWidthCm"]

r_petal = numpy.corrcoef(petal_length_array, petal_width_array) # compute the correlation matrix

print("Correlation is : " + str(r_petal[0][1]))


Diziet Asahi
Mark Schuurman

2 Answers

7

The problem is that sns.regplot() draws its own scatter points, all in one color, on top of the points that were colored by species.

To avoid this, call regplot(..., scatter=False) so that only the regression line is drawn and the individual data points are not plotted a second time. See the regplot documentation for the other options.
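
As a minimal sketch of that suggestion (not the asker's exact setup): the snippet below builds a small synthetic stand-in for Iris.csv, which is an assumption made so that it runs without the file, colors the points by species with sns.scatterplot, and then overlays a single regression line with scatter=False. The column names are copied from the question; the generated values and the black line color are illustrative only.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for Iris.csv (assumption: same column names as in the question)
rng = np.random.default_rng(0)
frames = []
for species, centre in [("Iris-setosa", 1.5), ("Iris-versicolor", 4.3), ("Iris-virginica", 5.5)]:
    length = rng.normal(centre, 0.3, 50)
    width = 0.4 * length - 0.35 + rng.normal(0, 0.1, 50)
    frames.append(pd.DataFrame({"PetalLengthCm": length,
                                "PetalWidthCm": width,
                                "Species": species}))
iris = pd.concat(frames, ignore_index=True)

fig, ax = plt.subplots(figsize=(8, 8))

# Points colored by species
sns.scatterplot(data=iris, x="PetalLengthCm", y="PetalWidthCm", hue="Species", ax=ax)

# One regression line over all the data; scatter=False keeps regplot from
# redrawing the points in a single color on top of the colored ones
sns.regplot(data=iris, x="PetalLengthCm", y="PetalWidthCm",
            scatter=False, color="black", ax=ax)

# Call plt.show() or fig.savefig(...) to view the result
```

Passing the same ax to both calls keeps the colored points and the line on one set of axes, so the order of the calls only affects which artist is drawn on top.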

Diziet Asahi
  • It doesn't work as intended. regplot only considers one type of data point. Can you please show the complete method, I might be missing something. – JaySabir Oct 09 '20 at 23:39
  • @jaysabir I don't understand your comment. Please post a new question with a full description of your problem, including code and data – Diziet Asahi Oct 10 '20 at 05:34
  • I want to plot a regplot with hue. Just like this question. – JaySabir Oct 11 '20 at 08:05
  • I agree with @DizietAsahi and to spare you time, check https://stackoverflow.com/questions/47407173/seaborn-regression-plot-with-different-colors/ – My Work Feb 24 '21 at 19:35
1

If you are happy to have multiple regression lines, you can split your data by species and over-plot one regression per species:

import matplotlib.pyplot as plt
import seaborn as sns

iris = sns.load_dataset("iris")

fig, ax = plt.subplots()
colors = ['darkorange', 'royalblue', '#555555']
markers = ['.', '+', 'x']

# Fit and draw one regression per species on the same axes
for i, value in enumerate(iris.species.unique()):
    ax = sns.regplot(x="petal_length", y="petal_width", ax=ax,
                     color=colors[i],
                     marker=markers[i],
                     data=iris[iris.species == value],
                     label=value)

ax.legend(loc='best')
display(fig)  # display() exists in Jupyter/IPython; use plt.show() in a plain script
plt.close('all')

Iris plot with separate regressions for species
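
A more compact variant of the same idea, offered as a sketch and not as part of the original answer: seaborn's figure-level sns.lmplot accepts a hue argument and fits one regression per hue level, so the explicit loop is not strictly needed. The data below is a synthetic stand-in (an assumption, so the snippet runs without downloading the dataset) that uses the column names of sns.load_dataset("iris").

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for sns.load_dataset("iris") (assumption: same column names)
rng = np.random.default_rng(1)
frames = []
for species, centre in [("setosa", 1.5), ("versicolor", 4.3), ("virginica", 5.5)]:
    length = rng.normal(centre, 0.3, 50)
    width = 0.4 * length - 0.35 + rng.normal(0, 0.1, 50)
    frames.append(pd.DataFrame({"petal_length": length,
                                "petal_width": width,
                                "species": species}))
iris = pd.concat(frames, ignore_index=True)

# lmplot draws the scatter points and a separate regression fit for each
# hue level, and adds the legend itself
g = sns.lmplot(data=iris, x="petal_length", y="petal_width",
               hue="species", height=5)
g.set_axis_labels("petal length", "petal width")
```

Unlike the loop above, lmplot creates its own figure, so it cannot be drawn onto an existing ax; the loop remains the better fit when per-species colors and markers have to be set by hand on axes you already have.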

Mark Graph