I am analyzing the Iris dataset and made a scatterplot of petal width against petal length. To make the plot I used this code:
# First, we'll import pandas, a data processing and CSV file I/O library
import pandas as pd
# We'll also import seaborn, a Python graphing library
import warnings # current version of seaborn generates a bunch of warnings that we'll ignore
warnings.filterwarnings("ignore")
import seaborn as sns
import matplotlib.pyplot as plt
import numpy
sns.set(style="dark", color_codes=True)
# Next, we'll load the Iris flower dataset from "Iris.csv" in the working directory
iris = pd.read_csv("Iris.csv") # the iris dataset is now a Pandas DataFrame
# Let's see what's in the iris data - Jupyter notebooks print the result of the last thing you do
print(iris.head(10))
# Press shift+enter to execute this cell
sns.FacetGrid(iris, hue="Species", height=10) \
.map(plt.scatter, "PetalLengthCm", "PetalWidthCm") \
.add_legend()
Afterwards I added a regression line, but once it is drawn the colors of the species are hard to see. I tried changing the color of the regression line, but this didn't help. How can I plot the regression line without losing the colors of the different species?
The code to make the plot that includes a regression line is:
sns.FacetGrid(iris, hue="Species", height=10) \
.map(plt.scatter, "PetalLengthCm", "PetalWidthCm") \
.add_legend()
sns.regplot(x="PetalLengthCm", y="PetalWidthCm", data=iris)
petal_length_array = iris["PetalLengthCm"]
petal_width_array = iris["PetalWidthCm"]
r_petal = numpy.corrcoef(petal_length_array, petal_width_array) # compute the correlation
print("Correlation is : " + str(r_petal[0][1]))
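One possible direction, offered as a minimal sketch rather than a definitive fix: by default `sns.regplot` draws its own single-color scatter points on the current axes, which would explain why the species colors get covered. Passing `scatter=False` should leave only the fitted line and its confidence band. The helper name `scatter_with_overall_fit`, the black line color, and the figure size are illustrative assumptions, and `sns.scatterplot` assumes seaborn 0.9 or later.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def scatter_with_overall_fit(df, x="PetalLengthCm", y="PetalWidthCm", hue="Species"):
    """Draw points colored by `hue`, plus one regression line fitted to all rows.

    Hypothetical helper for illustration; column names default to those in Iris.csv.
    """
    fig, ax = plt.subplots(figsize=(10, 10))
    # Draw the points first, one color per species.
    sns.scatterplot(data=df, x=x, y=y, hue=hue, ax=ax)
    # scatter=False draws only the fitted line and its confidence band,
    # so regplot does not paint single-color points over the species colors.
    sns.regplot(data=df, x=x, y=y, scatter=False, color="black", ax=ax)
    return ax
```

With the DataFrame from the question, this would be called as `scatter_with_overall_fit(iris)` followed by `plt.show()`.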