I have a data set that contains 440 observations and is organized into three columns:
- Column 1 is called Simulation; it holds the name of the simulation used to calculate some indicators (indiv, ssm, bma, or real). Its dtype is object.
- Column 2 is called Scores and contains the value assigned by each simulation to each observation of the data set; the scores go from 1 to 4. Its dtype is float64.
- Column 3 is called Ranking and shows how the observations are ranked according to their scores. Its dtype is object.
I am trying to combine a histogram of the whole population with the KDE plot of the real simulation. So far this is my code:
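For anyone who wants to reproduce the plots without my Excel file, a stand-in frame with the same three columns could be built roughly as below. This is only a sketch: the scores are random placeholders, the `low`/`medium`/`high` ranking labels are hypothetical, the name `demo` is made up, and I am assuming 440 is the number of rows, split evenly across the four simulations.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
simulations = ['indiv', 'ssm', 'bma', 'real']
n_per_sim = 110  # assumption: 4 simulations x 110 rows = 440 rows

# Placeholder data with the same column names as the real file
demo = pd.DataFrame({
    'Simulation': np.repeat(simulations, n_per_sim),
    'Scores': rng.uniform(1, 4, size=n_per_sim * len(simulations)),
})

# Hypothetical ranking labels derived from the score, stored as plain objects
demo['Ranking'] = pd.cut(
    demo['Scores'],
    bins=[1, 2, 3, 4],
    labels=['low', 'medium', 'high'],
    include_lowest=True,
).astype(object)

print(demo.shape)
print(demo.dtypes)
```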
import seaborn as sns
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
sns.__version__
ors=pd.read_excel(r'C:\Data\Book1.xlsx')  # raw string so the backslashes are taken literally
ors.shape
indiv=ors[ors.Simulation=='indiv']  # label must match the value stored in the column
subset1= ors[ors['Simulation'] == 'indiv']
ssm=ors[ors.Simulation=='ssm']
subset2= ors[ors['Simulation'] == 'ssm']
bma=ors[ors.Simulation=='bma']
subset3= ors[ors['Simulation'] == 'bma']
real=ors[ors.Simulation=='real']
subset4= ors[ors['Simulation'] == 'real']
sns.set_style('white')
sns.displot(x='Scores', data=ors)
This gives the histogram of the whole population. I then run the following code to check the KDE of each simulation:
sns.displot(x='Scores', data=ors, kind='kde', hue='Simulation')
This produces the following graph:
Now I am trying to overlay the red KDE (the real simulation) on the histogram of the whole population. I used the following command, although I am not sure this is the correct way to combine these graphs:
sns.displot(x='Scores', data=ors, hist= True, hist= False, subset4['Scores'], hist = False, kde = True,
kde_kws = {'linewidth': 3})
But I get this error:
File "<ipython-input-36-dc49ac2c4ff6>", line 1
sns.displot(x='Scores', data=ors, hist= True, hist= False, subset4['Scores'], hist = False, kde = True,
^
SyntaxError: keyword argument repeated
Many thanks, Kind regards, Iván