I have a DataFrame in which each row represents a cell. The first column gives its cell type, the second column says whether that cell type is common or rare, and the last column gives the corresponding patient's name. [Image: overview of the DataFrame]
I would like to draw a stacked histogram showing the proportion of cell types in each patient, using the DataFrame described above. Here is my toy code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from random import choice
celltype_name = [choice(['Myeloid','B','T','NK','CAF','pDC']) for i in range(100)]
celltype2_name = [choice(['rare','common']) for i in range(100)]
patient_name = [choice(['P01','P02','P03','P04']) for i in range(100)]
df = pd.DataFrame([celltype_name, celltype2_name, patient_name]).T
df.columns = ['celltype', 'celltype2', 'patient_name']
df['patient_name'] = pd.Categorical(df['patient_name'], ['P01','P02','P03','P04'])
ax = sns.histplot(
    data=df,
    x="patient_name", hue="celltype",
    multiple="fill", stat="proportion",
    discrete=True, shrink=.8, legend=True,
    linewidth=0.2,
)
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1))
plt.show()
And I get the output figure:
However, I would like two bars for each patient in the plot: the first showing the proportions of cell types among the common cells and the second showing the proportions among the rare cells.
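One possible way to get that layout is sketched below. It is only a minimal sketch under some assumptions: it does not use `sns.histplot`, but computes the proportions with `pd.crosstab(..., normalize='index')` and draws the stacked bars directly with matplotlib, placing the `common` and `rare` bars side by side for each patient. The fixed random seed, the variable names (`props`, `full_index`, `width`) and the `tab10` colors are illustrative choices, not part of the original code.

```python
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

random.seed(0)  # assumption: fixed seed so the toy data is reproducible

celltypes = ['Myeloid', 'B', 'T', 'NK', 'CAF', 'pDC']
groups = ['common', 'rare']
patients = ['P01', 'P02', 'P03', 'P04']

df = pd.DataFrame({
    'celltype': [random.choice(celltypes) for _ in range(100)],
    'celltype2': [random.choice(groups) for _ in range(100)],
    'patient_name': [random.choice(patients) for _ in range(100)],
})

# Proportion of each cell type within every (patient, common/rare) pair.
props = pd.crosstab(
    [df['patient_name'], df['celltype2']], df['celltype'], normalize='index'
)

# Make sure every (patient, group) pair and every cell type is present,
# in a fixed order; missing combinations become rows/columns of zeros.
full_index = pd.MultiIndex.from_product(
    [patients, groups], names=['patient_name', 'celltype2']
)
props = props.reindex(index=full_index, columns=celltypes, fill_value=0.0)

fig, ax = plt.subplots(figsize=(8, 4))
x = np.arange(len(patients))   # one slot per patient
width = 0.4                    # width reserved for each of the two bars
colors = plt.get_cmap('tab10').colors

tick_positions, tick_labels = [], []
for g, group in enumerate(groups):
    offset = (g - 0.5) * width            # common on the left, rare on the right
    bottom = np.zeros(len(patients))      # running top of each stack
    group_props = props.xs(group, level='celltype2')
    for c, celltype in enumerate(celltypes):
        heights = group_props[celltype].to_numpy()
        ax.bar(
            x + offset, heights, width * 0.9, bottom=bottom,
            color=colors[c], linewidth=0.2, edgecolor='black',
            # label each cell type only once so the legend has no duplicates
            label=celltype if g == 0 else '_nolegend_',
        )
        bottom += heights
    tick_positions.extend(x + offset)
    tick_labels.extend(f'{p}\n{group}' for p in patients)

ax.set_xticks(tick_positions)
ax.set_xticklabels(tick_labels)
ax.set_ylabel('Proportion')
ax.legend(title='celltype', loc='upper left', bbox_to_anchor=(1, 1))
fig.tight_layout()
plt.show()
```

Each bar is normalized separately, so every non-empty `common` or `rare` bar stacks up to 1. If the bars should instead show each group's share of the patient's total cells, the normalization step would need to change accordingly.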