I am trying to parse a CSV file with code like the following (I have updated the code at the bottom; please have a look):
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import cm
cmap = cm.get_cmap('tab20')
pd.set_option('display.max_columns', 4)
# pd.set_option('display.max_rows', None)
dataset = pd.read_csv("mwe.csv")
dsc = dataset.columns
print(dataset.value_counts(subset=[dsc[1]]).reset_index(name='count'))
print(dataset.value_counts(subset=[dsc[1], dsc[0]]).reset_index(name='count'))
A sample mwe.csv is:
Quartile,KeyArea
Q2,Earth
Q1,Earth
Q1,Fire
,Fire
Q3,Fire
Q1,Space
Q3,Space
Q1,Space
Q4,Space
Q2,Space
,Space
Q2,Air
Q1,Air
Q1,Air
Q1,Air
,Air
Q2,Water
Q2,Water
Q1,Water
The output of the code with the data is:
#Total
KeyArea count
0 Air 5
1 Space 5
2 Fire 3
3 Water 3
4 Earth 2
#Keyarea split by Q1..Q4.
KeyArea Quartile count
0 Air Q1 3
1 Space Q1 2
2 Water Q2 2
3 Air Q2 1
4 Earth Q1 1
5 Earth Q2 1
6 Fire Q1 1
7 Fire Q3 1
8 Space Q2 1
9 Space Q3 1
10 Space Q4 1
11 Water Q1 1
Now, I am trying to get a stacked bar plot with something like:
ax.bar(labels, q1, label='Q1')
ax.bar(labels, q2, bottom=q1, label='Q2')
ax.bar(labels, extra, bottom=q1 + q2, label='Others', color='C3')
i.e. for each KeyArea, the stack should be: the count in Q1, then the count in Q2, then the remainder (the total for that KeyArea minus its Q1 and Q2 counts).
How can I do that?
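To make the three layers concrete, here is a minimal sketch of how q1, q2 and extra could be computed from the sample data above. It assumes the column names Quartile and KeyArea from the sample file and uses pd.crosstab, which is just one possible way to get the per-quartile counts; the helper name draw_stacked is made up for illustration.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
csv_text = """Quartile,KeyArea
Q2,Earth
Q1,Earth
Q1,Fire
,Fire
Q3,Fire
Q1,Space
Q3,Space
Q1,Space
Q4,Space
Q2,Space
,Space
Q2,Air
Q1,Air
Q1,Air
Q1,Air
,Air
Q2,Water
Q2,Water
Q1,Water
"""
dataset = pd.read_csv(io.StringIO(csv_text))

# Total rows per KeyArea, including rows whose Quartile is blank.
total = dataset["KeyArea"].value_counts().sort_index()

# Counts per (KeyArea, Quartile); rows with a blank Quartile are not counted here.
by_q = pd.crosstab(dataset["KeyArea"], dataset["Quartile"])
by_q = by_q.reindex(total.index, fill_value=0)

labels = total.index.tolist()
q1 = by_q["Q1"].to_numpy()
q2 = by_q["Q2"].to_numpy()
# Everything that is neither Q1 nor Q2 (Q3, Q4 and blank quartiles).
extra = total.to_numpy() - q1 - q2


def draw_stacked(ax):
    """Hypothetical helper: draw the three layers on an existing Axes."""
    ax.bar(labels, q1, label="Q1")
    ax.bar(labels, q2, bottom=q1, label="Q2")
    ax.bar(labels, extra, bottom=q1 + q2, label="Others", color="C3")
    ax.legend()
```

With matplotlib imported as in the code above, something like fig, ax = plt.subplots(); draw_stacked(ax); plt.show() should then display the plot.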
Update:
I have updated the code with:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import cm
cmap = cm.get_cmap('tab20')
pd.set_option('display.max_columns', 4)
# pd.set_option('display.max_rows', None)
dataset = pd.read_csv("mwe.csv")
dsc = dataset.columns
dataset[dsc[1]].fillna("None", inplace=True)
print(dict(dataset.value_counts(subset=[dsc[0], dsc[1]]).sort_index()))
which gives the output:
{('Air', 'Q1'): 3, ('Air', 'Q2'): 1, ('Earth', 'Q1'): 1, ('Earth', 'Q2'): 1, ('Fire', ' Q1'): 1, ('Fire', 'Q3'): 1, ('Space', 'Q1'): 2, ('Space', 'Q2'): 1, ('Space', 'Q3'): 1, ('Space', 'Q4'): 1, ('Water', 'Q1'): 1, ('Water', 'Q2'): 2}
Now, I have to create a stack, say for Space. I need the first layer for Q1, i.e. ('Space', 'Q1'): 2, the second for ('Space', 'Q2'), and the third for None + Q3 + Q4 combined.
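One thing that may matter here: in the sample file as shown, the blank values are in the Quartile column, which is dsc[0], so filling dsc[1] would leave them untouched and value_counts would drop those rows. A possible sketch of the grouping described above, assuming the column names from the sample file (the Group column and the "Others" label are made up for illustration):

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
csv_text = """Quartile,KeyArea
Q2,Earth
Q1,Earth
Q1,Fire
,Fire
Q3,Fire
Q1,Space
Q3,Space
Q1,Space
Q4,Space
Q2,Space
,Space
Q2,Air
Q1,Air
Q1,Air
Q1,Air
,Air
Q2,Water
Q2,Water
Q1,Water
"""
dataset = pd.read_csv(io.StringIO(csv_text))

# Fill the blanks in the Quartile column so those rows are kept.
quartile = dataset["Quartile"].fillna("None")

# Keep Q1 and Q2 as they are; collapse everything else (None, Q3, Q4) into "Others".
group = quartile.where(quartile.isin(["Q1", "Q2"]), "Others")

counts = (
    dataset.assign(Group=group)
    .value_counts(subset=["KeyArea", "Group"])
    .unstack(fill_value=0)  # one row per KeyArea, one column per group
    .reindex(columns=["Q1", "Q2", "Others"], fill_value=0)
)

# The three layers for a single key area, e.g. Space.
space = counts.loc["Space"]
```

Since counts has one row per KeyArea and one column per layer, counts.plot(kind="bar", stacked=True) would be one way to draw all the stacks at once.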
Kindly help
NB: This is a restatement of the question "create stackbar from two different pandas output", as I now have an MWE with code and a dataset. I will close the earlier question.