Suppose I have a df like this:
  stringOfInterest trend
0                C    up
1                D  down
2                E  down
3              C,O    up
4              C,P    up
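For reproducibility, the example frame above can be rebuilt as follows. This is only a sketch of the five rows shown; the real data set is assumed to be larger.

```python
import pandas as pd

# The five example rows from the table above.
df = pd.DataFrame({
    "stringOfInterest": ["C", "D", "E", "C,O", "C,P"],
    "trend": ["up", "down", "down", "up", "up"],
})
print(df)
```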
I want to plot this df as a bar graph using pandas. To obtain properly grouped bar plots, I would like to group the data by the column df["trend"] and then count the occurrences of each letter in df["stringOfInterest"].
As can be seen, some of these strings contain multiple letters separated by a ",".
Using
df.groupby("trend").stringOfInterest.value_counts().unstack(0)
produces the expected result (the output below comes from a larger data set than the five rows shown above):
trend             down    up
stringOfInterest
-                  7.0   8.0
C                  3.0  11.0
C,O                NaN   2.0
C,P                1.0   1.0
D                  1.0   2.0
E                 15.0  14.0
E,T                1.0   NaN
However, I would like to count the occurrences of the individual characters (C, E, D, ...).
On the original df this can be achieved like this:
s = df.stringOfInterest.str.split(",", expand = True).stack()
s.value_counts()
This produces output like this:
C 3
E 2
D 1
O 1
P 1
T 1
Unfortunately, this cannot be used here after the groupby() in combination with unstack().
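One possible way to stay on the split/stack track is to attach the trend to each letter before grouping, rather than splitting after the groupby(). The sketch below assumes the five example rows; the column name "letter" and the variable "tidy" are made up for illustration.

```python
import pandas as pd

# Example frame (five rows only).
df = pd.DataFrame({
    "stringOfInterest": ["C", "D", "E", "C,O", "C,P"],
    "trend": ["up", "down", "down", "up", "up"],
})

# Split/stack as before: one entry per letter, indexed by (row, position).
# dropna() makes sure the padding from expand=True is gone on every pandas version.
s = df["stringOfInterest"].str.split(",", expand=True).stack().dropna()

# Drop the position level so the index is the original row label again.
s.index = s.index.droplevel(1)

# Re-attach the trend of the originating row to every letter.
# "letter" is a hypothetical column name chosen for this sketch.
tidy = s.rename("letter").to_frame().join(df["trend"])

# Now the original groupby/value_counts/unstack chain works per letter.
counts = (
    tidy.groupby("trend")["letter"]
    .value_counts()
    .unstack(0)
    .fillna(0)      # letters that never occur with a trend get 0 instead of NaN
    .astype(int)
)
print(counts)
```

The fillna(0) step is optional; without it the missing combinations stay NaN, as in the output of the original chain.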
Maybe I am on the wrong track and a more elegant way would be preferable.
To clarify the plotting: for each letter (stringOfInterest), there must be two bars indicating the number of "up" and "down" trends.
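A possibly more compact route to that two-bars-per-letter layout is sketched below. It assumes pandas 0.25 or newer (for DataFrame.explode) and uses pd.crosstab to build the letter-by-trend table; the variable names "letters" and "counts" are illustrative, and the plotting call is left commented out because it needs matplotlib.

```python
import pandas as pd

# Example frame (five rows only).
df = pd.DataFrame({
    "stringOfInterest": ["C", "D", "E", "C,O", "C,P"],
    "trend": ["up", "down", "down", "up", "up"],
})

# Turn each comma-separated string into a list, then give every list
# element its own row; "trend" is repeated for each exploded letter.
letters = (
    df.assign(stringOfInterest=df["stringOfInterest"].str.split(","))
    .explode("stringOfInterest")
    .reset_index(drop=True)  # avoid duplicate index labels after explode
)

# Rows are letters, columns are trends, missing combinations become 0.
counts = pd.crosstab(letters["stringOfInterest"], letters["trend"])
print(counts)

# Grouped bar chart: one group per letter, one bar per trend.
# counts.plot.bar(rot=0)
```

On the five example rows, C ends up with 3 under "up" and 0 under "down"; every other letter has a single count.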