I have data that looks like this:

   topic  positive  negative     type
0     88  0.080000  0.030000   source
1     36  0.010000  0.200000   source
2    101  0.350000  0.040000   source
3     78  0.110000  0.090000   source
4     99  0.110000  0.010000   source
5     79  0.000000  0.050000   source
6     24  0.000000  0.160000   source
7     17  0.000000  0.410000   source
8     14  0.090000  0.050000   source
9     29  0.060000  0.030000   source
0     14  0.207071  0.085859  summary
1     17  0.000000  0.738889  summary
2     24  0.000000  0.219349  summary
3     29  0.000000  0.094907  summary
4     36  0.000000  0.255808  summary
5     78  0.108333  0.194444  summary
6     79  0.000000  0.106443  summary
7     88  0.089286  0.041667  summary
8     99  0.098496  0.050877  summary
9    101  0.444444  0.055556  summary
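
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame above. The name `polar_data` matches the snippets further down; building it from two smaller frames (`source`, `summary`) is only a convenience and an assumption about how the data might be assembled.

```python
import pandas as pd

# Values copied from the table above; "type" is broadcast to every row.
source = pd.DataFrame({
    "topic": [88, 36, 101, 78, 99, 79, 24, 17, 14, 29],
    "positive": [0.08, 0.01, 0.35, 0.11, 0.11, 0.00, 0.00, 0.00, 0.09, 0.06],
    "negative": [0.03, 0.20, 0.04, 0.09, 0.01, 0.05, 0.16, 0.41, 0.05, 0.03],
    "type": "source",
})
summary = pd.DataFrame({
    "topic": [14, 17, 24, 29, 36, 78, 79, 88, 99, 101],
    "positive": [0.207071, 0.0, 0.0, 0.0, 0.0,
                 0.108333, 0.0, 0.089286, 0.098496, 0.444444],
    "negative": [0.085859, 0.738889, 0.219349, 0.094907, 0.255808,
                 0.194444, 0.106443, 0.041667, 0.050877, 0.055556],
    "type": "summary",
})

# concat keeps each frame's own 0..9 index, as in the printed table.
polar_data = pd.concat([source, summary])
```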

I need to draw a bar plot that compares the positive/negative values of the two types for each topic. I picture it as a stacked (positive/negative) bar plot with topic on the x axis and the bars grouped by the type column. But I could not find a way to build a bar plot that is both grouped and stacked.

For a single type it looks like this (sorry, I don't have enough reputation to post images):

polar_data.set_index(['type', 'topic']).xs('summary').plot(kind='bar', stacked=True)

The only way I have found to compare the two types is to place two plots side by side using seaborn.factorplot, which makes the trends hard to see. I also don't know how to build a stacked bar plot with seaborn.

print_data = pd.melt(polar_data, id_vars=['topic', 'type'], value_name='percent', var_name='polarity')
# factorplot was renamed to catplot in seaborn 0.9
sns.catplot(x='topic', y='percent', hue='polarity', row='type', data=print_data, margin_titles=True, kind='bar')

So is there a way to "merge" them instead of placing them side by side?

AAzza
  • I'm confused a bit here. How do you intend to compare positive and negative values? – Zero Apr 21 '15 at 11:12
  • @JohnGalt Let me explain. I want a plot like the one [here](http://www.mathworks.com/matlabcentral/fileexchange/screenshots/5734/original.jpg). Negative/positive values are stacked on top of each other, and there are two such bars for each topic (one for the 'summary' type and one for the 'source' type), placed next to each other. So there is one group for each topic. – AAzza Apr 21 '15 at 11:20

1 Answer

Here's one way to do it using matplotlib. I guess seaborn would use the same structure.

In [3]: polar_data.pivot(index='topic', columns='type')
Out[3]:
       positive            negative
type     source   summary    source   summary
topic
14         0.09  0.207071      0.05  0.085859
17         0.00  0.000000      0.41  0.738889
24         0.00  0.000000      0.16  0.219349
29         0.06  0.000000      0.03  0.094907
36         0.01  0.000000      0.20  0.255808
78         0.11  0.108333      0.09  0.194444
79         0.00  0.000000      0.05  0.106443
88         0.08  0.089286      0.03  0.041667
99         0.11  0.098496      0.01  0.050877
101        0.35  0.444444      0.04  0.055556

So now, for positive values, you could do:

polar_data.pivot(index='topic', columns='type')['positive'].plot(kind='bar', stacked=True)

For negative values, you could do:

polar_data.pivot(index='topic', columns='type')['negative'].plot(kind='bar', stacked=True)

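
The two plots above each stack the types within one polarity. For the layout described in the question (one positive/negative stack per type, with the two stacks grouped side by side for each topic), a rough sketch with plain matplotlib might look like the following. The helper name `plot_grouped_stacked`, the colors, the hatching and the three-topic `demo` frame (values taken from the question's table) are illustrative assumptions, not something built into pandas or seaborn.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_grouped_stacked(df, width=0.4):
    """Draw one positive/negative stack per type, grouped by topic."""
    # Wide layout: rows are topics, columns are (polarity, type) pairs.
    wide = df.pivot(index="topic", columns="type",
                    values=["positive", "negative"])
    types = list(wide["positive"].columns)
    x = np.arange(len(wide.index))
    hatches = ["", "//"]  # hypothetical styling to tell the types apart

    fig, ax = plt.subplots()
    for i, kind in enumerate(types):
        # Shift each type's bar so the group is centred on the topic tick.
        offset = (i - (len(types) - 1) / 2) * width
        pos = wide[("positive", kind)].to_numpy()
        neg = wide[("negative", kind)].to_numpy()
        hatch = hatches[i % len(hatches)]
        ax.bar(x + offset, pos, width, color="tab:green",
               edgecolor="black", hatch=hatch, label=f"{kind} positive")
        # bottom=pos stacks the negative share on top of the positive one.
        ax.bar(x + offset, neg, width, bottom=pos, color="tab:red",
               edgecolor="black", hatch=hatch, label=f"{kind} negative")

    ax.set_xticks(x)
    ax.set_xticklabels(wide.index.astype(str))
    ax.set_xlabel("topic")
    ax.legend()
    return ax


# Three topics from the question's data, in long format.
demo = pd.DataFrame({
    "topic": [14, 17, 24, 14, 17, 24],
    "positive": [0.09, 0.00, 0.00, 0.207071, 0.000000, 0.000000],
    "negative": [0.05, 0.41, 0.16, 0.085859, 0.738889, 0.219349],
    "type": ["source"] * 3 + ["summary"] * 3,
})
ax = plot_grouped_stacked(demo)
```

Calling `plt.show()` afterwards displays the figure. Passing the full `polar_data` frame instead of `demo` should give one pair of stacks for each of the ten topics.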

Zero