NOTE: Solution Posted Below!!!

I have a time-indexed table with a column BLIP that has only two values, "XX" and "YY". The goal is to show a bar chart of the counts of "XX" and "YY", with the "YY" bars below the x axis. I'm trying to create the correct data structure from a pandas table using code from Wes McKinney's book on data analysis (p. 26, I think):

df = base_df.drop(columns=dropcols).set_index('Created')
group = ['f2','BLIP']
df0 = df_minus.groupby(group)
agg_counts = df0.size().unstack().fillna(0)
indexer = agg_counts.sum(1).argsort()
count_subset = agg_counts.take(indexer).copy()
table = count_subset.groupby('BLIP').resample('MS').count().unstack('BLIP')['BLIP']
chart = table.plot.bar(title = chart_title, x=None, color = ['green', 'red', 'grey']);

The line

agg_counts = df0.size().unstack().fillna(0) 

results in the following error:

TypeError: 'numpy.int32' object is not callable

I found this gem of a snippet, but I can't find the documentation I need to decipher it.

data['values'].plot(kind='bar', color=data.positive.map({True: 'g', False: 'r'}))
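
To unpack that snippet: `Series.map` with a dict replaces each value with its dictionary entry, so `data.positive.map({True: 'g', False: 'r'})` produces one color per row. The same idea should work on the BLIP labels directly, with no intermediate True/False column. A minimal sketch, assuming a small made-up frame (the `counts` column and the dates are illustrative, not from the original data):

```python
import pandas as pd

# Hypothetical sample data: one row per period, labelled "XX" or "YY".
df = pd.DataFrame(
    {"BLIP": ["XX", "YY", "XX", "YY"], "counts": [3, -2, 5, -1]},
    index=pd.to_datetime(["2018-01-01", "2018-02-01", "2018-03-01", "2018-04-01"]),
)

# Map each label straight to a color; no boolean helper column is needed.
color_by_label = {"XX": "g", "YY": "r"}
colors = df["BLIP"].map(color_by_label)

print(colors.tolist())  # ['g', 'r', 'g', 'r']
```

The resulting list can then be handed to a bar plot, for example as `color=colors.tolist()`.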

This seems like it should be very simple, but I'm quite wrapped around the axle on this.

Target Image

The pandas table format is something like

create_date f1 f2 f3 BLIP f5...
dt_stamp    X  Y  Z  XX   K1
dt_stamp    S  R  Y  YY   K3
dt_stamp    P  P  T  XX   K1

and so on.

Per Jesse's suggestion I tried

df_plus =df[df['BLIP']=='XX']
df_minus=df[df['BLIP']=='YY']

ax = plt.axes()
ax.bar(df_plus.index, df_plus['BLIP'], width=0.4, color='g')
ax.bar(df_minus.index, df_minus['BLIP'], width=0.4, color='r')
ax.autoscale()
plt.show()

This resulted in

ValueError: shape mismatch: objects cannot be broadcast to a single shape

Solution in its entirety

import matplotlib.pyplot as plt

# Assumes df has a DatetimeIndex (the question used set_index('Created')); resample needs one
df = base_df
width = 8
height = 6
chart_title = 'YTD CR Trend Summary'
df_plus = df[df['BLIP'] == 'XX']
df_minus = df[df['BLIP'] == 'YY']
# Monthly counts; the YY counts are negated so their bars fall below the x axis
p = df_plus.resample('MS').count()['BLIP'].fillna(0)
n = -df_minus.resample('MS').count()['BLIP']
print(chart_title, len(df), p.sum(), n.sum())
plt.clf()
fig = plt.figure()
fig.set_size_inches(width, height)
# ax = fig.add_subplot(1, 1, 1)
ax = plt.axes(label=chart_title)  # label suppresses warning
# Only draw a series if it has at least one row
if p.sum() != 0:
    ax.bar(p.index, p, width=10, color='g')
if n.sum() != 0:
    ax.bar(n.index, n, width=10, color='r')
plt.suptitle(chart_title, fontsize=11)
filename = f'{graph_images_dir}{chart_title}.png'
print(f'Saving {filename}')
plt.savefig(filename, bbox_inches='tight', pad_inches=0.5, dpi=200)
plt.show()
Harvey
  • I would like to know if I can use the data.positive.map to directly map to the "XX" values in BLIP or if I have to create a new field with True and False values. – Harvey Jul 25 '18 at 21:04
  • It seems as if I have to go from this 3NF dataframe to some sort of object that gives me BLIP.value_counts by month, and then I somehow need to map those counts to the plot properly. – Harvey Jul 25 '18 at 21:20

1 Answer

You can plot it manually using matplotlib:

import matplotlib.pyplot as plt

ax = plt.axes()
ax.bar(table.index, table['XX'], width=0.4, color='g')
ax.bar(table.index, table['YY'], width=0.4, color='r')
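
The snippet above assumes `table` already has one column per BLIP value, indexed by period. One way to build such a table is sketched below, assuming a DatetimeIndex and month-start bins as in the question. The sample data and the helper name `monthly_blip_counts` are made up for illustration. The YY counts are negated so that those bars fall below the x axis.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def monthly_blip_counts(df):
    """Count rows per month and BLIP value, one column per BLIP value."""
    return (
        df.groupby([pd.Grouper(freq="MS"), "BLIP"])  # Grouper bins the DatetimeIndex by month
        .size()
        .unstack("BLIP", fill_value=0)
    )


# Hypothetical sample data with a DatetimeIndex.
df = pd.DataFrame(
    {"BLIP": ["XX", "XX", "YY", "XX", "YY", "YY"]},
    index=pd.to_datetime(
        ["2018-01-03", "2018-01-20", "2018-01-25",
         "2018-02-02", "2018-02-10", "2018-02-11"]
    ),
)

table = monthly_blip_counts(df)
table["YY"] = -table["YY"]  # flip the sign so YY bars point downward

fig, ax = plt.subplots()
ax.bar(table.index, table["XX"], width=10, color="g")
ax.bar(table.index, table["YY"], width=10, color="r")
ax.axhline(0, color="k", linewidth=0.8)  # draw the x axis at zero
```

Because both columns share `table.index`, the x positions and bar heights always have the same length, which avoids the shape mismatch discussed in the comments.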
Jesse Bakker
  • I divided the table into two dataframes df_pos and df_neg, then tried ax.bar(table.index, df_pos['BLIP'], width=0.4, color='g') and got this error ValueError: shape mismatch: objects cannot be broadcast to a single shape – Harvey Jul 25 '18 at 21:13
  • That means your `table.index` does not have the same length as `df_pos['BLIP']`. You can use `df_pos.index` instead – Jesse Bakker Jul 25 '18 at 22:43
  • Solution posted above. Jesse's suggestion is correct, but table['XX'] should be table['BLIP']. Warning: do not code when sleep deprived!! – Harvey Jul 26 '18 at 18:49
  • Related Comments https://stackoverflow.com/questions/22311139/matplotlib-bar-chart-choose-color-if-value-is-positive-vs-value-is-negative and https://stackoverflow.com/questions/33476401/color-matplotlib-bar-chart-based-on-value – Harvey Jul 27 '18 at 17:37