I have the following data frame:
    strterminationreason                              total_trials    %Trials
 0  Completed, Negative outcome/primary endpoint(s...         3130   6.390624
 1  Completed, Outcome indeterminate                          3488   7.121565
 2  Completed, Outcome unknown                                6483  13.236555
 3  Completed, Positive outcome/primary endpoint(s...        15036  30.699498
 4  Terminated, Business decision - Drug strategy ...          526   1.073952
 5  Terminated, Business decision - Other                     1340   2.735922
 6  Terminated, Business decision - Pipeline repri...         1891   3.860917
 7  Terminated, Early positive outcome                         231   0.471640
 8  Terminated, Lack of efficacy                              1621   3.309649
 9  Terminated, Lack of funding                                533   1.088244
10  Terminated, Other                                         1253   2.558291
11  Terminated, Planned but never initiated                   4441   9.067336
12  Terminated, Poor enrollment                               3201   6.535587
13  Terminated, Safety/adverse effects                         993   2.027441
14  Terminated, Unknown                                       4811    9.82277
I used the following code to plot a horizontal bar graph, since a vertical one doesn't leave room for the long labels above.
df['%Trials']=(df.ix[:,1]/sum(df.ix[:,1]))*100
plt.figure(figsize=(35,20))
plt.barh(df.ix[:,2],df.index,align='edge')
plt.xlim([0,31])
plt.yticks(df.index, df.strterminationreason)
plt.ylabel("TerminationReason",fontsize=20)
plt.xlabel("%Trials",fontsize=20)
But the bar lengths in the output don't reflect the actual % values in the data frame. For example, the highest % is for Completed, Positive outcome/primary endpoint, but the plot doesn't show that. Any idea why?
Also, does anyone know how to fit the label text for each bar so that it is clean and nothing overlaps?
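
One likely cause, offered as a sketch rather than a confirmed diagnosis: plt.barh expects the bar positions first and the bar lengths second, i.e. barh(y, width), while the call above passes the percentages as positions and the index as lengths. The minimal example below assumes that is the problem. It uses only three rows copied from the table (so the percentages are recomputed for that small sample and differ from the values above), replaces df.ix (removed in pandas 1.0) with column names, centers the bars on their ticks, and calls tight_layout so the long labels get enough margin. The smaller figsize is an arbitrary choice for the sample.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Three rows copied from the table above; the full frame works the same way.
df = pd.DataFrame({
    "strterminationreason": [
        "Completed, Outcome unknown",
        "Terminated, Lack of efficacy",
        "Terminated, Other",
    ],
    "total_trials": [6483, 1621, 1253],
})

# Percentage within this small sample only, so it differs from the table's values.
df["%Trials"] = df["total_trials"] / df["total_trials"].sum() * 100

fig, ax = plt.subplots(figsize=(10, 4))

# barh(y, width): positions on the y axis first, bar lengths second.
bars = ax.barh(df.index, df["%Trials"], align="center")

# One tick per bar, labelled with the termination reason.
ax.set_yticks(df.index)
ax.set_yticklabels(df["strterminationreason"])
ax.set_xlabel("%Trials", fontsize=20)
ax.set_ylabel("TerminationReason", fontsize=20)

# Recompute the margins so the long tick labels are not clipped.
fig.tight_layout()
```

Calling plt.show() or fig.savefig(...) afterwards displays or saves the figure. With align="center" each label sits at the middle of its bar, whereas align='edge' places the tick at the bar's lower edge, which can make labels look shifted.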