I have the following data frame.

                  _id                               message_date_time         country
0   {'$oid': '61f7dfd24b11720cdbda5c86'}    {'$date': '2021-12-24T12:30:09Z'}   RUS
1   {'$oid': '61f7eb7b4b11720cdbda9322'}    {'$date': '2021-12-20T21:58:20Z'}   RUS
2   {'$oid': '61f7fdad4b11720cdbdb0beb'}    {'$date': '2021-12-15T15:29:13Z'}   RUS
3   {'$oid': '61f8234f4b11720cdbdbec52'}    {'$date': '2021-12-10T00:03:43Z'}   USA
4   {'$oid': '61f82c274b11720cdbdc21c7'}    {'$date': '2021-12-09T15:10:35Z'}   USA

With these values

df["country"].value_counts()

RUS    156
USA    139
FRA     19
GBR     11
AUT      9
AUS      8
DEU      7
CAN      4
BLR      3
ROU      3
GRC      3
NOR      3
NLD      3
SWE      2
ESP      2
CHE      2
POL      1
HUN      1
DNK      1
ITA      1
ISL      1
BIH      1
Name: country, dtype: int64

I'm trying to plot each country against its frequency using the following:

plt.figure(figsize=(15, 8))
plt.xlabel("Frequency")
plt.ylabel("Country")
plt.hist(df["country"])
plt.show()

What I need is to show the country frequency above every bar and keep a very small space between the bars.

  • You should use bar(), not hist(). Check this link for labels: https://www.delftstack.com/howto/matplotlib/add-value-labels-on-matplotlib-bar-chart/#:~:text=To%20add%20value%20labels%20on%20a%20Matplotlib%20bar%20chart%2C%20we,any%20location%20in%20the%20graph. – nfn Apr 26 '22 at 09:55

2 Answers

For this I have used plt.bar() from Matplotlib, with the count of each country taken from value_counts().

import matplotlib.pyplot as plt

# The frame has no "counts" column, so compute the counts first.
counts = df["country"].value_counts()

plt.figure(figsize=(20, 5))
bars = plt.bar(counts.index, counts.values)
for bar in bars.patches:
    # Pass the label positionally: the keyword was renamed from s to text in Matplotlib 3.3.
    plt.annotate(f"{bar.get_height():.0f}", xy=(bar.get_x() + bar.get_width() / 2, bar.get_height()), va="bottom", ha="center")
plt.show()

The output should show each country's count centred above its bar.

If you want something else on the graph instead of the height, pass a different first argument (text, called s before Matplotlib 3.3) to the annotate function.
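As a sketch of swapping the label, the snippet below writes each bar's share of the total instead of its height. The counts dictionary is an assumption standing in for df["country"].value_counts(); it holds only the first four rows from the question, so the percentages are relative to those four countries, not to the full data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window with plt.show()
import matplotlib.pyplot as plt

# Assumption: stand-in for df["country"].value_counts(), first four rows only.
counts = {"RUS": 156, "USA": 139, "FRA": 19, "GBR": 11}
total = sum(counts.values())

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(list(counts.keys()), list(counts.values()))

labels = []
for bar in bars.patches:
    # Share of this bar among the plotted countries, as a percentage.
    label = f"{100 * bar.get_height() / total:.1f}%"
    labels.append(label)
    ax.annotate(
        label,
        xy=(bar.get_x() + bar.get_width() / 2, bar.get_height()),
        va="bottom",
        ha="center",
    )
```

Only the string passed to annotate changes; the xy position is still the top centre of each bar.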

Zero
  • I appreciate the effort! But I've been asked to use Matplotlib. I can't adjust it using `plt.bar()`; I can't get the bar labels to show correctly. –  Apr 26 '22 at 10:09
  • @LanaStance Check the updated answer. It uses only `matplotlib` now. – Zero Apr 26 '22 at 11:50

Arguably the easiest way is to use plt.bar(). For example:

import matplotlib.pyplot as plt

counts = df["country"].value_counts()
names, values = counts.index.tolist(), counts.values.tolist()
plt.bar(names, values)
height_above_bar = 0.05  # vertical offset of the count above the bar, in data units
fontsize = 12  # font size of the count label
for i, val in enumerate(values):
    # Bar i is centred at x = i, so centre the text horizontally on it.
    plt.text(i, val + height_above_bar, str(val), fontsize=fontsize, ha="center")
plt.show()
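Neither snippet so far addresses the small gap between bars that the question asks for. The sketch below is one possible approach, not part of the original answer: it uses the width argument of bar() to narrow the gap and Axes.bar_label (available from Matplotlib 3.4) to write the counts. The counts dictionary is an assumption standing in for df["country"].value_counts(), using the first four rows from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window with plt.show()
import matplotlib.pyplot as plt

# Assumption: stand-in for df["country"].value_counts(), first four rows only.
counts = {"RUS": 156, "USA": 139, "FRA": 19, "GBR": 11}

fig, ax = plt.subplots(figsize=(8, 4))
# Categorical bars are spaced 1.0 apart, so width=0.95 leaves a gap of 0.05.
container = ax.bar(list(counts.keys()), list(counts.values()), width=0.95)
# bar_label places one text per bar; padding is the offset in points.
texts = ax.bar_label(container, padding=2)
ax.set_xlabel("Country")
ax.set_ylabel("Frequency")
```

The default bar width is 0.8, so any value closer to 1.0 shrinks the gap; 1.0 makes the bars touch.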
Andrew
  • Thanks, @Andrew but how do I show the count of the country over every bar? That's my problem. –  Apr 26 '22 at 10:05
  • Try to follow these instructions https://www.geeksforgeeks.org/how-to-annotate-bars-in-barplot-with-matplotlib-in-python/ – andrea m. Apr 26 '22 at 10:09
  • @LanaStance Try my method. It works well enough. – Zero Apr 26 '22 at 11:39