I have been working on this for a while and wanted to see if someone could show me how to auto-adjust my scatter-plot labels so they don't overlap. While searching for a solution I came across the adjustText library (https://github.com/Phlya/adjustText), which seems like it should work, but I can't find an example that plots from a dataframe. When I tried replicating the adjustText examples, I got an error. This is my current code:
df["category"] = df["category"].astype(int)
df2 = df.sort_values(by=['count'], ascending=False).head()
ax = df.plot.scatter(x="category", y="count")
a = df2['category']
b = df2['count']
texts = []
for xy in zip(a, b):
    texts.append(plt.text(xy))
adjust_text(texts, arrowprops=dict(arrowstyle="->", color='r', lw=0.5))
plt.title("Count of {column} in {table}".format(**sql_dict))
But then I get this error: TypeError: text() missing 2 required positional arguments: 'y' and 's'. This is the version I started from, which annotates each point with its coordinates. It works, but the labels just overlap:
df["category"] = df["category"].astype(int)
df2 = df.sort_values(by=['count'], ascending=False).head()
ax = df.plot.scatter(x="category", y="count")
a = df2['category']
b = df2['count']
for xy in zip(a, b):
    ax.annotate('(%s, %s)' % xy, xy=xy)
As you can see, my df is built from tables in SQL. This specific table holds the length of stay in days against how many people stayed that long; a sample of the data is shown below. I made the second dataframe (df2) above so that only the highest values are labeled on the plot. This is one of my first experiences with visualization in Python, so any help would be appreciated.
[![picture of a graph of overlapping items][1]][1]
[los_days count]
3 350
1 4000
15 34
and so forth. Thanks so much. Let me know if you need anything else.
Here is an example of the df
category count
0 2 29603
1 4 33980
2 9 21387
3 11 17661
4 18 10618
5 20 8395
6 27 5293
7 29 4121
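
For reference, the TypeError above says that text() was given a single tuple where it expects x, y and the label string s as three separate arguments. Here is a minimal sketch of how the first snippet might look with that changed, assuming the example df above; the label format, the headless Agg backend and the optional-import fallback are illustrative choices, not something taken from the adjustText docs:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

try:
    from adjustText import adjust_text  # third-party: pip install adjustText
except ImportError:
    adjust_text = None  # the sketch still draws labels, just without adjustment

# Rows copied from the example df in the question
df = pd.DataFrame({
    "category": [2, 4, 9, 11, 18, 20, 27, 29],
    "count": [29603, 33980, 21387, 17661, 10618, 8395, 5293, 4121],
})
df2 = df.sort_values(by="count", ascending=False).head()  # five largest counts

ax = df.plot.scatter(x="category", y="count")

# text() needs x, y and s separately, so unpack each (x, y) pair
texts = [
    ax.text(x, y, f"({x}, {y})")
    for x, y in zip(df2["category"], df2["count"])
]

# Same adjust_text call as in the question, applied only if the library is installed
if adjust_text is not None:
    adjust_text(texts, arrowprops=dict(arrowstyle="->", color="r", lw=0.5))
```

Whether the adjusted positions look good on the real data probably depends on how many points are labeled and on the figure size, so this is a starting point rather than a confirmed fix.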