I have a DataFrame and have performed the following operations on its columns:

df1=sample_data.sort_values("Population")
df2=df1[(df1.Population > 500000) & (df1.Population < 1000000)]
df3=df2["Avg check"]*df2["Avg Daily Rides Last Week"]/df2["CAC"]
df4=df2["Avg check"]*df2["Avg Daily Rides Last Week"]
([[df3],[df4]])

If I understand correctly, df3 and df4 are now Series, not DataFrames. There should be a way to build a new DataFrame from these Series and draw a scatter plot. Please advise. Thanks.

I then wanted to annotate each point and ran into an issue:

df3=df2["Avg check"]*df2["Avg Daily Rides Last Week"]/df2["CAC"]
df4=df2["Avg check"]*df2["Avg Daily Rides Last Week"]
df5=df2["Population"]
df6=df2["city_id"]
sct=plt.scatter(df5,df4,c=df3, cmap="viridis")
plt.xlabel("Population")
plt.ylabel("Avg check x Avg Daily Rides")
for i, txt in enumerate(df6):
    plt.annotate(txt,(df4[i],df5[i]))
plt.colorbar()
plt.show()

2 Answers

I think you can pass both Series to matplotlib.pyplot.scatter:

import matplotlib.pyplot as plt
sc = plt.scatter(df3, df4)
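
As a minimal sketch of the call above: the numbers below are made up to stand in for the asker's filtered frame df2, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the filtered frame df2 from the question
df2 = pd.DataFrame({
    "Avg check": [10.0, 12.0, 8.0],
    "Avg Daily Rides Last Week": [100, 150, 90],
    "CAC": [5.0, 6.0, 4.0],
})

# Both results are Series that share df2's index
df4 = df2["Avg check"] * df2["Avg Daily Rides Last Week"]
df3 = df4 / df2["CAC"]

# scatter accepts array-likes, so the Series can be passed directly
sc = plt.scatter(df3, df4)
plt.xlabel("Avg check x Avg Daily Rides / CAC")
plt.ylabel("Avg check x Avg Daily Rides")
```

No intermediate DataFrame is needed for the plot itself, since both Series have the same length and order.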

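EDIT: Swap df5 and df4 in the annotate call so the coordinates match the scatter (x is df5, y is df4), and use Series.iat to select by position, because after sorting and filtering the index labels no longer run from 0 to n-1: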

for i, txt in enumerate(df6):
    plt.annotate(txt,(df5.iat[i],df4.iat[i]))
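
To illustrate why position-based access matters here, the following sketch uses invented cities and populations (not the asker's data) and repeats the sort and filter steps from the question. It compares `.iat` with label-based lookup via `.loc`, which is also what plain `df5[i]` does on an integer index.

```python
import pandas as pd

# Hypothetical data; sorting and filtering keep the original index labels
sample_data = pd.DataFrame({
    "city_id": ["A", "B", "C", "D"],
    "Population": [900000, 200000, 600000, 750000],
})
df1 = sample_data.sort_values("Population")
df2 = df1[(df1.Population > 500000) & (df1.Population < 1000000)]
df5 = df2["Population"]

# The labels come from sample_data, so they are neither ordered nor complete
print(list(df5.index))

# Position-based access is valid for every i in range(len(df5))
first_by_position = df5.iat[0]

# Label 0 exists, but it is not the first row after sorting
row_with_label_zero = df5.loc[0]

# Label 1 was filtered out, so looking it up raises KeyError
try:
    df5.loc[1]
except KeyError:
    print("label 1 is missing from the filtered Series")
```

With `enumerate`, `i` is a position, so `.iat[i]` is the accessor that matches it.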
jezrael

You can create a DataFrame from Series. Simply put both Series in a dictionary, where each key becomes a column name:

import pandas as pd

author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']
article = [210, 211, 114, 178]

# Wrap each list in a Series
auth_series = pd.Series(author)
article_series = pd.Series(article)

# Keys are the column names of the resulting DataFrame
frame = {'Author': auth_series, 'Article': article_series}

and then create a DataFrame from that dictionary:

result = pd.DataFrame(frame) 

The code is from geeksforgeeks.org
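
Applied to the question, the same idea might look like the sketch below. The data values and the column names `value_per_cac` and `value` are invented for illustration; only the formulas for df3 and df4 come from the question.

```python
import pandas as pd

# Hypothetical stand-in for df2, with a non-consecutive index as left by filtering
df2 = pd.DataFrame(
    {
        "Avg check": [10.0, 12.0, 8.0],
        "Avg Daily Rides Last Week": [100, 150, 90],
        "CAC": [5.0, 6.0, 4.0],
    },
    index=[2, 3, 0],
)

df3 = df2["Avg check"] * df2["Avg Daily Rides Last Week"] / df2["CAC"]
df4 = df2["Avg check"] * df2["Avg Daily Rides Last Week"]

# Dictionary keys become column names; rows are aligned on the shared index
result = pd.DataFrame({"value_per_cac": df3, "value": df4})

# An alternative that gives the same frame: name each Series, then concatenate
same = pd.concat([df3.rename("value_per_cac"), df4.rename("value")], axis=1)

print(result)
```

From there, `result.plot.scatter(x="value_per_cac", y="value")` should draw the scatter plot directly from the new DataFrame.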

TiTo