pandas df columns series

Question

I have a DataFrame and have done some operations on its columns, as follows:

df1=sample_data.sort_values("Population")
df2=df1[(df1.Population > 500000) & (df1.Population < 1000000)]
df3=df2["Avg check"]*df2["Avg Daily Rides Last Week"]/df2["CAC"]
df4=df2["Avg check"]*df2["Avg Daily Rides Last Week"]
([[df3],[df4]])

If I understand correctly, df3 and df4 are now Series, not DataFrames. There should be a way to build a new DataFrame from these Series and draw a scatter plot. Please advise. Thanks.

I wanted to annotate each point and ran into an issue:

df3=df2["Avg check"]*df2["Avg Daily Rides Last Week"]/df2["CAC"]
df4=df2["Avg check"]*df2["Avg Daily Rides Last Week"]
df5=df2["Population"]
df6=df2["city_id"]
sct=plt.scatter(df5,df4,c=df3, cmap="viridis")
plt.xlabel("Population")
plt.ylabel("Avg check x Avg Daily Rides")
for i, txt in enumerate(df6):
    plt.annotate(txt,(df4[i],df5[i]))
plt.colorbar()
plt.show()

jezrael · Accepted Answer · 2020-05-19T07:25:50.760

1

I think you can pass both Series to matplotlib.pyplot.scatter:

import matplotlib.pyplot as plt
sc = plt.scatter(df3, df4)

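EDIT: In the annotate call, df5 and df4 are swapped relative to the scatter call, so swap them back. Also use Series.iat to select by position: after sorting and filtering, the index labels are no longer 0, 1, 2, ..., so df4[i] looks up a label rather than the i-th element.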

for i, txt in enumerate(df6):
    plt.annotate(txt,(df5.iat[i],df4.iat[i]))
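
To make the difference between label-based and positional access concrete, here is a minimal sketch with made-up data. The values in sample_data and the city_id labels are invented for illustration and are not the asker's data; the Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for sample_data; all values are invented.
sample_data = pd.DataFrame({
    "city_id": ["A", "B", "C", "D"],
    "Population": [900000, 300000, 700000, 600000],
    "Avg check": [10.0, 12.0, 8.0, 9.0],
    "Avg Daily Rides Last Week": [1000, 400, 800, 700],
    "CAC": [5.0, 6.0, 4.0, 3.0],
})

df1 = sample_data.sort_values("Population")
df2 = df1[(df1.Population > 500000) & (df1.Population < 1000000)]
# Sorting and filtering keep the original labels: the index is now [3, 2, 0].

df3 = df2["Avg check"] * df2["Avg Daily Rides Last Week"] / df2["CAC"]
df4 = df2["Avg check"] * df2["Avg Daily Rides Last Week"]
df5 = df2["Population"]
df6 = df2["city_id"]

# df4[0] is a label lookup (the row labelled 0), df4.iat[0] is the first row.
by_label = df4[0]
by_position = df4.iat[0]

fig, ax = plt.subplots()
sc = ax.scatter(df5, df4, c=df3, cmap="viridis")
ax.set_xlabel("Population")
ax.set_ylabel("Avg check x Avg Daily Rides")
for i, txt in enumerate(df6):
    # Same (x, y) order as in scatter: Population first, then the product.
    ax.annotate(txt, (df5.iat[i], df4.iat[i]))
fig.colorbar(sc, ax=ax)
# In an interactive session, plt.show() would display the figure.
```

With this data, df4[1] would raise a KeyError because no row labelled 1 survives the filter, which is the kind of failure the original loop can run into.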

edited May 19 '20 at 07:25

answered May 19 '20 at 05:59


Many thanks! What if I need to add the label to identify dots + somewhat colormap to add one more metric ? – Evgeny Petrov May 19 '20 at 06:21
@EvgenyPetrov - You are welcome, check [this](https://stackoverflow.com/questions/17682216/scatter-plot-and-color-mapping-in-python) – jezrael May 19 '20 at 06:22
Many thanks. Would it be possible to see what's wrong with annotate function in my code ? – Evgeny Petrov May 19 '20 at 06:55
@EvgenyPetrov - what is your code? Can you edit question, or better create new one? – jezrael May 19 '20 at 06:57
Could not include the code in this comment block and added in the main content above – Evgeny Petrov May 19 '20 at 06:58
@EvgenyPetrov - `df5` and `df4` were swapped; I also added `iat` to select by position and edited the answer. – jezrael May 19 '20 at 07:20
Thanks ! so iat is somewhat like a function from the library that I have missed ? – Evgeny Petrov May 19 '20 at 07:25
@EvgenyPetrov - exactly, added to answer. – jezrael May 19 '20 at 07:26

TiTo · Answer 2 · 2020-06-03T09:06:29.547

You can create a DataFrame from Series. To do it, put both Series in a dictionary, with the keys as column names:

import pandas as pd

author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']
article = [210, 211, 114, 178]

auth_series = pd.Series(author)
article_series = pd.Series(article)

# Dictionary keys become the column names of the DataFrame
frame = {'Author': auth_series, 'Article': article_series}

and then create a DataFrame from that dictionary:

result = pd.DataFrame(frame)
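
Applied to the question, the same dictionary approach might look like the sketch below. The Series values and the column names "ratio" and "revenue" are made up for illustration; they stand in for df3 and df4 from the question. pandas aligns the Series on their shared index when it builds the frame.

```python
import pandas as pd

# Hypothetical Series standing in for df3 and df4 from the question;
# both share the same non-default index, as they would after filtering.
df3 = pd.Series([2100.0, 1600.0, 2000.0], index=[3, 2, 0])
df4 = pd.Series([6300.0, 6400.0, 10000.0], index=[3, 2, 0])

# Dictionary keys become column names; rows are aligned on the index.
combined = pd.DataFrame({"ratio": df3, "revenue": df4})

# pd.concat along axis=1 is another way to build the same frame.
combined_alt = pd.concat([df3.rename("ratio"), df4.rename("revenue")], axis=1)

# combined.plot.scatter(x="ratio", y="revenue") would then draw the scatter
# plot (this call needs matplotlib installed).
```

Keeping the two Series in one DataFrame also makes it harder to mix up x and y later, since each column carries its own name.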

The code is from geeksforgeeks.org
