Making a Scatter Plot from a DataFrame in Pandas

Question

I have a DataFrame and need to make a scatter-plot from it.

I need to use 2 columns as the x-axis and y-axis and only need to plot 2 rows from the entire dataset. Any suggestions?

For example, my dataframe is below (50 states x 4 columns). I need to plot 'rgdp_change' on the x-axis against 'diff_unemp' on the y-axis, and only for the states "Michigan" and "Wisconsin".

[image: dataframe]

This is a pretty general question. Can you provide some code or a description - seeing your df would be helpful. If I understand you correctly, you want a scatterplot, but of only two rows. So your scatterplot will only have two points? https://stackoverflow.com/help/how-to-ask — Derek O, May 01 '20 at 00:51
For future reference - [how to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). — BigBen, May 01 '20 at 02:06

Derek O · Answer 1 · 2021-12-21T02:42:27.573

From the dataframe, select the rows whose state appears in a list of the states you want: ['Michigan', 'Wisconsin']

I also figured you would want a legend or some other way to tell one point from the other. To do this, we build a list of colors from a colormap, one color per state. This way the code generalizes to more than those two states.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

# build a small example df with the same relevant columns as your actual df
df = pd.DataFrame({
    'State': ['Alabama', 'Alaska', 'Michigan', 'Wisconsin'],
    'real_gdp': [1.75*10**5, 4.81*10**4, 2.59*10**5, 1.04*10**5],
    'rgdp_change': [-0.4, 0.5, 0.4, -0.5],
    'diff_unemp': [-1.3, 0.4, 0.5, -11],
})

fig, ax = plt.subplots()
states = ['Michigan', 'Wisconsin']
colormap = cm.viridis
colorlist = [colors.rgb2hex(colormap(i)) for i in np.linspace(0, 0.9, len(states))]

# look up each state's row by name so the legend label always matches the point,
# regardless of the order of the rows in df
for state, c in zip(states, colorlist):
    row = df.loc[df["State"] == state]
    ax.scatter(row["rgdp_change"], row["diff_unemp"], label=state, s=50, linewidth=0.1, c=c)

ax.legend()
plt.show()

Needs axis names: 'rgdp_change' (x-axis), 'diff_unemp' (y-axis) — smci, Dec 21 '21 at 02:48
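Following up on the comment above, here is a minimal sketch of the same plot with axis labels added. It assumes the column names from the answer ('State', 'rgdp_change', 'diff_unemp'); the sample values are made up for illustration, and the default matplotlib color cycle is used in place of the colormap.

```python
import matplotlib.pyplot as plt
import pandas as pd

# made-up sample values; only the column names follow the answer above
df = pd.DataFrame({
    "State": ["Alabama", "Alaska", "Michigan", "Wisconsin"],
    "rgdp_change": [-0.4, 0.5, 0.4, -0.5],
    "diff_unemp": [-1.3, 0.4, 0.5, -11],
})

states = ["Michigan", "Wisconsin"]
subset = df[df["State"].isin(states)]

fig, ax = plt.subplots()
# one scatter call per state so each gets its own color and legend entry
for state, group in subset.groupby("State"):
    ax.scatter(group["rgdp_change"], group["diff_unemp"], label=state, s=50)

# name the axes after the plotted columns
ax.set_xlabel("rgdp_change")
ax.set_ylabel("diff_unemp")
ax.legend()
```

Call plt.show() afterwards to display the figure, or fig.savefig(...) to write it to a file.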

score 0 · Answer 2 · answered May 01 '20 at 01:42

Use the DataFrame plot method, but first filter the states you need using the index's isin method (this assumes the state names are the DataFrame's index):

states = ["Michigan", "Wisconsin"]
df[df.index.isin(states)].plot(kind='scatter', x='rgdp_change', y='diff_unemp')
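The snippet above only works if the state names are the index. As a rough sketch, assuming the states are instead stored in a column called 'State' (as in the other answer; the sample values here are made up), you could either move that column into the index first or filter on the column directly. The annotation loop is an optional extra to tell the two points apart.

```python
import pandas as pd

# made-up sample values; the 'State' column name is an assumption
df = pd.DataFrame({
    "State": ["Alabama", "Alaska", "Michigan", "Wisconsin"],
    "rgdp_change": [-0.4, 0.5, 0.4, -0.5],
    "diff_unemp": [-1.3, 0.4, 0.5, -11],
})
states = ["Michigan", "Wisconsin"]

# Option 1: make State the index, then filter on the index as in the answer
by_index = df.set_index("State")
selected = by_index[by_index.index.isin(states)]

# Option 2: leave State as a column and filter on it directly
selected_col = df[df["State"].isin(states)]

ax = selected.plot(kind="scatter", x="rgdp_change", y="diff_unemp")

# label each point with its state name
for name, row in selected.iterrows():
    ax.annotate(name, (row["rgdp_change"], row["diff_unemp"]))
```

Both options select the same two rows; the index-based one just matches the shape of the original one-liner.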

jcaliz