
I have a dataset with numerous columns: age, income, brand, product, revenue, expenditure, etc.

I want to plot two of the variables, age and income, against each other on the X and Y-axis, and then I want to differentiate the choice of marker based on two variables, brand and product.

I have been able to draw a scatter plot with different colours for different brands using lmplot from seaborn.

At the moment, the markers on my graph are all circles that differ only by colour, depending on which brand they represent. For example, brand A is red, brand B is green, brand C is blue, etc.

Is there a way to make the style of the marker differ depending on the product? For example, product 1 uses an X, product 2 uses an O, product 3 uses a -.

Therefore, my scatter plot would show product 1 from brand A as a red X, and product 3 from brand B as a green -.

import pandas as pd
import seaborn as sns

df = pd.read_csv("data.csv")
sns.lmplot(x='age', y='income', height=8, aspect=1, data=df, fit_reg=False, hue='brand', legend=True)
Redox
  • Does it need to be `lmplot`? You can do it easily using `style` in seaborn `scatterplot` like this - `sns.scatterplot(data=df, x="age", y="income", hue="brand", style="product")`. Refer [here](https://seaborn.pydata.org/generated/seaborn.scatterplot.html) – Redox Feb 19 '23 at 11:25
  • Thank you! That has worked perfectly! Is there a way to move the location of the legend? Ideally off the main grid. – InnovatingWaters Feb 19 '23 at 11:50
  • If you are on seaborn v0.11.2 or later, you can use `move_legend()`. Refer [here](https://stackoverflow.com/questions/30490740/move-legend-outside-figure-in-seaborn-tsplot) for an example – Redox Feb 19 '23 at 11:55
  • Thank you very much for that! My final question: is there any way to plot multiple scatter plots in one frame? So, if I have another variable, say country, and there are 9 such countries, I would like 9 scatter plots in one frame, with one graph for each country. – InnovatingWaters Feb 19 '23 at 12:00
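The two suggestions in the comments above (`style` in `scatterplot` and `move_legend()`) can be sketched together as follows. This is only an illustration under stated assumptions: the DataFrame is synthetic stand-in data rather than the asker's `data.csv`, the product labels are hypothetical, and the marker choices are just one option. As far as I know, seaborn raises an error when filled markers (such as `"X"` or `"o"`) are mixed with line-art markers (such as `"_"`), so a filled diamond stands in for the `-` here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for data.csv; all values are made up for illustration.
rng = np.random.default_rng(0)
n = 90
products = ["product 1", "product 2", "product 3"]  # hypothetical labels
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "income": rng.normal(50_000, 12_000, size=n).round(),
    "brand": np.tile(["A", "B", "C"], n // 3),
    "product": np.repeat(products, n // 3),
})

fig, ax = plt.subplots(figsize=(8, 6))
# hue colours the points by brand; style picks the marker by product.
sns.scatterplot(
    data=df, x="age", y="income", hue="brand", style="product",
    markers={"product 1": "X", "product 2": "o", "product 3": "D"},
    ax=ax,
)
# Requires seaborn v0.11.2 or later: place the legend just outside the axes.
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1))
```

In an interactive session the `matplotlib.use("Agg")` line can be dropped, and `fig.savefig(..., bbox_inches="tight")` helps keep the outside legend from being clipped when saving.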

1 Answer


Based on the comments, I believe you are looking to create:

  • Multiple scatter plots, one subplot for each country
  • Each point should be identified by its marker shape and color, as well as its x and y position
  • Legend should be outside the plot area

For this, you will need a figure-level plot: relplot() with kind="scatter", which is its default value anyway. So you will need something like this:

sns.relplot(data=df, x="age", y="income", hue="brand", style='product', col='country', kind="scatter")

As I don't have your data, I used the penguins dataset. Below is my code and the resulting plot. I hope this is what you are looking for. Note that with relplot(), the legend is placed outside the plot area, so you don't need to use move_legend().

import seaborn as sns
penguins = sns.load_dataset('penguins')
# One subplot per island; color encodes species, marker style encodes sex
sns.relplot(data=penguins, x="bill_length_mm", y="bill_depth_mm", hue="species", style='sex', col='island')

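Plot

[Image: scatter plots of bill_depth_mm against bill_length_mm, one panel per island, colored by species, with marker style set by sex]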
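For the nine-country case from the question, a rough sketch along the same lines is below. It assumes column names matching the question (`age`, `income`, `brand`, `product`, `country`) and uses synthetic data with hypothetical country and product labels, since the real data isn't available. `col_wrap=3` arranges the nine panels in a 3×3 grid instead of one long row, and the `markers` dictionary pins a specific marker to each product (all filled markers, since to my knowledge seaborn does not allow mixing filled and line-art markers).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data; labels and values are made up for illustration.
rng = np.random.default_rng(1)
countries = [f"country {i}" for i in range(1, 10)]  # 9 hypothetical countries
products = ["product 1", "product 2", "product 3"]  # hypothetical labels
per_country = 18
n = len(countries) * per_country
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "income": rng.normal(50_000, 12_000, size=n).round(),
    "brand": np.tile(["A", "B", "C"], n // 3),
    "product": np.tile(np.repeat(products, 3), n // 9),
    "country": np.repeat(countries, per_country),
})

# Figure-level scatter: one panel per country, wrapped into rows of three.
g = sns.relplot(
    data=df, x="age", y="income", hue="brand", style="product",
    markers={"product 1": "X", "product 2": "o", "product 3": "D"},
    col="country", col_wrap=3, height=3, kind="scatter",
)
```

The returned `FacetGrid` (`g`) can then be saved with `g.savefig(...)`. With real data, replacing the synthetic DataFrame with `pd.read_csv("data.csv")` should be the only change needed, provided the product labels in the `markers` dictionary match the values in the file.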

Redox