I'm working with the Titanic passenger survival dataset, and I'm trying to show the relationship between the age of passengers and the fare they paid.

This is how the data is currently formatted (image: Titanic passenger data format, head):

From here, it was fairly easy to make a simple scatterplot, like so (image: scatterplot showing the relationship between Age and Fare).

However, I am curious whether there is a way to color the points differently based on the sex column in the dataset. Most examples I have seen online focus on changing the color for two separate datasets. I initially tried to use an if statement to change the color depending on sex, but that didn't work the way I hoped it would.

DejaVuMan
  • Duplicate of https://stackoverflow.com/questions/12236566/setting-different-color-for-each-series-in-scatter-plot-on-matplotlib – ev-br Sep 26 '20 at 19:25

2 Answers

Perhaps much easier with seaborn:

import seaborn as sns

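# Load seaborn's bundled Titanic example dataset
data = sns.load_dataset('titanic')
# Pass x and y as keywords (recent seaborn versions reject them positionally);
# hue='sex' colors each point by the passenger's sex
sns.scatterplot(x='age', y='fare', data=data, hue='sex')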

BigBen
  • Thank you! This seems to be a pretty easy thing to do in seaborn, wasn't aware such a solution existed. – DejaVuMan Sep 26 '20 at 19:53

One potential solution I came up with after pondering a bit could look like this:

(image)

The problem with this solution is that you have to add more variables, which isn't ideal, and the points overlap a bit, making the trends in the data harder to see.
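
For what it's worth, a plain matplotlib variant that avoids a separate variable per sex might look like the sketch below. It assumes a pandas DataFrame with age, fare, and sex columns; scatter_by_category is a hypothetical helper name, and the sample rows are made-up placeholders, not real passenger records. Lowering alpha is one way to make overlapping points easier to read.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def scatter_by_category(df, x, y, category, alpha=0.5):
    """Draw one scatter series per distinct value of `category` on shared axes."""
    fig, ax = plt.subplots()
    # groupby yields (value, sub-frame) pairs, so no per-sex variables are needed;
    # each ax.scatter call picks the next color from the default color cycle
    for value, group in df.groupby(category):
        ax.scatter(group[x], group[y], label=str(value), alpha=alpha)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.legend(title=category)
    return ax


# Placeholder rows for illustration only (assumed column names: age, fare, sex)
sample = pd.DataFrame({
    "age": [20.0, 30.0, 40.0, 50.0, 25.0, 35.0],
    "fare": [10.0, 20.0, 30.0, 40.0, 15.0, 25.0],
    "sex": ["male", "female", "female", "male", "male", "female"],
})

ax = scatter_by_category(sample, "age", "fare", "sex")
ax.figure.savefig("age_vs_fare.png")  # or plt.show() in an interactive session
```

With the real data, the same call would be scatter_by_category(data, "age", "fare", "sex"), where data is the DataFrame from the question.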

DejaVuMan