
I have a list of tuples and want to create a scatterplot based on certain requirements. The data set looks like this:

data = [(0.7,48000,1),(1.9,48000,0),(2.5,60000,1),(4.2,63000,0) ...]

Each tuple holds three values: Tenure (the number of years employed), Salary (the employee's salary), and Account (a flag that is 1 for a paid account or 0 for an unpaid one).

Using matplotlib, I want to create a scatter plot that shows years (x-axis) and salary (y-axis), and that also indicates whether each account is paid.

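I have the first part of the scatter plot, which is the number of years against the salary, with the following code:

```python
import matplotlib.pyplot as plt

# Split the list of (years, salary, account) tuples into three sequences
years, salary, account = zip(*data)

plt.scatter(years, salary)

plt.title('Data-Science Club Members')
plt.xlabel('Years as Data Scientist')
plt.ylabel('Salary')
plt.legend()  # nothing is labeled yet, so no legend entries are drawn
plt.show()
```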

Here is how my graph looks:

My Graph

Here is how I want my graph to look:

Wanted Graph

I am new to Stack Overflow, so I apologize if this is not asked properly. I tried to provide as much information as possible. Thank you!

  • Use `import seaborn as sns` and then `sns.scatterplot(x=years, y='salary', style='account')`, or use `hue='account'` instead of `style='account'`, or use both `hue` and `style`. – Trenton McKinney Sep 11 '21 at 22:16
  • Example plot `sns.scatterplot(data=sns.load_dataset('tips'), x='total_bill', y='tip', style='sex', hue='sex')`. `seaborn` is a high-level API for `matplotlib`. – Trenton McKinney Sep 11 '21 at 22:22
  • `sns.scatterplot(x=years, y=salary, style=account)` sorry, no quotes, since you're passing lists. – Trenton McKinney Sep 11 '21 at 22:24
  • Awesome, that worked @TrentonMcKinney. Do you happen to know how I can display a legend using seaborn that says 'Unpaid' and 'Paid'? And how can I change the default display settings of the seaborn graph to my own choice, like green circles or blue + signs? – Dante Zelaya Sep 11 '21 at 22:25
  • @Zephyr some of the duplicates aren't using pandas and show multiple options using both seaborn and matplotlib. – Trenton McKinney Sep 11 '21 at 22:25
  • As you mentioned, seaborn is a high-level API. I have never used it, and I just found out that I do not have to use 'sns.show()'; it displays the plot automatically, which is cool! – Dante Zelaya Sep 11 '21 at 22:26
  • `account = [{0: 'unpaid', 1: 'paid'}[v] for v in account]` this will replace 0 with unpaid, and 1 with paid, in `account`. Then use the new `account` for hue and style. Or you can use the `map` function: `account = map({0: 'unpaid', 1: 'paid'}.get, account)` – Trenton McKinney Sep 11 '21 at 22:32
  • Thank you @TrentonMcKinney that worked and now it is looking awesome! I'm not sure why the question got marked as duplicate as those other ones are not dealing with tuple lists with 3 elements in each tuple. But I got it answered so thanks! – Dante Zelaya Sep 11 '21 at 22:55
  • This [answer](https://stackoverflow.com/a/59076863/7758804) from the dup starts with lists using seaborn. But that question asks the same thing, using lists. Your question was really 2 questions: how to convert the list to different values, and then the plotting part. Questions without a complete [mre] often get closed as a duplicate, or closed for lack of information, since a question without an [mre] isn't really reproducible. – Trenton McKinney Sep 11 '21 at 22:58
  • Since this `OP` started with `years, salary, account = zip(*data)`, we are plotting lists, because you've already shown the data extracted from a list of tuples, into a list. – Trenton McKinney Sep 11 '21 at 23:03
  • @TrentonMcKinney do you know how I can plot a histogram from this data? I want a histogram that shows the number of unpaid accounts in one color and the paid accounts in a different color, with both drawn in the same bins. [HERE](https://www.google.com/url?sa=i&url=https%3A%2F%2Fwww.chegg.com%2Fhomework-help%2Fquestions-and-answers%2Fproblem-need-create-histogram-plot-histogram-plot-allows-us-discover-underlying-frequency--q84011313&psig=AOvVaw3-v9rrEZvp_zLM_rncW70v&ust=1631631418494000&source=images&cd=vfe&ved=0CAsQjRxqFwoTCOCx2-6a_PICFQAAAAAdAAAAABAD) is an example – Dante Zelaya Sep 13 '21 at 14:58
  • `sns.histplot(x=years, hue=account, multiple='stack')` or `sns.histplot(x=salary, hue=account, multiple='stack')` – Trenton McKinney Sep 13 '21 at 15:35
  • @TrentonMcKinney I get this error when I try that: `AttributeError: module 'seaborn' has no attribute 'histplot'` – Dante Zelaya Sep 13 '21 at 16:02
  • Update seaborn. The current version is 0.11.2. If you’re using Anaconda, at the anaconda prompt do `conda update --all` – Trenton McKinney Sep 13 '21 at 16:50
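
For a matplotlib-only route to the plot the question asks for, one option is to split the tuples by their account flag and call `scatter` once per group, so that each group carries its own label, marker, and color and `legend()` has something to show. The sketch below is only an illustration under assumptions: the helper names `group_by_account` and `plot_members`, the `STYLES` table, and the green-circle and blue-plus choices are not from the thread, and `data` holds only the four sample rows shown in the question.

```python
from collections import defaultdict

# Sample rows taken from the question: (tenure in years, salary, account flag)
data = [(0.7, 48000, 1), (1.9, 48000, 0), (2.5, 60000, 1), (4.2, 63000, 0)]

# Assumed display settings per account flag: legend label, marker, color
STYLES = {
    1: {"label": "Paid", "marker": "o", "color": "green"},
    0: {"label": "Unpaid", "marker": "+", "color": "blue"},
}


def group_by_account(rows):
    """Return {account_flag: (years_list, salary_list)}."""
    groups = defaultdict(lambda: ([], []))
    for years, salary, account in rows:
        groups[account][0].append(years)
        groups[account][1].append(salary)
    return dict(groups)


def plot_members(rows):
    """Draw one scatter series per account flag so each gets a legend entry."""
    # Imported here so the grouping helper can be used without matplotlib
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for account, (years, salary) in group_by_account(rows).items():
        ax.scatter(years, salary, **STYLES[account])
    ax.set_title("Data-Science Club Members")
    ax.set_xlabel("Years as Data Scientist")
    ax.set_ylabel("Salary")
    ax.legend()  # entries come from the label given to each scatter call
    return ax
```

Calling `plot_members(data)` and then `matplotlib.pyplot.show()` should display the figure. Plotting each group separately is what gives the legend one entry per account type; a single `scatter` call over all points has only one series to label.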

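The seaborn suggestions in the comments can be pulled together into one rough sketch: map the 0/1 flags to text labels, then pass the same labels as `hue` and `style` for the scatter plot and as `hue` for a stacked histogram. The helper names `unpack` and `plot_with_seaborn`, the `LABELS` table, and the specific palette and marker values are assumptions for illustration. The labels are built as a list rather than with `map`, because a `map` object can only be iterated once and here it is reused for several arguments. The filled plus marker `'P'` stands in for `'+'` because seaborn does not accept a mix of filled and line-art markers in one `markers` mapping.

```python
# Sample rows taken from the question: (tenure in years, salary, account flag)
data = [(0.7, 48000, 1), (1.9, 48000, 0), (2.5, 60000, 1), (4.2, 63000, 0)]

# Text shown in the legend for each account flag
LABELS = {0: "Unpaid", 1: "Paid"}


def unpack(rows):
    """Split rows into years, salary, and a list of 'Paid'/'Unpaid' labels."""
    years, salary, account = zip(*rows)
    status = [LABELS[flag] for flag in account]
    return list(years), list(salary), status


def plot_with_seaborn(rows):
    """Scatter plot and stacked histogram, both colored by account status."""
    # Imported here so unpack() can be used without the plotting libraries
    import matplotlib.pyplot as plt
    import seaborn as sns  # histplot requires seaborn 0.11 or newer

    years, salary, status = unpack(rows)
    fig, (ax_scatter, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

    sns.scatterplot(
        x=years, y=salary, hue=status, style=status,
        palette={"Paid": "green", "Unpaid": "blue"},  # assumed colors
        markers={"Paid": "o", "Unpaid": "P"},         # assumed markers
        ax=ax_scatter,
    )
    ax_scatter.set(xlabel="Years as Data Scientist", ylabel="Salary")

    # Paid and unpaid counts are stacked within the same bins
    sns.histplot(x=years, hue=status, multiple="stack", ax=ax_hist)
    ax_hist.set(xlabel="Years as Data Scientist")
    return fig
```

When run as a script, `plot_with_seaborn(data)` followed by `matplotlib.pyplot.show()` should open the figure. Swapping `x=years` for `x=salary` in the `histplot` call gives the salary histogram mentioned in the comments.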
0 Answers