I have a dataset with two variables, and I want to plot each of them in a different color. Can anyone help, please? Link to my dataset: https://github.com/mayuripandey/Data-Analysis/blob/main/word.csv

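import matplotlib.pyplot as plt
import pandas as pd

# Load the dataset (raw CSV behind the GitHub link above)
df = pd.read_csv('https://raw.githubusercontent.com/mayuripandey/Data-Analysis/main/word.csv')

fig, ax = plt.subplots(figsize=(10, 6))

# Note: cmap has no effect here because no `c` values are passed
ax.scatter(x = df['Friends Network-metrics'], y = df['Number of Followers'], cmap = "magma")
plt.xlabel("Friends Network-metrics")
plt.ylabel("Number of Followers")
plt.show()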

Noob Coder
  • give a small sample of the df. – chitown88 May 18 '22 at 10:55
  • https://github.com/mayuripandey/Data-Analysis/blob/main/word.csv Here it is. – Noob Coder May 18 '22 at 11:01
  • and which column is your variable to color on? – chitown88 May 18 '22 at 11:02
  • For x = df['Friends Network-metrics'] and y = df['Number of Followers'], I need x and y to show different colors – Noob Coder May 18 '22 at 11:07
  • yes I know that. But `i want to give colors to each with different color`. What are you wanting to color? Give colors to each (what is "each") with a different color. You aren't being clear on what you are wanting to do. – chitown88 May 18 '22 at 11:08
  • What do you want to color by ? https://stackoverflow.com/questions/21654635/scatter-plots-in-pandas-pyplot-how-to-plot-by-category – Gedas Miksenas May 18 '22 at 11:09
  • It's a scatter plot of 2 dimensions. x and y define a single point. You can't have more than 1 color for a single point. What do you mean you want x and y to show different colors? – chitown88 May 18 '22 at 11:13
  • Are you saying you want a different color for EVERY one of the 2719 data points? I'm not sure that accomplishes anything in terms of being an effective visual. While visualisations should be aesthetically pleasing, they should also be as simple as possible so they can be consumed and understood quickly. Adding a range of colors just to add colors introduces unnecessary complexity to a straightforward 2-dimension scatter plot. Now, if you wanted to change colors by, say, `"Gender"` or `"Sentiment"`, that could add value. – chitown88 May 18 '22 at 11:18
  • OK, now I get it. I wanted to see if I could give different colors to the variables x and y, and didn't realise that x and y define a single point. Thank you for clearing that up :) – Noob Coder May 18 '22 at 11:29
  • I added a solution to give you some options though. I'm thinking you are looking for something like the last 2 images below. – chitown88 May 18 '22 at 11:42

1 Answer

It's not entirely clear what you want to do here, but I'll provide a solution that may help you a bit.

You could use seaborn to color the points by a variable. Otherwise, you'd need to iterate through the points and set each color yourself, or create a new column that conditionally assigns a color to each value.
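As a rough sketch of the last option (a new column that holds a color per row), assuming the Sentiment column contains labels such as Positive, Negative, and Neutral. The small DataFrame and the palette below are made up for illustration and are not taken from word.csv:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample standing in for word.csv; the Sentiment labels are assumed.
df = pd.DataFrame({
    "Friends Network-metrics": [120, 450, 80, 900],
    "Number of Followers": [300, 1500, 60, 2200],
    "Sentiment": ["Positive", "Negative", "Neutral", "Positive"],
})

# Assumed palette: map each label to a color, falling back to gray for anything unmapped.
palette = {"Positive": "tab:green", "Negative": "tab:red", "Neutral": "tab:blue"}
df["color"] = df["Sentiment"].map(palette).fillna("gray")

fig, ax = plt.subplots(figsize=(10, 6))
# Pass the per-row colors through `c`; no cmap is needed for explicit color names.
ax.scatter(x=df["Friends Network-metrics"],
           y=df["Number of Followers"],
           c=df["color"].tolist())
ax.set_xlabel("Friends Network-metrics")
ax.set_ylabel("Number of Followers")

plt.show()
```

This stays in plain matplotlib, at the cost of building the legend yourself if you need one.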

With seaborn, you just pass the column you want to color by as the hue parameter. I don't know which variable you have in mind, so this uses Sentiment:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.read_csv('https://raw.githubusercontent.com/mayuripandey/Data-Analysis/main/word.csv')

# Use the 'hue' argument to provide a factor variable
sns.lmplot(x='Friends Network-metrics', 
           y='Number of Followers', 
           height=8,
           aspect=.8,
           data=df, 
           fit_reg=False, 
           hue='Sentiment', 
           legend=True)

plt.xlabel("Friends Network-metrics")
plt.ylabel("Number of Followers")
 
plt.show()

This gives you a scatter plot in which each point is colored by its Sentiment value, with a legend for the categories.

If you were looking for a color scale on one of the variables instead, you would do the below. However, the maximum value is so large that the resulting color range doesn't make for an effective visual:

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/mayuripandey/Data-Analysis/main/word.csv')

fig, ax = plt.subplots(figsize=(10, 6))
g = ax.scatter(x = df['Friends Network-metrics'], 
               y = df['Number of Followers'],
               c = df['Friends Network-metrics'],
               cmap = "magma")
fig.colorbar(g)

plt.xlabel("Friends Network-metrics")
plt.ylabel("Number of Followers")
 
plt.show()

So you could adjust the scale with vmin and vmax (I'd also add edgecolors = 'black', as it's hard to see the light points):

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/mayuripandey/Data-Analysis/main/word.csv')

fig, ax = plt.subplots(figsize=(10, 6))
g = ax.scatter(x = df['Friends Network-metrics'], 
               y = df['Number of Followers'],
               c = df['Friends Network-metrics'],
               cmap = "magma",
               vmin=0, vmax=10000,
               edgecolors = 'black')
fig.colorbar(g)

plt.xlabel("Friends Network-metrics")
plt.ylabel("Number of Followers")
 
plt.show()
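A different way to handle the very large maximum, offered only as a sketch and not something tested against word.csv, is a logarithmic color scale via matplotlib's LogNorm instead of fixed vmin/vmax. The sample values below are made up, and the lower bound is clipped at 1 because LogNorm needs strictly positive limits:

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import LogNorm

# Hypothetical skewed sample standing in for word.csv (values are made up).
df = pd.DataFrame({
    "Friends Network-metrics": [15, 120, 900, 4500, 250000],
    "Number of Followers": [40, 300, 2200, 8000, 90000],
})

values = df["Friends Network-metrics"]
# Log-scaled color normalization; clip the lower bound at 1 to keep it positive.
norm = LogNorm(vmin=max(values.min(), 1), vmax=values.max())

fig, ax = plt.subplots(figsize=(10, 6))
g = ax.scatter(x=values,
               y=df["Number of Followers"],
               c=values,
               cmap="magma",
               norm=norm,  # pass norm on its own, without vmin/vmax arguments
               edgecolors="black")
fig.colorbar(g)
ax.set_xlabel("Friends Network-metrics")
ax.set_ylabel("Number of Followers")

plt.show()
```

With a log scale the few huge values no longer flatten the rest of the color range, but rows with a value of 0 would need separate handling since they cannot be placed on a log scale.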

chitown88