I have 2D data with string labels in a dataframe:

df = pd.DataFrame(data, columns = ['dim1', 'dim2', 'label'])

The labels are strings that have an ordering, e.g. 'small', 'small-medium', 'medium', 'medium-big', 'big' (simplified for the purposes of the question).

I would like to plot my data on a scatterplot in such a way that the colors reflect the ordering (so I'm going to use a perceptually uniform sequential colormap).

Currently, here's what I have, which just plots the datapoints and colors them based on their labels:

groups = df.groupby('label')

fig = plt.figure(figsize=[20, 20])
ax = fig.add_subplot(111)

for name, group in groups:
    ax.plot(group.dim1, group.dim2, label=name, marker='o', linestyle='', markersize=12)
ax.legend(fontsize=20)

How can I adjust the code so that it does what I want?

An Ignorant Wanderer
  • I don't understand what you want. Do you want the legend to be ordered as *'small', 'small-medium', 'medium', 'medium-big', 'big'*, like https://i.stack.imgur.com/9xgKG.png? – Ynjxsjmh Aug 11 '20 at 04:32
  • @Ynjxsjmh Well, no, not just the legend (though I do want that to be ordered as well). I'd like the colors that correspond to the categories, from big to small, to be ordered too. Take a look at https://matplotlib.org/3.1.1/gallery/color/colormap_reference.html. Let's say I choose the "inferno" colormap: I'd like big to correspond to the leftmost part of the color spectrum and small to the rightmost. – An Ignorant Wanderer Aug 11 '20 at 15:14

1 Answer

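Plot the groups in an explicit order so that the legend entries are ordered, and pick each group's color from evenly spaced positions in the colormap so that the colors follow the same order.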

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


data = {'dim1':  range(1, 7),
        'dim2': range(11, 17),
        'label': [ 'small', 'small-medium', 'medium', 'medium-big', 'big', 'small']
        }

df = pd.DataFrame(data, columns = ['dim1', 'dim2', 'label'])

groups = df.groupby('label')

fig = plt.figure(figsize=[20, 20])
ax = fig.add_subplot(111)

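# Desired order, reversed so that 'big' gets the first (darkest) inferno color
labels = ['small', 'small-medium', 'medium', 'medium-big', 'big']
labels.reverse()
# inferno is a ListedColormap, so .colors is a list of 256 RGB triples
colors = plt.get_cmap('inferno').colors
# Index distance between the colors of consecutive labels
step = len(colors) // len(labels)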


for i, label in enumerate(labels):
    for name, group in groups:
        if label == name:
            ax.plot(group.dim1, group.dim2, label=name, marker='o', linestyle='', markersize=12, color=colors[i*step])

ax.legend(fontsize=20)

plt.show()
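
As a possible alternative to matching names in a nested loop, the label column could be converted to an ordered pandas categorical, so that `groupby` itself yields the groups in the intended order. This is only a sketch: the `order` list and the `group_names` and `codes` variables are names introduced here for illustration, and the sample data is the same as above.

```python
import pandas as pd

# Intended ordering of the labels, smallest first (assumed from the question)
order = ['small', 'small-medium', 'medium', 'medium-big', 'big']

df = pd.DataFrame({
    'dim1': range(1, 7),
    'dim2': range(11, 17),
    'label': ['small', 'small-medium', 'medium', 'medium-big', 'big', 'small'],
})

# An ordered categorical makes pandas sort by category order, not alphabetically
df['label'] = pd.Categorical(df['label'], categories=order, ordered=True)

# groupby now iterates in category order; observed=True skips unused categories
group_names = [name for name, _ in df.groupby('label', observed=True)]

# Integer position of each row's label within `order`, usable as a color index
codes = df['label'].cat.codes.tolist()
```

With this in place, a single `for name, group in df.groupby('label', observed=True):` loop visits the groups from 'small' to 'big', and the legend follows that order.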

The `step` indexing is a naive way to pick evenly spaced elements from a list. For a more thorough treatment, see "Select N evenly spaced out elements in array, including first and last".
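
One way to cover the whole colormap, including its first and last colors, might be to sample it at evenly spaced positions with `np.linspace` rather than with an integer step. The sketch below assumes that calling the colormap object with a float in [0, 1] is acceptable for this use; `positions` and `label_colors` are names introduced here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

labels = ['small', 'small-medium', 'medium', 'medium-big', 'big']
cmap = plt.get_cmap('inferno')

# Evenly spaced positions from 0 to 1, endpoints included
positions = np.linspace(0, 1, len(labels))

# Calling a colormap with a float in [0, 1] returns an RGBA tuple
label_colors = {label: cmap(pos) for label, pos in zip(labels, positions)}
```

Here 'small' gets the first color of the colormap and 'big' the last. Iterating over `labels[::-1]` instead would flip the mapping. Each value can be passed as `color=label_colors[name]` in the plotting call.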

Ynjxsjmh