
I need to make a scatter plot from a dictionary of dictionaries.

My data looks like this:

{'+C': {1: 191, 2: 557}, '+B': None, '-B': None, '+D': None, '+N': {1: 1, 3: 1}, '+L': {1: 2819, 2: 1506}, '>L': None, '<C': {0: 2125}, '<B': None, '<L': {0: 2949, 1: 2062}}

The outer keys are the x-axis labels and the inner keys are the y values. Each inner value is the annotation for its (x, y) point. I tried plotting the data but didn't get the graph I was looking for.

I tried the following, but ended up with repeated labels on my x axis.

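import matplotlib.pyplot as plt

# action_sequence is the dictionary shown above.
# Assumed initialization; it was not shown in the original snippet.
data = {"x": [], "y": [], "label": []}

for action, value in action_sequence.items():
    if value:
        # one point per (action, seq) pair
        for seq, count in value.items():
            data["x"].append(action)
            data["y"].append(seq)
            data["label"].append(count)
    else:
        # no inner dict: placeholder point at y = -1
        data["x"].append(action)
        data["y"].append(-1)
        data["label"].append(0)

print(data)

plt.figure(figsize=(10,8))
plt.title('Scatter Plot', fontsize=20)
plt.xlabel('x', fontsize=15)
plt.ylabel('y', fontsize=15)
# one tick per point, so an action with several points repeats its label
plt.xticks(range(len(data["x"])), data["x"])
plt.scatter(range(len(data["x"])), data["y"], marker = 'o')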
blackmamba

1 Answer


You need to set up integer categories for x, which you can then label on the x axis. The code below uses the action codes in your dict for this. Note that dictionaries did not guarantee any key order before Python 3.7, so the keys are sorted here to get a stable axis order. On Python 3.7 or later (or with collections.OrderedDict on older versions) you could instead use the dict's keys directly, which keeps them in insertion order.

The annotation for each point is a string, placed on the plot one at a time with plt.text(). The x and y axis ranges need to be set explicitly with plt.axis(), since text is not taken into account when the ranges are computed automatically.

import matplotlib.pyplot as plt

action_sequence = {
    '+C': {1: 191, 2: 557}, '+B': None, '-B': None, '+D': None, 
    '+N': {1: 1, 3: 1}, '+L': {1: 2819, 2: 1506}, 
    '>L': None, '<C': {0: 2125}, '<B': None, '<L': {0: 2949, 1: 2062}
}

# x data are categorical; define a lookup table mapping category string to int
x_labels = sorted(action_sequence)
x_values = list(range(len(x_labels)))
lookup_table = {label: i for i, label in enumerate(x_labels)}

# Build a list of points (x, y, annotation) defining the scatter plot.
points = [(lookup_table[action], key, anno)
      for action, values in action_sequence.items()
      for key, anno in (values if values else {}).items()]
x, y, anno = zip(*points)

# y is also categorical, with integer labels for the categories
y_values = list(range(min(y), max(y)+1))
y_labels = [str(v) for v in y_values]

plt.figure(figsize=(10,8))
plt.title('Scatter Plot', fontsize=20)
plt.xlabel('x', fontsize=15)
plt.ylabel('y', fontsize=15)
plt.xticks(x_values, x_labels)
plt.yticks(y_values, y_labels)
plt.axis([min(x_values)-0.5, max(x_values)+0.5, 
          min(y_values)-0.5, max(y_values)+0.5])
#plt.scatter(x, y, marker = 'o')
for x_k, y_k, anno_k in points:
    plt.text(x_k, y_k, str(anno_k))

plt.show()
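
If you would rather keep the x axis in the order the keys were written, the lookup table can be built without sorting. This is only a sketch: it assumes Python 3.7 or later, where plain dicts preserve insertion order, and build_points is a hypothetical helper name, not part of matplotlib.

```python
def build_points(action_sequence):
    """Return (x_labels, points) with x categories in dict insertion order."""
    # Assumption: Python 3.7+, where a plain dict keeps insertion order.
    x_labels = list(action_sequence)
    lookup_table = {label: i for i, label in enumerate(x_labels)}
    # One (x, y, annotation) tuple per inner entry; None values give no points.
    points = [
        (lookup_table[action], key, count)
        for action, values in action_sequence.items()
        for key, count in (values or {}).items()
    ]
    return x_labels, points


action_sequence = {
    '+C': {1: 191, 2: 557}, '+B': None, '-B': None, '+D': None,
    '+N': {1: 1, 3: 1}, '+L': {1: 2819, 2: 1506},
    '>L': None, '<C': {0: 2125}, '<B': None, '<L': {0: 2949, 1: 2062},
}
x_labels, points = build_points(action_sequence)
```

The resulting x_labels and points can be passed to the same plt.xticks() and plt.text() calls as in the code above.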

See the following question for a different way of labeling the scatter plot:

Matplotlib: How to put individual tags for a scatter plot
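
One common alternative (not necessarily the approach taken in the linked question) is to draw the markers with scatter() and attach each label with annotate(), offset a few pixels from its point. The sketch below uses a small subset of the data; annotated_scatter is a hypothetical helper name, and the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt


def annotated_scatter(x_labels, points):
    """Scatter (x, y) markers and label each one with its annotation."""
    fig, ax = plt.subplots(figsize=(10, 8))
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    ax.scatter(xs, ys, marker='o')
    for x_k, y_k, count in points:
        # Place the label 5 points up and to the right of the marker.
        ax.annotate(str(count), xy=(x_k, y_k),
                    xytext=(5, 5), textcoords='offset points')
    # Integer positions on the x axis, labelled with the category names.
    ax.set_xticks(range(len(x_labels)))
    ax.set_xticklabels(x_labels)
    ax.set_xlim(-0.5, len(x_labels) - 0.5)
    return fig, ax


# Illustrative subset of the question's data: (x index, y, annotation).
x_labels = ['+C', '+N', '<C']
points = [(0, 1, 191), (0, 2, 557), (1, 1, 1), (1, 3, 1), (2, 0, 2125)]
fig, ax = annotated_scatter(x_labels, points)
```

Because the markers are real data points, the y range is autoscaled here; only the x limits are set by hand so that every category gets a slot even when it has no points.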

Neapolitan