I need to create a scatterplot from a dictionary that maps DNA sequence IDs to molecular weights. Many of the DNA sequences are ambiguous, so they can have many possible molecular weights (and thus there are many values per key). The dictionary looks something like this, although many of the keys actually have far more values (I've removed some for the sake of brevity):
{'seq_7009': [6236.9764, 6279.027699999999,
6319.051799999999, 6367.049999999999],
'seq_418': [3716.3642000000004, 3796.4124000000006],
'seq_9143_unamb': [4631.958999999999],
'seq_2888': [5219.3359, 5365.4089],
'seq_1101': [4287.7417, 4422.8254]}
I have another function called get_all_weights that generates this dictionary, so I'm trying to call that function and then graph the results. This is what I have so far, based on another post on this site, but it doesn't work:
import matplotlib.pyplot as plt
import itertools

def graph_weights(file_name):
    with open(file_name) as file:
        d = {}  # Initialize a dictionary and then fill it with the results of the get_all_weights function
        d.update(get_all_weights(file_name))
        for k, v in d.items():
            x = [key for (key, values) in b.items() for _ in range(len(values))]
            y = [val for subl in d.values() for val in subl]
        ax.plot(x, y)
        plt.show()
Does anyone know how I can achieve this? The plot should show the sequence IDs on the x axis and the values on the y axis and it should make it clear that the same value can occur multiple times.
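For reference, here is a minimal sketch of the kind of result I'm after, not a tested solution. It assumes the dictionary has already been built (for example by get_all_weights) and is passed in directly. The helper name flatten_weights, the axis labels, and the alpha value are placeholders I made up for illustration. The idea is to repeat each sequence ID once per weight so that x and y have the same length, then pass both lists to a scatter call, which (as far as I know) treats a list of strings as categories in recent matplotlib versions.

```python
def flatten_weights(weights):
    """Turn {seq_id: [w1, w2, ...]} into parallel lists of IDs and weights."""
    x, y = [], []
    for seq_id, values in weights.items():
        for value in values:
            x.append(seq_id)  # repeat the ID once per weight
            y.append(value)
    return x, y


def graph_weights(weights):
    """Scatter every weight against its sequence ID (hypothetical helper)."""
    # Imported here so flatten_weights can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    x, y = flatten_weights(weights)
    fig, ax = plt.subplots()
    # Semi-transparent markers so overlapping points appear darker.
    ax.scatter(x, y, alpha=0.5)
    ax.set_xlabel("Sequence ID")
    ax.set_ylabel("Molecular weight")
    ax.tick_params(axis="x", labelrotation=45)  # keep long IDs readable
    fig.tight_layout()
    plt.show()


# Small made-up sample in the same shape as the dictionary above.
example = {
    "seq_418": [3716.3642, 3796.4124],
    "seq_9143_unamb": [4631.959],
}
x, y = flatten_weights(example)
```

Calling graph_weights(example) should then open a window with one column of points per sequence ID. With the full data set there may be too many IDs to label legibly, so the tick rotation is only a stopgap.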