
I am attempting to plot 7 subplots (the data is sorted into 7 categories), each showing multiple data sets. My function currently works for one set of data, and I have attempted to iterate over *args. When I do this, it appears that both sets of data are not being added. How can I iterate over each argument (a dictionary of data set values) and add its values to each of the 7 plots in a different color?

def plot_histo(*args,num_bins):
    #print(dic_data)
    fig = plt.figure()
    fig, axes = plt.subplots(2, 4, figsize=(14,6))
    axes = axes.ravel()
    for arg in args:
        for i, (key, value) in enumerate(arg.items()):
            #print(i, key, value)
            axes[i].hist(value,num_bins,stacked= True, alpha=0.5)
            axes[i].set(title=key.upper(), xlabel='R-Resistance',ylabel='N')
            #axes[i].legend()
        plt.tight_layout()
        plt.savefig('wafer_probe_results.png')
        plt.show()
p_histo = plot_histo(w21_results,w22_results,num_bins=10)   

Note the args are dictionaries where I only care about the value. The code for a single set of data is as follows.

def plot_histo(dic_data,num_bins):
    #print(dic_data)
    fig = plt.figure()
    fig, axes = plt.subplots(2, 4, figsize=(14,6))
    axes = axes.ravel()
    for i, (key, value) in enumerate(dic_data.items()):
        #print(i, key, value)
        axes[i].hist(value,num_bins,stacked= True, alpha=0.5)
        axes[i].set(title=key.upper(), xlabel='R-Resistance',ylabel='N')
        #axes[i].legend()
    plt.tight_layout()
    plt.savefig('wafer_probe_results.png')
    plt.show()
p_histo = plot_histo(w21_results,num_bins=10)

Where dic_data is the dictionary:

dic_data ={'CPW': array([4.2, 4.1, 4.3, 4.3, 4.2, 4.3, 4.1, 4.2, 4.2, 4.1, 4.2, 4.2, 4. ,
       4.1, 4.3, 4.1, 4.1, 4.4, 4.1, 4.2, 4.3, 4.1, 4.1, 4.1, 4.2, 4.2,
       4.4, 4.2]),'CPW to Ground': array([33333., 99960., 99900., 33323., 99950., 99900., 99990., 99950.,
       99890., 99990., 99930., 99900., 99990., 99930., 99890., 49980.,
       99940., 99890., 99960., 99930., 99890., 99910., 99910., 99900.,
       99910., 99900., 99890., 99890.]),etc...}

I am really close; any help or tips are appreciated.

Edit/Update:

#function plots histograms
def plot_histo(*args,num_bins):
    #print(args)
    datasets = (args)
    #print(datasets)
    fig = plt.figure()
    fig, axes = plt.subplots(2, 4, figsize=(14,6))
    axes = axes.ravel()
    categories = args[0].keys()
    for axes,key in zip(axes,categories):
        values = [dictionary[key] for dictionary in args]
        axes.hist(values, num_bins, label=names)
    plt.tight_layout()
p_histo = plot_histo(w21_results,w22_results,w24_results,w23_results,num_bins=10)
plt.savefig('wafer_probe_results.png')
plt.show()

This gives the correct number of graphs and the correct number of lines, but the data does not appear to be read or stored correctly.


astro_coder
  • Also there are a total of 6 dictionaries that will be passed as arguments – astro_coder Jan 05 '22 at 18:53
  • any reason not to loop over dataset and call plot_histo()? u can create a 3rd argument to pass in dataset id? like `for id, d in enumerate(dataset): plot_histo(d, bin, id)` so ur figure will be stored with appropriate dataset id. – uhmas Jan 05 '22 at 19:30
  • ?? You have multiple dictionaries each with the same keys?? For key `"CPW"` you want a single histogram using the values from all the dictionaries? – wwii Jan 05 '22 at 20:18
  • related: [Plot two histograms on single chart with matplotlib](https://stackoverflow.com/questions/6871201/plot-two-histograms-on-single-chart-with-matplotlib) – wwii Jan 05 '22 at 20:41
  • Please better describe your undesired results as your description is not too clear: *both sets of data are not being added*. Do note you are saving all runs to same named .png file, so only last one may reflect. – Parfait Jan 05 '22 at 21:54
  • [The histogram (hist) function with multiple data sets](https://matplotlib.org/stable/gallery/statistics/histogram_multihist.html) from the Matplotlib docs. – wwii Jan 05 '22 at 22:29
  • @Parfait I have coded for one dictionary (contains 7 arrays) and I have a total of 6 data sets, so 6 dictionaries. There’s a total of 7 graphs in the subplot because each of the 6 dictionaries contains the same categorical data sets. I am trying to adjust my code to take in each dictionary and add those values to their separate plots while keeping them distinct. My undesired result right now, with the code that has been adjusted to accept multiple dictionaries, is that the values are not being added to their individual plots, and if they are, they aren’t being shown as distinct sets of data – astro_coder Jan 06 '22 at 14:12
  • @SamHu I believe this is what I’m looking to do I just don’t understand where this would go in the scheme of what I have and what should I get rid of? What would be the 3rd argument? Can you explain what Id, d are? Which is the 3rd? – astro_coder Jan 06 '22 at 14:17
  • `but is not reading or storing the data right` - that is not very helpful, what is the problem? – wwii Jan 08 '22 at 04:03

1 Answer


Iterate over keys first then dicts. The outer loop establishes the category being processed and the axes to plot on. The inner loop (list comprehension) extracts the same category/key from each dataset/dict. Use the method shown in this answer to place multiple histograms on each plot. (apologies for any errors, I don't have numpy or matplotlib to test this solution)

# relevant imports
import matplotlib.pyplot as plt

d = dict(zip('abc',[[1,1,1],[2,2,2],[3,3,3]]))
e = dict(zip('abc',[[11,11,11],[22,22,22],[33,33,33]]))
datasets = (d,e)
dataset_names = [str(n) for n in range(len(datasets))]
bins = 10

fig, axes = plt.subplots(2, 4, figsize=(14,6))
axes = axes.ravel()

for index,key in enumerate(datasets[0]):
    values = [dictionary[key] for dictionary in datasets]
    axes[index].hist(values, bins, label=dataset_names)

Minor simplification to avoid indexing into the list of axes:

axes = axes.ravel()
for ax,key in zip(axes,datasets[0]):
    values = [dictionary[key] for dictionary in datasets]
    ax.hist(values, bins, label=dataset_names)
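
To make it concrete what the list comprehension hands to hist, here is a minimal sketch. It assumes a headless Agg backend so it runs without a display, and the two toy dictionaries are made up for illustration. Passing a list with one sequence per dataset makes hist draw one group of bars per dataset on the same axes, using shared bin edges.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

# Two made-up datasets that share the same keys
d = {"a": [1, 1, 2], "b": [2, 3, 3]}
e = {"a": [1, 2, 2], "b": [3, 3, 3]}
datasets = (d, e)

fig, ax = plt.subplots()

# One sequence per dataset, all taken from the same category "a"
values = [dictionary["a"] for dictionary in datasets]

# hist receives a list of sequences and plots them side by side
counts, edges, patches = ax.hist(values, bins=4, label=["d", "e"])
ax.legend()
```

Each entry of counts holds the per-bin counts for one dataset, and edges holds the bin edges common to all of them.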

A function similar to yours would be

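def f(*args,num_bins):
    # plt.subplots creates the figure itself; no separate plt.figure() is needed
    fig, axes = plt.subplots(2, 4, figsize=(14,6))
    axes = axes.ravel()
    categories = args[0].keys()
    # one label per dataset, built locally instead of relying on a global
    names = [str(n) for n in range(len(args))]
    for ax,key in zip(axes,categories):
        values = [dictionary[key] for dictionary in args]
        ax.hist(values, num_bins, label=names)
    plt.tight_layout()
    plt.savefig('wafer_probe_results.png')
    plt.show()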

f(*datasets,num_bins=bins)

Fake data:

import random
n_datasets = 6   
n_categories = 7 
datasets = []    
for _ in range(n_datasets):
    data = [[random.choice(range(7)) for _ in range(50)] for _ in range(n_categories)]
    datasets.append(dict(zip('abcdefghijklmnop',data)))
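
Putting the pieces together, the following is a sketch of a complete function run against fake data like the above. It is untested against the real wafer data. The names and filename parameters, the "set N" labels, and the Agg backend are assumptions added for illustration and are not part of the original question. It also hides the unused eighth axes and adds titles and a legend so the datasets can be told apart.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def plot_histo(*datasets, num_bins, names=None, filename=None):
    """Plot one subplot per category, with every dataset drawn in each subplot."""
    if names is None:
        # hypothetical default labels, one per dataset
        names = [f"set {n}" for n in range(len(datasets))]
    categories = list(datasets[0])  # keys of the first dict define the subplots

    fig, axes = plt.subplots(2, 4, figsize=(14, 6))
    axes = axes.ravel()

    for ax, key in zip(axes, categories):
        # same category pulled out of every dataset
        values = [dataset[key] for dataset in datasets]
        ax.hist(values, num_bins, label=names)
        ax.set(title=str(key).upper(), xlabel='R-Resistance', ylabel='N')

    # hide any axes that did not receive a category (the 8th one here)
    for ax in axes[len(categories):]:
        ax.set_visible(False)

    axes[0].legend()  # a single legend is enough; colors repeat across subplots
    fig.tight_layout()
    if filename is not None:
        fig.savefig(filename)
    return fig, axes


# Fake data: 6 datasets, 7 categories each
random.seed(0)
fake = [
    {key: [random.choice(range(7)) for _ in range(50)] for key in 'abcdefg'}
    for _ in range(6)
]
fig, axes = plot_histo(*fake, num_bins=10)
```

Returning fig and axes instead of calling plt.show() inside the function leaves saving and displaying to the caller, which avoids overwriting the same PNG on every call.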
wwii