[Image: hand-drawn sketch of the desired output]

I'm new to Python and am trying to create a heatmap within a heatmap - in other words, use one matrix as the foundational matrix and show which combinations are present or missing in smaller dataframes.

Here is an example using some random numbers:

import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(7)
data = {'a':[1,2,3,4,5,6,7,8,9,10], 
        'b':[1,2,3,4,5,6,7,8,9,10]}
df = pd.DataFrame(data)
#This is the master matrix that has all values


df_2 = pd.DataFrame(np.random.randint(10, size = 100))
#This is the df from which I draw median/mean/etc
matrix = pd.crosstab(df['a'], df['b'], values=df_2, aggfunc='median')
sns.heatmap(matrix) #This is the generated heatmap; note it is 10 by 10

The problem happens when I try to take only certain values from df_2; my heatmap shrinks and I don't see all values.

a = df_2[df_2 > 7]
matrix_2 = pd.crosstab(df['a'], df['b'], values=a, aggfunc='median')
sns.heatmap(matrix_2) #Resulting matrix is now 3 by 3.

How can I make the resulting 3 by 3 matrix appear on the master matrix, i.e., showing all the missing values?
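
One direction that might work, as a minimal sketch only: reindex the smaller crosstab onto the master matrix's row and column labels, so that combinations absent from the subset come back as NaN instead of being dropped. The toy data here (`records`, the `value` column, and the `(a * b) % 10` rule) is hypothetical and only stands in for the real dataset.

```python
import pandas as pd

# Hypothetical toy data: one value per (a, b) pair, chosen deterministically.
levels = range(1, 6)
records = pd.DataFrame(
    [(a, b, (a * b) % 10) for a in levels for b in levels],
    columns=["a", "b", "value"],
)

# Master matrix: every combination of a and b is present (5 by 5).
master = pd.crosstab(records["a"], records["b"],
                     values=records["value"], aggfunc="median")

# Subset: only rows with value > 7 survive, so crosstab sees fewer labels.
subset = records[records["value"] > 7]
small = pd.crosstab(subset["a"], subset["b"],
                    values=subset["value"], aggfunc="median")

# Re-expand onto the master grid; cells missing from the subset become NaN.
full = small.reindex(index=master.index, columns=master.columns)

print(master.shape, small.shape, full.shape)
```

Passing `full` to `sns.heatmap` should then draw the full-size grid, since seaborn leaves NaN cells uncolored.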

Any help would be greatly appreciated!

Tazboy
  • Could you add some example data (4 values in a 2x2 matrix possibly are sufficient) and a more detailed description (or approximate image) of the needed output? Now your incomplete data seem to suggest just one matrix. Will this be the larger or the smaller matrix? Does larger and smaller mean an outer and multiple inner matrices? Or ....? – JohanC Jan 09 '21 at 17:48
  • Hi, thank you: I am having some trouble describing it. The first matrix is, for example, 5*5 (so it has 5 values on the y axis, 5 on the x axis, and an additional median value such as temperature). The second matrix is taken from within this larger matrix, but doesn't have all the values; for example, it has 2, 4 and 5 but not 1 and 3. So matrix 1 is 5 * 5 but matrix 2 is 3 * 3. When I try to plot this using seaborn, it makes matrix 2 3 * 3 but I cannot overlay them - this is important because I want to show missing values. This is analogous to using "hue" for categories in other sns plots. – Tazboy Jan 10 '21 at 20:59
  • Thank you for your patience! I have drawn it out above. I hope this makes it clear--using sns.heatmap(), I lost the "empty" values which are important in showing what's not there! – Tazboy Jan 12 '21 at 22:26
  • Lovely drawing, and I feel we are getting closer. But I am afraid it is still not clear to me. What are these missing values, and why are there two categories of missing values? Maybe a toy dataset with code to reproduce the undesired outcome would help. Also, would you like then the output to be three heatmaps in your example or should it be one heatmap with color coding or similar indicating that there are three categories? – Mr. T Jan 13 '21 at 01:55
  • Okay! I added a random data set, although I couldn't figure out how to make it a 'real' heatmap. My actual data set, which is large, would have values more in line with the hand-drawn sketch. – Tazboy Jan 13 '21 at 14:23
  • I finally had time to look at it. And I am afraid I still don't know what you want to achieve. Either the same cell has to show [more than one value](https://stackoverflow.com/a/64362551/8881141) or the color code already tells us that df2 > 7 in which case you might want to define a [segmented colormap](https://stackoverflow.com/q/38836154/8881141). – Mr. T Jan 15 '21 at 21:23
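
The color-coding idea in the last comment could be prepared by first turning every cell of the master grid into a category code. This is only a sketch under assumptions: the 3 by 3 `master` matrix, the three categories, and the name `codes` are all hypothetical, and only the threshold of 7 comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical master matrix of medians; NaN marks a combination with no data.
master = pd.DataFrame(
    [[1.0, 8.0, np.nan],
     [9.0, 3.0, 5.0],
     [np.nan, 7.0, 8.0]],
    index=pd.Index([1, 2, 3], name="a"),
    columns=pd.Index([1, 2, 3], name="b"),
)

threshold = 7  # same cut-off as df_2 > 7 in the question

# 0 = no data, 1 = value at or below the threshold, 2 = value above it.
codes = pd.DataFrame(
    np.select(
        [master.isna().to_numpy(), (master > threshold).to_numpy()],
        [0, 2],
        default=1,
    ),
    index=master.index,
    columns=master.columns,
)

print(codes)
```

A three-color `matplotlib.colors.ListedColormap` passed as `cmap` to `sns.heatmap(codes)` would then give each category its own color while keeping the grid at its full size.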

0 Answers