I have a dataframe that contains my x and y data (later converted into lists). For each (x, y) point I have three attributes that are fractions summing to 1 (i.e. percentages that add up to 100%). For example, for a given point the first attribute might be 0.05, the second 0.45 and the third 0.5.

         pc1       pc2      %_h        %_s      %_c  
0    -2.319093 -4.058990  0.718839  0.074559  0.206602  
1     1.514446 -2.324842  0.552632  0.157895  0.289474  
2    -2.431196 -1.938358  0.440313  0.071755  0.487932  
3    -2.642250 -1.001307  0.707883  0.058733  0.233385 
4    -1.486477 -2.537368  0.617834  0.151956  0.230209  
5    -1.990138 -3.457012  0.326633  0.088358  0.585008 
6    -0.844124 -3.081770  0.550000  0.113636  0.336364  
7    -2.376568 -1.471469  0.663071  0.196066  0.140863  
8    -3.139226  0.451762  0.696914  0.056173  0.246914  
    :

My aim is to make some sort of heatmap that plots x (pc1) against y (pc2) and shows the extent of each attribute, so that darker sections mean a higher value of a particular attribute. Below is the heatmap I get using:

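import matplotlib.pyplot as plt

# pc1 and pc2 are the lists taken from the dataframe columns of the same name
plt.figure(figsize=(16, 16))
plt.hist2d(pc1, pc2, bins=50, cmap=plt.cm.jet)
plt.show()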

[plot]

However, this is based on the density of the (x, y) points. I want it to be based on my three attributes instead, i.e. red regions should be those with a high %_h/%_s/%_c value.

My research says it's something to do with gridding the data and working out the frequency for each cell to get some sort of z dimension. I've tried "Plotting a 2D heatmap with Matplotlib" and "Generate a heatmap in MatPlotLib using a scatter data set" (the latter is based on only 2 dimensions), and have had no luck.

I would also like to plot a line graph of the density of each of the three attributes along x and y, using something like seaborn's sns.jointplot.

EDIT What I think the approach would be is to plot a heatmap that represents the first attribute and then layer the other attributes on top, one after another. But I am not sure how to make the heatmap colour represent the attribute value rather than the density counts of x,y.
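One way to read the "gridding" idea above is to compute, for every cell of the pc1/pc2 grid, the mean of an attribute rather than the number of points. The following is only a sketch under that assumption: the helper names `mean_per_cell` and `plot_attribute` are hypothetical and not from the original post, and the mean is just one possible statistic. It relies on `np.histogram2d` accepting a `weights` argument, so the per-cell sum divided by the per-cell count gives the per-cell mean.

```python
import numpy as np


def mean_per_cell(x, y, values, bins=50):
    """Mean of `values` in each (x, y) grid cell; NaN where a cell is empty."""
    x, y, values = (np.asarray(a, dtype=float) for a in (x, y, values))
    # Number of points per cell (this is what plt.hist2d colours by).
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    # Sum of the attribute per cell, reusing the same bin edges.
    sums, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=values)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = np.where(counts > 0, sums / counts, np.nan)
    return means, x_edges, y_edges


def plot_attribute(x, y, values, label, bins=50):
    """Hypothetical helper: heatmap of one attribute on the pc1/pc2 plane."""
    import matplotlib.pyplot as plt  # imported here so the gridding works without it

    means, x_edges, y_edges = mean_per_cell(x, y, values, bins)
    fig, ax = plt.subplots(figsize=(8, 8))
    # histogram2d indexes its result as [x, y]; pcolormesh expects [y, x], hence .T
    mesh = ax.pcolormesh(x_edges, y_edges, means.T, cmap="jet", vmin=0, vmax=1)
    fig.colorbar(mesh, label=label)
    ax.set_xlabel("pc1")
    ax.set_ylabel("pc2")
    return fig, ax
```

With the dataframe from the question this might be called as `plot_attribute(df['pc1'], df['pc2'], df['%_h'], '%_h')`, once per attribute, followed by `plt.show()`. Empty cells are left as NaN so they are not confused with cells whose mean is 0.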

EDIT This is after attempting the suggested code.

[plot2]

Haj Sai

1 Answer

Have you tried pcolormesh? Example: https://matplotlib.org/3.1.1/gallery/images_contours_and_fields/pcolormesh_levels.html

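import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# One row per attribute, one column per dataframe row: shape (3, len(df))
data = np.array([df['%_h'], df['%_s'], df['%_c']]).reshape(3, len(df))
cax = ax.pcolormesh(df.index, np.arange(0, 3), data, shading='gouraud', cmap='jet')
fig.colorbar(cax)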
P K
  • Can you share part of your dataframe, including x and y? – P K Aug 08 '19 at 16:12
  • I've added this to the original question: x,y is pc1,pc2 – Haj Sai Aug 08 '19 at 19:51
  • Ah, I see. Try the code in my edited answer. In the plot you will get, the x axis is each (x,y) entry (each row in the dataframe), the y axis is the 3 attributes, and the color spectrum is how "strong" the attribute value is. – P K Aug 09 '19 at 15:51
  • Thank you for this, not quite what I was looking for. I was hoping to retain the original shape seen in my plot, so the x axis is pc1 and the y axis is pc2, and maybe layer the plots for each attribute in some way.. – Haj Sai Aug 09 '19 at 16:20
  • Then you can either remove the reshape completely or change to reshape(3,6912). – P K Aug 09 '19 at 16:22
  • I've added the resulting plot that I get.. I removed the reshape and still get the length of the df rows on the x axis instead of the pc1 column values – Haj Sai Aug 09 '19 at 16:47
  • May I also ask where '6912' is from? Thank you – Haj Sai Aug 09 '19 at 16:57
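
For the layout requested in the comments (pc1 on the x axis, pc2 on the y axis, with the three attributes layered), one possibility is to map the per-cell mean of each attribute to one colour channel. This is a sketch only and is not taken from the answer above: the helper names `rgb_grid` and `show_rgb` are hypothetical, and assigning %_h, %_s, %_c to red, green and blue is an arbitrary choice. It assumes the attributes lie in [0, 1], as they do in the sample data.

```python
import numpy as np


def rgb_grid(x, y, attrs, bins=50):
    """Per-cell means of three attributes stacked as an RGBA image array.

    attrs: sequence of three arrays (assumed here to be %_h, %_s, %_c).
    Returns (rgba, extent), with rgba of shape (ny, nx, 4).
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    filled = counts > 0
    safe_counts = np.where(filled, counts, 1)  # avoid dividing by zero
    channels = []
    for a in attrs:
        sums, _, _ = np.histogram2d(
            x, y, bins=[x_edges, y_edges], weights=np.asarray(a, dtype=float)
        )
        channels.append(sums / safe_counts)
    channels.append(filled.astype(float))  # alpha: empty cells are transparent
    # Transpose each channel so rows follow y and columns follow x.
    rgba = np.stack([c.T for c in channels], axis=-1)
    extent = (x_edges[0], x_edges[-1], y_edges[0], y_edges[-1])
    return np.clip(rgba, 0.0, 1.0), extent


def show_rgb(x, y, attrs, bins=50):
    """Hypothetical helper: draw the composite with imshow."""
    import matplotlib.pyplot as plt

    rgba, extent = rgb_grid(x, y, attrs, bins)
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.imshow(rgba, origin="lower", extent=extent, aspect="auto")
    ax.set_xlabel("pc1")
    ax.set_ylabel("pc2")
    return fig, ax
```

A call such as `show_rgb(df['pc1'], df['pc2'], [df['%_h'], df['%_s'], df['%_c']])` would then show regions dominated by %_h as red, %_s as green and %_c as blue, with mixed regions in between. Whether this reads better than three separate heatmaps depends on the data.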