I've a dataset like this:

1 1 0.5378291300966559
1 2 0.5536607043661815
2 2 0.5524941673147428
1 3 0.5736584823908455
2 3 0.5759360071103211
3 3 0.5874347294745028
1 4 0.5926563715142762
2 4 0.5928230196644817
3 4 0.5994333962893011
4 4 0.6093211865348295
1 5 0.6073769581157649
2 5 0.6092100877680258
3 5 0.6138206865903788
4 5 0.6182646372625263
5 5 0.6275413842906343

The goal is to plot a heatmap of the values, where the first two columns are the axes and the third column is the value.

I've read the values into a DataFrame and pivoted it:

data_str = """1 1 0.5378291300966559
1 2 0.5536607043661815
2 2 0.5524941673147428
1 3 0.5736584823908455
2 3 0.5759360071103211
3 3 0.5874347294745028
1 4 0.5926563715142762
2 4 0.5928230196644817
3 4 0.5994333962893011
4 4 0.6093211865348295
1 5 0.6073769581157649
2 5 0.6092100877680258
3 5 0.6138206865903788
4 5 0.6182646372625263
5 5 0.6275413842906343""".split('\n')

import pandas as pd


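# Parse each "min max score" line into a record.
data = [dict(zip(('min', 'max', 'score'), line.split())) for line in data_str]
# Keyword arguments work across pandas versions; pandas 2.0+ no longer accepts them positionally.
df = pd.DataFrame(data, dtype=float).pivot(index='min', columns='max', values='score')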

When I tried the solution from https://stackoverflow.com/a/59173863/610569, the plot only showed a straight line.

What I expect instead is a triangular heatmap of the values in the score column. How should I go about plotting that?

alvas

2 Answers

The function in that answer is named get_lower_tri_heatmap, so it expects a lower-triangular matrix. Your df is upper-triangular:

df  # upper tri
Out[101]: 
max       1.0       2.0       3.0       4.0       5.0
min                                                  
1.0  0.537829  0.553661  0.573658  0.592656  0.607377
2.0       NaN  0.552494  0.575936  0.592823  0.609210
3.0       NaN       NaN  0.587435  0.599433  0.613821
4.0       NaN       NaN       NaN  0.609321  0.618265
5.0       NaN       NaN       NaN       NaN  0.627541

Try passing df.T to the function:

get_lower_tri_heatmap(df.T)
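
The transpose step can also be tried without the linked helper. What follows is only a minimal sketch, assuming pandas and matplotlib are installed; get_lower_tri_heatmap itself is not reproduced here, plot_triangle is a hypothetical name used for illustration, and only the first six rows of the question's data are used.

```python
import matplotlib.pyplot as plt
import pandas as pd

# First six rows of the question's data, as (min, max, score).
rows = [
    (1, 1, 0.5378291300966559),
    (1, 2, 0.5536607043661815),
    (2, 2, 0.5524941673147428),
    (1, 3, 0.5736584823908455),
    (2, 3, 0.5759360071103211),
    (3, 3, 0.5874347294745028),
]
df = pd.DataFrame(rows, columns=["min", "max", "score"])

# Rows are 'min', columns are 'max': values sit on and above the diagonal.
upper = df.pivot(index="min", columns="max", values="score")
# Transposing moves them on and below the diagonal.
lower = upper.T


def plot_triangle(frame):
    """Hypothetical helper: draw a pivoted frame as a heatmap."""
    fig, ax = plt.subplots()
    # imshow leaves NaN cells unpainted, so the empty triangle stays blank.
    im = ax.imshow(frame.to_numpy(), cmap="viridis")
    ax.set_xticks(range(frame.shape[1]))
    ax.set_xticklabels(frame.columns)
    ax.set_yticks(range(frame.shape[0]))
    ax.set_yticklabels(frame.index)
    ax.set_xlabel(frame.columns.name)
    ax.set_ylabel(frame.index.name)
    fig.colorbar(im, ax=ax)
    return fig


fig = plot_triangle(lower)
```

Calling fig.savefig with a file name, or plt.show(), displays the result.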
BENY

I think you should first create a zero-filled NumPy array and then assign the values into it. It should look something like this:

import matplotlib.pyplot as plt
import numpy as np
a = np.zeros((5, 5))
t = """1 1 0.5378291300966559
1 2 0.5536607043661815
2 2 0.5524941673147428
1 3 0.5736584823908455
2 3 0.5759360071103211
3 3 0.5874347294745028
1 4 0.5926563715142762
2 4 0.5928230196644817
3 4 0.5994333962893011
4 4 0.6093211865348295
1 5 0.6073769581157649
2 5 0.6092100877680258
3 5 0.6138206865903788
4 5 0.6182646372625263
5 5 0.6275413842906343"""
for line in t.splitlines():
    i, j, score = line.split()
    a[int(i) - 1, int(j) - 1] = float(score)
plt.imshow(a, cmap='hot', interpolation='nearest')
plt.show()
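
One possible refinement: with np.zeros, the cells that have no score are drawn as 0, which stretches the color scale well below the actual scores (roughly 0.54 to 0.63). The sketch below fills the array with NaN instead, so those cells are left blank. It is an illustration rather than a definitive version; the names triples and grid are made up here, and only the first six rows of the data are used.

```python
import matplotlib.pyplot as plt
import numpy as np

# First six rows of the question's data.
t = """1 1 0.5378291300966559
1 2 0.5536607043661815
2 2 0.5524941673147428
1 3 0.5736584823908455
2 3 0.5759360071103211
3 3 0.5874347294745028"""

triples = [line.split() for line in t.splitlines()]
# Size the grid from the largest index instead of hard-coding it.
n = max(max(int(i), int(j)) for i, j, _ in triples)

grid = np.full((n, n), np.nan)  # NaN marks cells without a score
for i, j, score in triples:
    grid[int(i) - 1, int(j) - 1] = float(score)

fig, ax = plt.subplots()
# The color scale now spans only the real scores; NaN cells are not drawn.
im = ax.imshow(grid, cmap="hot", interpolation="nearest")
fig.colorbar(im, ax=ax)
```

As in the answer's code, plt.show() displays the figure.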
Gustasvs