I am trying to create a 2D array so that I can draw a heatmap with matplotlib.pyplot, similar to the example here: A simple categorical heatmap.
I have looked at the solutions here: How to select rows from a DataFrame based on column values? and here: Return single cell value from Pandas DataFrame, but I cannot get them to work for my purpose.
Here is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
age = np.unique(ageVehicle['Age'])
vehicle = np.unique(ageVehicle['_Vehicle_Type'])
ageVehicleType = np.array([])
innerList = np.array([])
for i in age:
    for j in vehicle:
        if len(innerList) == len(vehicle) - 1:
            innerList+=(int(ageVehicle.loc[(ageVehicle['_Vehicle_Type'] == j) & (ageVehicle['Age'] == i)]['_Count(vehicle_Type)'].values))
            ageVehicleType.append(innerList)
            innerList = np.array([])
            break
        else:
            innerList+=(int(ageVehicle.loc[(ageVehicle['_Vehicle_Type'] == j) & (ageVehicle['Age'] == i)]['_Count(vehicle_Type)'].values))
fig, ax = plt.subplots()
im = ax.imshow(ageVehicleType)
# We want to show all ticks...
ax.set_xticks(np.arange(len(vehicle)))
ax.set_yticks(np.arange(len(age)))
# ... and label them with the respective list entries
ax.set_xticklabels(vehicle)
ax.set_yticklabels(age)
# Rotate the tick labels and set their alignment.
plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
         rotation_mode="anchor")
fig.tight_layout()
plt.show()
My dataframe ageVehicle has 3 columns: Age, _Vehicle_Type and _Count(vehicle_Type). In the nested for loops (for i ... for j:) I am basically trying to build 1D arrays innerList, which will be combined into a 2D array ageVehicleType. The age and vehicle lists contain the unique values of age and vehicle in my ageVehicle dataframe.
For example:
age = [8,9,10,11,12,13,14,15,16]
vehicle = ['toyota', 'bmw', 'mazda', 'benz', 'tesla']
_Count(vehicle_Type) is how many of each combination of age and vehicle there are.
The 2D array ageVehicleType will essentially hold all possible combinations of age and vehicle in the dataframe ageVehicle. This 2D array will provide the values that determine the colors on the heatmap.
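For illustration, a grid of this kind can also be obtained by pivoting the long-format table instead of looping. The following is only a minimal sketch: the sample rows are made up (the real ageVehicle data is not shown here), and filling missing age/vehicle combinations with 0 is an assumption about what an absent pair should mean.

```python
import pandas as pd

# Hypothetical sample rows with the same three columns as ageVehicle
ageVehicle = pd.DataFrame({
    "Age": [8, 8, 9, 10],
    "_Vehicle_Type": ["bmw", "toyota", "bmw", "mazda"],
    "_Count(vehicle_Type)": [3, 5, 2, 7],
})

# Rows become ages, columns become vehicle types, cells hold the counts.
# Combinations that never occur are filled with 0 (an assumption).
grid = ageVehicle.pivot_table(
    index="Age",
    columns="_Vehicle_Type",
    values="_Count(vehicle_Type)",
    aggfunc="sum",
    fill_value=0,
)

# Plain 2D NumPy array with one row per age and one column per vehicle
ageVehicleType = grid.to_numpy()
print(grid)
```

Under these assumptions, ageVehicleType could be passed to ax.imshow, with grid.columns and grid.index serving as the x and y tick labels.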
Questions:
1. The more important question: I already have the counts (to use for coloring the cells of the heatmap) in one of the columns, _Count(vehicle_Type). Is it possible to somehow use this column of my ageVehicle dataframe to build the heatmap, instead of creating a 2D array that contains all combinations of age and vehicle?
2. Should the 2D array ageVehicleType necessarily be a cross-product of all combinations of age and vehicle? If so, the logic of the code may need to be altered.
I am getting an error. I'd appreciate your help on how I can rewrite my conditions to resolve this issue:
TypeError Traceback (most recent call last)
<ipython-input-54-4e2a48f8339f> in <module>
15 else:
16 innerList+=(int(ageVehicle.loc[(ageVehicle['_Vehicle_Type'] == j) & (ageVehicle['Age'] == i)]\
---> 17 ['_Count(vehicle_Type)'].values))
18
TypeError: only size-1 arrays can be converted to Python scalars
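One way to narrow this down is to check how many rows each (age, vehicle) filter actually matches, since int() only accepts an array holding exactly one element. The sketch below uses made-up data in which one pair appears twice and another is absent; whether the real ageVehicle dataframe has such pairs is an assumption, and matched_counts is a hypothetical helper, not part of the original code.

```python
import pandas as pd

# Hypothetical data: (8, 'bmw') appears twice and (9, 'toyota') is absent
df = pd.DataFrame({
    "Age": [8, 8, 8, 9],
    "_Vehicle_Type": ["bmw", "bmw", "toyota", "bmw"],
    "_Count(vehicle_Type)": [3, 1, 5, 2],
})

def matched_counts(frame, age, vehicle):
    """Return the count values of all rows matching one (age, vehicle) pair."""
    mask = (frame["Age"] == age) & (frame["_Vehicle_Type"] == vehicle)
    return frame.loc[mask, "_Count(vehicle_Type)"].values

# Number of matching rows for every combination of age and vehicle
sizes = {
    (a, v): matched_counts(df, a, v).size
    for a in df["Age"].unique().tolist()
    for v in df["_Vehicle_Type"].unique().tolist()
}

# Pairs that do not match exactly one row cannot be passed to int()
problem_pairs = {pair: n for pair, n in sizes.items() if n != 1}
print(problem_pairs)

# Converting a two-element array to int raises a TypeError
try:
    int(matched_counts(df, 8, "bmw"))
    raised = False
except TypeError:
    raised = True
```

If problem_pairs comes back non-empty for the real data, the filter is matching zero or several rows for those pairs, and a single int() conversion cannot work for them.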
Thanks in advance.